Comparison
of Cluster File Systems
Csaba Gere
research associate
Department of Internet Technologies and
Applications
MTA-SZTAKI
H-1132 Budapest, Victor Hugo u. 18-22.
phone: (+36 1) 2796027
email: gcsaba@sztaki.hu
Péter Stefán
research associate
Supercomputing Centre
NIIFI
H-1132 Budapest, Victor Hugo u. 18-22.
phone: (+36 1) 4503076
email: stefan@niif.hu
Abstract
Implementing Internet services in
fault-tolerant environments has become increasingly popular. The goal of
implementing fault-tolerant services is to keep the availability of
a specific service above a well-defined threshold, such as 99.9%, by
installing it on a redundant, distributed system architecture that enables
services to run, possibly at reduced performance, even under extreme conditions
such as hardware or software failure. There are numerous ways and levels of
implementing fault-tolerant services. The lowest level is to use fault-tolerant
cluster file systems.
Cluster file systems were developed on the foundations
of the Network File System (NFS), since continuously growing demand revealed its
many shortcomings and required new features.
The key requirements that a cluster file system
should meet are as follows: fault-tolerant behavior (handling distributed data
and failover), load leveling, utilization of high network bandwidth, scalability,
and effective resource utilization (addition/removal of disk areas, merging,
striping and mirroring).
Cluster file systems can be used in three
fundamental ways: as a local file system, like the ordinary Unix File System
(UFS) or Reiser File System (ReiserFS); as a network file system with
improved NFS functionality; or as part of an integrated Storage Area Network (SAN).
Using cluster file systems as a local file
system is very rare, since most of their relevant features either cannot be
utilized at all or can be utilized only in a restricted fashion, and at reduced
performance.
Using cluster file systems as network file systems
means that there are specific machines in the distributed environment which exclusively
access the storage area and provide file service to the other machines via a
common network (e.g. an Internet Protocol network). In this layout there is a
master server, which provides the file service under normal operation, and
several backup servers, one of which can take over the role of the master,
transparently to the clients, if it fails. The important feature of such a setup
is the appropriate “failover” and “failback” of the file service.
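The priority-based failover and failback behavior described above can be sketched as follows; this is a minimal illustration under our own assumptions, not the actual service code, and the server names and health-check callback are hypothetical:

```python
# Minimal failover/failback sketch: the first healthy server in the
# priority-ordered list serves clients; when the master recovers, the
# service automatically "fails back" to it. All names are illustrative.

def choose_active(servers, is_healthy):
    """Return the highest-priority healthy server, or None if all failed.

    servers    -- list ordered by priority (master first, then backups)
    is_healthy -- callback reporting whether a given server is up
    """
    for server in servers:
        if is_healthy(server):
            return server
    return None  # total outage: no server can provide the file service


if __name__ == "__main__":
    servers = ["master", "backup1", "backup2"]

    # Normal operation: the master provides the file service.
    up = {"master", "backup1", "backup2"}
    print(choose_active(servers, up.__contains__))  # master

    # Failover: the master fails, backup1 takes over transparently.
    up.discard("master")
    print(choose_active(servers, up.__contains__))  # backup1

    # Failback: the master returns and regains the service.
    up.add("master")
    print(choose_active(servers, up.__contains__))  # master
```

A real implementation would of course replace the in-memory health set with network heartbeats and would also migrate the service address, but the selection logic follows the same priority order.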
In the case of the SAN architecture, all nodes of
the cluster access the same disk area via a high-speed Fibre Channel Arbitrated
Loop (FC-AL) or a Fibre Channel Switch (FC-SW). In this setup, the task of the
file system is to provide efficient read and write locking mechanisms so that
the large throughput can be utilized effectively.
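As an illustration of the read/write locking just mentioned, the sketch below uses ordinary POSIX advisory locks (`flock`) on a plain file: multiple readers may hold a shared lock concurrently, while an exclusive writer lock is refused as long as any reader holds the lock. This only mimics, on a single host, the semantics a cluster file system must enforce across all nodes sharing the SAN; it is not how GFS or GPFS implement their lock managers.

```python
import fcntl
import tempfile


def demo_shared_vs_exclusive():
    """Show reader/writer semantics with flock() advisory locks.

    Returns (readers_coexist, writer_blocked); both should be True.
    Each open() yields a distinct open file description, so flock()
    calls on them conflict just as locks from separate processes would.
    """
    with tempfile.NamedTemporaryFile() as f:
        reader1 = open(f.name, "rb")
        reader2 = open(f.name, "rb")
        writer = open(f.name, "wb")
        try:
            # Two shared (read) locks coexist without blocking.
            fcntl.flock(reader1, fcntl.LOCK_SH)
            fcntl.flock(reader2, fcntl.LOCK_SH)
            readers_coexist = True

            # A non-blocking exclusive (write) lock must fail now.
            try:
                fcntl.flock(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)
                writer_blocked = False
            except OSError:
                writer_blocked = True
            return readers_coexist, writer_blocked
        finally:
            reader1.close()
            reader2.close()
            writer.close()
```

In a cluster file system the same guarantee must hold between processes on different nodes, which requires a distributed lock manager rather than kernel-local advisory locks.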
In our presentation we are going to describe
our experiences with creating fault-tolerant file systems on the EMC storage
installed at NIIFI. We are going to provide configuration examples and performance
analysis results for two cluster file systems: Sistina’s GFS and IBM’s GPFS.