For the last few years, high-performance data centers have been moving aggressively towards parallel technologies, such as clustered computing and multi-core processors, which has accelerated the development and deployment of parallel applications.
While this increased use of parallelism has relieved the vast majority of computational bottlenecks, it has shifted the performance bottleneck to the storage I/O system. With mainstream computing going parallel, the storage subsystem needs to migrate to parallel technology as well. For parallel storage to become ubiquitous, it needs a standard approach that allows a choice among multiple storage vendors and the freedom to access parallel storage from any client.
To move to the next level of performance, storage systems must be optimized for parallelism while adhering to an economically efficient standard. NFS, the current network file system standard, does not support parallel I/O, and existing parallel products from the key storage vendors are not compatible with each other. Until the industry delivers a parallel storage standard, adoption will continue to be hampered by users' reluctance to deploy one of the many incompatible parallel storage implementations.
Later this year, the Internet Engineering Task Force (IETF) NFSv4 working group is expected to conclude its work on the Parallel NFS (pNFS) protocol, which is part of the NFS version 4.1 RFC. This milestone will move NFS version 4.1 from Internet-Draft to Proposed Standard. Parallel NFS enables direct, parallel data transfer between clients and storage devices without the need for expensive filer heads. Support is expected for Linux, Windows, and the leading UNIX variants, such as Solaris and AIX.
This new standard is being developed by a consortium of storage industry technology leaders, including Panasas, IBM, EMC, Network Appliance, Sun Microsystems, and the University of Michigan’s Center for Information Technology Integration (CITI).
The Challenge with NFS in Today’s World
To understand how pNFS works, it is first necessary to understand what happens in a typical NFS architecture when a client attempts to access a file.
Figure 1 shows a traditional NFS architecture. The NFS server sits between the client computer and the physical storage devices. When the client wants to access files residing on that storage, it must first create a link to the NFS server (known as creating a mount point). When the client then accesses files, the NFS server acts as an intermediary and handles all of the data processing required to deliver the data back to the requesting client.
This architecture works well for relatively small data sets accessed by a few clients, and it provides significant benefits over Direct Attached Storage (like the disk in your PC): data can be shared by multiple clients and accessed by any client that has NFS capabilities. However, if large numbers of clients need access to the data, or the data set grows too large, the NFS server quickly becomes a bottleneck and chokes system performance. Fundamentally, pNFS removes that bottleneck, allowing fast access to very large data sets from a large number of clients.
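The bottleneck can be illustrated with a back-of-the-envelope model. The sketch below is not a measurement of any real system. It assumes that the server's bandwidth is shared equally among active clients and that each client is also capped by its own network link. The function name and all of the figures are hypothetical and chosen only for illustration.

```python
def per_client_throughput(server_bw: float, client_link: float, n_clients: int) -> float:
    """Throughput one client can expect when every byte passes through one server.

    Assumption: the server's bandwidth is split evenly among clients, and a
    client can never exceed the speed of its own link. Units are arbitrary
    but must be the same for both bandwidth arguments (e.g. MB/s).
    """
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    return min(client_link, server_bw / n_clients)


# Hypothetical figures: a server that can move 1000 MB/s and clients with
# 100 MB/s links. Beyond 10 clients the server, not the client, is the limit.
for n in (1, 10, 100, 1000):
    print(f"{n:5d} clients -> {per_client_throughput(1000.0, 100.0, n):6.1f} MB/s each")
```

Under these assumed numbers, each client keeps its full link speed up to 10 clients, after which the per-client share falls in proportion to the number of clients.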
pNFS Eliminates the Bottleneck
Here we see how pNFS modifies the NFS architecture to eliminate the bottleneck we just described. Essentially, the NFS server is moved ‘out-of-band’ and becomes what is known as a metadata server. That means it manages data about the data rather than the data itself. So when a client needs to access data, what does it do?
The first thing it does is talk to the NFS server, just as it did in the previous example. This time, however, the server provides the client with a map of where to find the data, along with credentials describing its rights to read, modify, and write that data. Once the client has those two components, it talks directly to the storage devices when accessing the data. With traditional NFS, every bit of data flows through the NFS server; with pNFS, the NFS server is removed from the primary data path, so clients reach the data directly. The advantages of NFS are maintained, but the server is no longer a bottleneck: data can be accessed in parallel for high throughput, and system capacity can be scaled without degrading overall performance.
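The two-step access pattern described above can be sketched as a small in-memory simulation. This is a toy model under stated assumptions, not the pNFS wire protocol. The class names (`MetadataServer`, `StorageDevice`, `Layout`, `Stripe`), the single shared token standing in for credentials, and the round-robin striping are all hypothetical simplifications.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Stripe:
    device_id: int  # which storage device holds this piece
    key: str        # where the piece lives on that device


@dataclass(frozen=True)
class Layout:
    stripes: tuple   # ordered pieces that make up the file (the "map")
    credential: str  # stand-in for the client's access rights


class StorageDevice:
    """Holds file data and checks the credential on every read."""

    def __init__(self, credential: str):
        self._credential = credential
        self._blocks = {}

    def put(self, key: str, data: bytes) -> None:
        self._blocks[key] = data

    def read(self, key: str, credential: str) -> bytes:
        if credential != self._credential:
            raise PermissionError("invalid credential")
        return self._blocks[key]


class MetadataServer:
    """Holds only data about data: which device stores which piece."""

    def __init__(self, credential: str):
        self._credential = credential
        self._layouts = {}

    def register(self, path: str, stripes) -> None:
        self._layouts[path] = Layout(tuple(stripes), self._credential)

    def get_layout(self, path: str) -> Layout:
        return self._layouts[path]


def pnfs_read(path: str, mds: MetadataServer, devices) -> bytes:
    # Step 1: ask the metadata server for the map and credential.
    layout = mds.get_layout(path)

    def fetch(stripe: Stripe) -> bytes:
        return devices[stripe.device_id].read(stripe.key, layout.credential)

    # Step 2: read every piece directly from its device, in parallel.
    # pool.map returns results in input order, so the pieces join correctly.
    with ThreadPoolExecutor(max_workers=max(1, len(layout.stripes))) as pool:
        return b"".join(pool.map(fetch, layout.stripes))


def build_demo(stripe_size: int = 8, n_devices: int = 3):
    """Stripe a small payload round-robin across devices (illustrative only)."""
    token = "demo-token"
    devices = [StorageDevice(token) for _ in range(n_devices)]
    mds = MetadataServer(token)
    data = b"parallel NFS stripes data across devices"
    stripes = []
    for offset in range(0, len(data), stripe_size):
        dev = (offset // stripe_size) % n_devices
        key = f"/demo/file#{offset}"
        devices[dev].put(key, data[offset:offset + stripe_size])
        stripes.append(Stripe(dev, key))
    mds.register("/demo/file", stripes)
    return mds, devices, data


mds, devices, original = build_demo()
assert pnfs_read("/demo/file", mds, devices) == original
```

Note that the metadata server in this sketch never touches the file contents. It hands out only the layout and the credential, which is the sense in which it sits outside the data path.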