dbms-notes: writing blocks to disk: Network File System (NFS)

What is NFS

NFS is a platform-independent remote file system technology created by Sun Microsystems in the 1980s.
It is a client/server application that provides shared file storage for clients across a network.
It was designed to simplify the sharing of file system resources across a network of heterogeneous machines.
It is implemented on top of Remote Procedure Calls (RPC), which travel over the network (UDP or TCP); on the client, remote files are reached through the Virtual File System (VFS), a kernel interface that hides whether a file is local or remote.
It allows an application to access files on remote hosts in the same way it accesses local files.

NFS Servers: Computers that share files

During the late 1980s and 1990s, a common setup was to configure a powerful workstation with lots of local disk space, often without a graphical display, as an NFS server.
"Thin," diskless workstations would then mount the remote file systems provided by the NFS Servers and transparently use them as if they were local files.

NFS Simplifies management:

Instead of duplicating common directories such as /usr/local on every system, NFS provides a single copy of the directory that is shared by all systems on the network.
Simpler backup procedures: instead of setting up backups for the local contents of each workstation (/home, for example), a sysadmin needs to back up only the server's disks.

NFS Clients: Computers that access shared files

NFS uses a mixture of kernel support and user-space daemons on the client side.
Multiple clients can mount the same remote file system so that users can share files.
Mounting can be done at boot time or on demand (e.g., /home could be a shared directory that each client mounts when a user logs in).
An NFS client:

(a) mounts a remote file system onto the client's local file system name space and
(b) provides an interface so that access to the files in the remote file system is done as if they were local files.

----
Goals of NFS design:

NFS Versions

NFS design: NFS Protocol, Server, Client

NFS Protocol

Uses Remote Procedure Call (RPC) mechanisms.
RPCs are synchronous (the client application blocks while it waits for the server's response).
NFS uses a stateless protocol (the server does not keep track of past requests). This simplifies crash recovery: all that is needed is for the client to resubmit the last request.
As a result, the client cannot differentiate between a server that crashed and recovered and one that is just slow.
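
The retry behaviour described above can be illustrated with a small sketch. It is not NFS code: FlakyServer, rpc_with_retry, and the "fh1" handle are hypothetical names made up for the illustration, and a lost reply is simulated with a TimeoutError. The only point is that the request carries everything the server needs, so the client can simply send it again.

```python
class FlakyServer:
    """Toy server: each request carries everything needed to serve it."""

    def __init__(self, data, drop_first=2):
        self.data = data              # file contents, keyed by file handle
        self.drop_first = drop_first  # how many initial requests get "lost"
        self.seen = 0                 # failure simulation only, not protocol state

    def read(self, fh, offset, count):
        self.seen += 1
        if self.seen <= self.drop_first:
            raise TimeoutError("no reply")  # stands in for a crash or a slow server
        return self.data[fh][offset:offset + count]


def rpc_with_retry(call, *args, max_tries=5):
    """Resend the identical request until a reply arrives or we give up."""
    for attempt in range(1, max_tries + 1):
        try:
            return call(*args), attempt
        except TimeoutError:
            continue  # crashed or slow? the client cannot tell, so it resends
    raise TimeoutError("no answer after %d tries" % max_tries)


server = FlakyServer({"fh1": b"hello, nfs"})
reply, tries = rpc_with_retry(server.read, "fh1", 0, 5)
print(reply, tries)
```

Resending is safe here because a read at a given offset returns the same bytes no matter how many times it is repeated.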

New File system interface

The original Unix file system interface was modified in order to implement NFS as an extension of the Unix file system.
NFS was built into the Unix kernel by separating generic file system operations from specific implementations. With this separation, the kernel can treat all file systems and nodes in the same way, and new file systems can be added to the kernel easily:

A Virtual File System (VFS) interface: defines the operations that can be done on a filesystem.
A Virtual node (vnode) interface: defines the operations that can be done on a file within a filesystem.

A vnode is a logical structure that abstracts whether a file or directory is implemented by a local or a remote file system. Applications "see" only the vnode interface, so the actual location of the file (local or remote file system) is irrelevant to them.
In addition, this interface allows a computer to transparently access different types of local file systems (e.g., ext2, ext3, ReiserFS, msdos, proc).
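
As a rough illustration of the vnode idea (not the kernel's actual interface), the sketch below defines one abstract Vnode class with a local and a remote implementation. All names here (Vnode, LocalVnode, RemoteVnode, fake_rpc, first_bytes, the "fh-42" handle) are assumptions invented for the example.

```python
from abc import ABC, abstractmethod


class Vnode(ABC):
    """Operations on a single file, independent of where it is stored."""

    @abstractmethod
    def read(self, offset, count):
        ...


class LocalVnode(Vnode):
    """File kept in a local file system (here, just bytes in memory)."""

    def __init__(self, data):
        self.data = data

    def read(self, offset, count):
        return self.data[offset:offset + count]


class RemoteVnode(Vnode):
    """File kept on a server; every read becomes a request to that server."""

    def __init__(self, send_rpc, fh):
        self.send_rpc = send_rpc  # hypothetical callable standing in for the RPC layer
        self.fh = fh              # file handle obtained from the server

    def read(self, offset, count):
        return self.send_rpc("READ", self.fh, offset, count)


def fake_rpc(op, fh, offset, count):
    """Stand-in for a server that exports one hard-coded file."""
    files = {"fh-42": b"remote bytes"}
    return files[fh][offset:offset + count]


def first_bytes(vnode, n):
    """Application-level code: it only ever sees the Vnode interface."""
    return vnode.read(0, n)


local = first_bytes(LocalVnode(b"local bytes"), 5)
remote = first_bytes(RemoteVnode(fake_rpc, "fh-42"), 6)
print(local, remote)
```

The function first_bytes is identical for both files, which is the transparency the vnode layer is meant to give.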

NFS Client
Uses a mounter program. The mounter:

takes a remote file system identifier of the form host:path;
sends an RPC to host, asking for (1) a file handle for path and (2) the server's network address;
marks the mount point in the local file system as a remote file system associated with the host address:path pair.
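
The three mounter steps can be sketched as follows. This is a toy model under stated assumptions: fake_mount_rpc stands in for the real mount RPC, and the host name "fileserver", the export "/export/home", the handle "fh-root-7", and the address 192.0.2.10 are all made-up example values.

```python
def parse_remote(spec):
    """Step 1: split a 'host:path' identifier into its two parts."""
    host, sep, path = spec.partition(":")
    if not sep or not host or not path.startswith("/"):
        raise ValueError("expected host:/path, got %r" % spec)
    return host, path


def fake_mount_rpc(host, path):
    """Step 2 (stand-in): returns (file handle for path, server address)."""
    exports = {("fileserver", "/export/home"): ("fh-root-7", "192.0.2.10")}
    return exports[(host, path)]


# Step 3: local mount point -> (server address, remote path, root file handle)
mount_table = {}


def mount(spec, mount_point, mount_rpc=fake_mount_rpc):
    host, path = parse_remote(spec)
    dirfh, address = mount_rpc(host, path)
    mount_table[mount_point] = (address, path, dirfh)
    return mount_table[mount_point]


entry = mount("fileserver:/export/home", "/home")
print(entry)
```

After the call, any path under /home can be recognised as remote by looking it up in mount_table.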

Diagram of NFS architecture

NFS Remote Procedure Calls
The NFS client uses RPCs to implement each file system operation.
Consider the user program code below:

fd <- OPEN ("f", READONLY)
READ (fd, buf, n)
CLOSE (fd)

The application opens file "f", sends a read request, and closes the file.
The file "f" is a remote file, but this information is irrelevant for the application.
The virtual file system holds a map with host address and file handles (dirfh) of all the mounted remote file systems.
The sequence of steps to obtain the file are listed below:

The Virtual File System finds that file "f" is on a remote file system, and passes the request to the NFS client.
The NFS client sends a lookup request (LOOKUP(dirfh, "f")) to the NFS server, passing the file handle (dirfh) of the remote file system's directory and the name of the file to be read.
The NFS server receives the LOOKUP request, extracts the file system identifier and inode number from dirfh, and asks the identified file system to look up that inode number and find the local directory inode information.
The NFS server searches the directory identified by the inode number for file "f".
If the file is found, the server creates a handle for "f" and sends it back to the client.
The NFS client allocates the first unused entry in the program's file descriptor table, stores a reference to f's file handle in that entry, and returns the index for the entry (fd) to the user program.
Next, the user program calls READ(fd, buf, n).
The NFS client sends the RPC READ(fh,0,n).
The NFS server looks up the inode for fh, reads the data, and sends it in a reply message.
When the user program calls to close the file (CLOSE(fd)), the NFS client does not issue an RPC, since the program did not modify the file.
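
The sequence above can be traced with a toy client and server. This is a simplified sketch, not the NFS wire protocol: ToyServer and ToyClient are hypothetical classes, a file handle is modelled as a (file system id, inode number) tuple, and the inode numbers are arbitrary. As in the steps above, the read always starts at offset 0; a real client would track the current file position.

```python
class ToyServer:
    """Toy server; a file handle is (file system id, inode number)."""

    def __init__(self):
        self.dirs = {2: {"f": 10}}            # directory inode -> {name: inode}
        self.files = {10: b"contents of f"}   # file inode -> data
        self.rpc_log = []                     # records which RPCs were received

    def lookup(self, dirfh, name):
        self.rpc_log.append("LOOKUP")
        fsid, dir_inode = dirfh               # unpack the directory handle
        inode = self.dirs[dir_inode].get(name)
        return None if inode is None else (fsid, inode)

    def read(self, fh, offset, count):
        self.rpc_log.append("READ")
        _, inode = fh
        return self.files[inode][offset:offset + count]


class ToyClient:
    def __init__(self, server, dirfh):
        self.server = server
        self.dirfh = dirfh    # handle of the mounted remote directory
        self.fd_table = []    # index = fd, value = file handle (None if unused)

    def open(self, name):
        fh = self.server.lookup(self.dirfh, name)
        if fh is None:
            raise FileNotFoundError(name)
        for fd, slot in enumerate(self.fd_table):
            if slot is None:                  # reuse the first unused entry
                self.fd_table[fd] = fh
                return fd
        self.fd_table.append(fh)
        return len(self.fd_table) - 1

    def read(self, fd, n):
        return self.server.read(self.fd_table[fd], 0, n)

    def close(self, fd):
        self.fd_table[fd] = None              # local bookkeeping only, no RPC


server = ToyServer()
client = ToyClient(server, dirfh=(1, 2))
fd = client.open("f")
buf = client.read(fd, 8)
client.close(fd)
print(fd, buf, server.rpc_log)
```

The server's log ends up with one LOOKUP and one READ; the close leaves no trace on the server, matching the last step.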

Configuring NFS on Ubuntu

References:
Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, and Bob Lyon. Design and Implementation of the Sun Network Filesystem. Proceedings of the Summer 1985 USENIX Conference, Portland, OR, June 1985, pp. 119-130.
Jerome H. Saltzer and M. Frans Kaashoek. Principles of Computer System Design: An Introduction. Morgan Kaufmann, 2009.
