"A Fast File System for UNIX"

I ran across Filesystems Reading List recently and thought I should skim through some of the references.

McKusick, Joy, Leffler, and Fabry's "A Fast File System for UNIX" describes improvements to "the" UNIX file system, including:

  • increasing the minimum block size to 4096 bytes, to make it possible to address larger files without adding more levels of indirection, and to reduce seeks (and the overhead of data transfers?),
  • adding redundant copies of the superblock (which never changes) to aid recovery,
  • representing free/allocated blocks using bitmaps (one for each "cylinder group") instead of a single list of free blocks, to allow smarter allocations that manage fragmentation,
  • adding a rule reserving the last few percent of free space for the system administrator, mainly to prevent fragmentation,
  • adding more complex allocation policies designed to group inodes from the same directory together, and file data for the same inode together, to reduce seeks.
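The bitmap-plus-hint allocation idea from the list above can be sketched in a few lines of C. Everything here is my own illustration, not the kernel's code: the names (`cg_t`, `cg_alloc_near`) and the tiny group size are made up, and a real cylinder group also carries inode maps and summary counts.

```c
#include <stdint.h>

/* Hypothetical cylinder group: one bit per data block, 1 = free.
 * Real FFS keeps a map like this in each cylinder group's header;
 * the names and sizes here are invented for illustration. */
#define CG_NBLOCKS 64

typedef struct {
    uint8_t free_map[CG_NBLOCKS / 8]; /* 1 bit per block */
} cg_t;

/* Return the first free block at or after 'near' (wrapping around),
 * or -1 if the group is full.  Searching near a hint is what lets
 * the allocator keep a file's blocks close together, which a plain
 * free list (just take the head) cannot do. */
int cg_alloc_near(cg_t *cg, int near)
{
    for (int i = 0; i < CG_NBLOCKS; i++) {
        int b = (near + i) % CG_NBLOCKS;
        if (cg->free_map[b / 8] & (1 << (b % 8))) {
            cg->free_map[b / 8] &= ~(1 << (b % 8)); /* mark allocated */
            return b;
        }
    }
    return -1;
}
```

With the old free list, consecutive allocations for one file could land anywhere on the disk; with a bitmap and a "near this block" hint, they tend to land adjacent.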

To avoid wasting space on (typical) filesystems with lots of small files, they actually allocate space in units of fragments, each a fixed fraction of the block size. To keep fragmentation down (I guess), they'll allocate a new block and copy a bunch of fragments into it if a new write would create enough fragments to fill a block. Because that copying is inefficient, they discourage it by telling applications the preferred write size, which I guess must be the st_blksize stat field.
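A quick back-of-the-envelope sketch of the fragment arithmetic, assuming a 4096-byte block split into 1024-byte fragments (a 4:1 ratio; the sizes are my choice, and `bytes_reserved` is my own name, not anything from the paper):

```c
#define BLOCK_SIZE 4096
#define FRAG_SIZE  1024 /* 4:1 block:fragment ratio, chosen for illustration */

/* Bytes actually reserved for a file of 'len' bytes: whole blocks,
 * plus a tail rounded up to whole fragments rather than to a whole
 * block.  With blocks alone, any small file would cost 4096 bytes. */
long bytes_reserved(long len)
{
    long full_blocks = len / BLOCK_SIZE;
    long tail = len % BLOCK_SIZE;
    long tail_frags = (tail + FRAG_SIZE - 1) / FRAG_SIZE;
    return full_blocks * BLOCK_SIZE + tail_frags * FRAG_SIZE;
}
```

So a 500-byte file costs one 1024-byte fragment instead of a full 4096-byte block, which is the whole point of the scheme.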

The basic motivation for all of this was to increase reliability and performance (the old filesystem was delivering only a few percent of the disk's theoretically available bandwidth).

They were limited by disk controllers that would only transfer one block at a time, so reading two contiguous blocks isn't as fast as you'd expect: by the time you're ready to read the second, it has already spun past the head. So they have some bizarre allocation policies that try to place a file's blocks at the right rotational intervals around the platter so that you don't overshoot them during a big sequential read.
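The rotational placement can be sketched as simple modular arithmetic. The numbers and names here are invented (real FFS derived the skip from per-drive parameters like rotational speed and transfer setup time):

```c
/* Hypothetical drive geometry: block positions per track, and how
 * many positions the platter sweeps past while the driver sets up
 * the next one-block transfer.  Both numbers are made up. */
#define BLOCKS_PER_TRACK 16
#define SETUP_SKIP 2 /* positions lost to per-block transfer overhead */

/* Rotational position for the logically-next block: skip just far
 * enough ahead that it arrives under the head as the next read is
 * issued, instead of placing it physically adjacent and missing it. */
int next_rotational_pos(int cur)
{
    return (cur + 1 + SETUP_SKIP) % BLOCKS_PER_TRACK;
}
```

With SETUP_SKIP = 0 this degenerates to plain contiguous layout, which is exactly what the one-block-at-a-time controllers couldn't keep up with.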

They also introduce long file names (and discuss the directory layout), symlinks, "flock" locks, and quotas.

They also add the rename operation, motivated by the desire to atomically replace a file with a new version. If that has to be done by removing the old file, linking the new file in at the old file's location, and then unlinking the new file's temporary name, there are windows during which neither the old nor the new version is visible at the target location. So "rename" is added to do all of this atomically.
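This is still the standard idiom today: write the new version to a temporary name, then rename(2) it over the target. A minimal sketch (`replace_file` and the ".tmp" naming are my own; real code would use mkstemp and fsync before the rename):

```c
#include <stdio.h>

/* Atomically replace 'path' with new contents: write a temporary
 * file in full, then rename() it over the target.  At every instant
 * a reader opening 'path' sees either the complete old version or
 * the complete new one, never a gap or a half-written file. */
int replace_file(const char *path, const char *data)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path); /* same directory, so same filesystem */

    FILE *f = fopen(tmp, "w");
    if (!f)
        return -1;
    if (fputs(data, f) == EOF || fclose(f) == EOF) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path); /* the atomic step */
}
```

The key constraint is that the temporary file must live on the same filesystem as the target, since rename can't move data across filesystems.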

The paper is from 1984, so I would have been in seventh grade.