Sangria, hideous names, readahead

My day at work alternated between trying to put out a new release of my kernel patches (complicated by a failing machine at CITI that made some of our services unreliable), working on various ACL-related stuff (mainly talking things over with Andreas and then trying to read through some of his code), and helping Fred with some debugging (a little; Fred was the one who actually figured it out).

Some days a problem really grabs me and I just can't stop thinking about it; this wasn't such a day.

After work I met Trond, Laura, Sara, and (eventually) Paul at Dominick's. Trond suggested the nfsd readahead problem as the sort of thing that should keep a person up at night.

In particular, the problem is this: in the naive approach, every time an application asks for data from a file, you have to go read the data from disk. But that's terribly inefficient. It takes eons (OK, milliseconds--but that's eons on today's hardware, where a processor executes roughly an instruction every nanosecond) to move the disk head to the right place, wait for the disk to spin around to the right spot, and read the data. So, ideally, you'd like the data to already have been read. How is that possible? Well, it's not always possible for the operating system to predict what's going to be asked for next. But often it is, because applications often just read through whole files sequentially from beginning to end.

So any modern operating system recognizes when an application is reading straight through a file from beginning to end, and starts anticipating by performing "readahead"--reading the next few chunks of the file before they're requested, assuming that they'll be needed soon.
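
To make that concrete, here's a toy sketch in Python of how such a detector might work. Everything in it is made up for illustration: the class name, the block numbering, and the "double the window on each sequential read" policy are assumptions, not how any particular kernel actually does it.

```python
class SequentialReadahead:
    """Toy readahead state for one open file (hypothetical, not a real kernel API)."""

    def __init__(self, max_window=8):
        self.next_expected = 0        # block we expect if the reader is sequential
        self.window = 0               # how many blocks to prefetch right now
        self.max_window = max_window  # assumed cap on the prefetch size

    def on_read(self, block):
        """Record a read of `block`; return the list of blocks to prefetch."""
        if block == self.next_expected:
            # Looks sequential: grow the window (1, 2, 4, ... up to the cap).
            self.window = min(self.max_window, max(1, self.window * 2))
        else:
            # Anything else is treated as random access: stop prefetching.
            self.window = 0
        self.next_expected = block + 1
        return list(range(block + 1, block + 1 + self.window))


ra = SequentialReadahead()
print(ra.on_read(0))   # [1]
print(ra.on_read(1))   # [2, 3]
print(ra.on_read(2))   # [3, 4, 5, 6]
print(ra.on_read(10))  # [] (a jump resets the window)
```

The point is only that the check is strict: a single read that isn't exactly the next block throws away everything the detector had learned.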

The problem comes when you throw NFS into the mix--now the application is split from the disk by a network, and the read requests may sometimes be switched around and arrive in a different order. So even though the application is reading through the file in order, it looks to the NFS server like it's going back and forth a little, and the standard readahead algorithm fails because it doesn't recognize that this is basically not that different from sequential reads.
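
One obvious thing to try, sketched below in Python, is to loosen the test from "exactly the next block" to "within a few blocks of where we've gotten to". This is purely a guess at a heuristic; it is not what nfsd does and not what the paper mentioned below did, and the class name and the `slack` parameter are invented for the example.

```python
class TolerantReadahead:
    """Toy readahead that forgives small reorderings (hypothetical heuristic)."""

    def __init__(self, slack=2, max_window=8):
        self.highest = -1             # highest block seen in the current run
        self.window = 0               # how many blocks to prefetch right now
        self.slack = slack            # assumed tolerance for out-of-order arrivals
        self.max_window = max_window  # assumed cap on the prefetch size

    def on_read(self, block):
        """Record a read of `block`; return the list of blocks to prefetch."""
        near = abs(block - (self.highest + 1)) <= self.slack
        if near:
            # Close enough to the front of the run: still count it as sequential.
            self.window = min(self.max_window, max(1, self.window * 2))
            self.highest = max(self.highest, block)
        else:
            # A real jump: start a new run at this block with no prefetch.
            self.window = 0
            self.highest = block
        start = self.highest + 1
        return list(range(start, start + self.window))


ra = TolerantReadahead()
for block in [0, 2, 1, 3, 5, 4]:   # sequential reads, lightly shuffled in transit
    ra.on_read(block)
print(ra.window)                   # 8: the window kept growing despite the swaps
print(ra.on_read(100))             # []: a genuine seek still resets it
```

With the strict "exactly the next block" test, the read of block 2 arriving right after block 0 would already have reset the window. The cost of the slack is that a genuinely random reader hopping around within a couple of blocks gets mistaken for a sequential one.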

I actually read a paper recently where they dealt with this problem, but can't for the life of me remember where it was....

When I got home I tried reading through Pike and Weinberger's "The Hideous Name", but found it kind of a pointless paper.