You are here

babies, bicycling, journaling

My friend and co-worker Fred was back to work today after three weeks of paternity leave. He looked alert enough, and says he's actually getting 6 hours of sleep a night. I seem to be of an age when everyone I know has a baby.

I rode my bike home from work. One of the pair of cheap halogen headlights I use was dimming, though I'd recharged all the batteries just a couple days ago. I suspect a problem with the charger. It's the time of year when I'm always returning home in the dark.

I have some fairly tedious debugging to do, which mostly involves waiting for compiles, so I figured I'd stay up a little late and do some reading.

Stephen Tweedie's "Journaling the ext2fs Filesystem" talks about the design of ext3. It's kind of surprising, but it turns out that one of the most important features of a filesystem is the way it recovers from crashes. Keeping your files safe, after all, is probably the most important thing your computer does--a computer that regularly loses your files, or, possibly worse, silently corrupts them in ways that you don't notice till long after the fact, can be worse than useless. And crashes are hard to avoid--even if the software is perfect, there are always power outages.

So the filesystem has to be updated in a way that allows the computer to figure out what happened even if it last died halfway through an update. Also, the computer has to be able to figure out what happened *fast*. The traditional filesystem repair process can require reading through an entire filesystem, and disk sizes are growing much more quickly than disk bandwidth, to the point where reading through a large filesystem can take hours.
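To get a feel for the "hours" claim, here is a back-of-the-envelope calculation. The disk size and read rate are made-up round numbers for illustration, not figures from the paper, and the helper name is hypothetical.

```python
def full_scan_hours(size_bytes, bytes_per_second):
    """Time to read every byte once at a sustained rate, in hours."""
    return size_bytes / bytes_per_second / 3600


# Assumed, illustrative numbers: a 1 TB filesystem read at 100 MB/s.
hours = full_scan_hours(10**12, 100 * 10**6)
print(f"{hours:.1f} hours")
```

A real repair pass doesn't read the disk in one sequential sweep, so treat this as a rough lower bound on a full scan rather than a prediction.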

The approach taken by ext3 is a fairly common solution called journaling: the filesystem appends a record of each update to a special file called a journal. After it has recorded the update to the journal, it appends some kind of special "commit" marker to the journal, and only then does it go mess with the actual filesystem.

Once the actual filesystem has been modified, the OS is free to remove the relevant entries from the journal.

Then on reboot the filesystem just reads through the journal, performs any updates that are marked as committed (in case they weren't previously finished), and removes any entries that aren't marked as committed. (Those updates will then be lost, but that's better than possibly making a mistake and leaving the filesystem in an inconsistent state.) Recovery is fast as long as you keep the size of the journal down.
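The whole cycle described above (log, commit, update the real filesystem, replay on reboot) can be sketched as a toy in-memory model. This is only my illustration of the idea, not ext3's actual on-disk format or code; the class, method names, and record layout are all invented for the example.

```python
class ToyJournaledFS:
    """Toy model of journaled updates (hypothetical, not ext3's real design)."""

    def __init__(self):
        self.disk = {}      # the "actual filesystem": block name -> contents
        self.journal = []   # append-only list of journal records

    def log_update(self, txid, block, data):
        # Step 1: append a record of the update to the journal.
        self.journal.append(("update", txid, block, data))

    def commit(self, txid):
        # Step 2: append the commit marker once the updates are logged.
        self.journal.append(("commit", txid))

    def checkpoint(self, txid):
        # Step 3 (only for a committed transaction): apply its updates to
        # the real filesystem, then drop its entries from the journal.
        for rec in self.journal:
            if rec[0] == "update" and rec[1] == txid:
                self.disk[rec[2]] = rec[3]
        self.journal = [rec for rec in self.journal if rec[1] != txid]

    def recover(self):
        # On reboot: replay committed transactions, discard everything else.
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "update" and rec[1] in committed:
                self.disk[rec[2]] = rec[3]
        self.journal = []


# Simulate a crash: transaction 1 was committed but never applied,
# transaction 2 was logged but never committed.
fs = ToyJournaledFS()
fs.log_update(1, "inode 7", "size=100")
fs.commit(1)
fs.log_update(2, "inode 9", "size=5")
fs.recover()
print(fs.disk)
```

After recovery only transaction 1's update shows up in `fs.disk`; the uncommitted update to "inode 9" is dropped, which is exactly the "lost but consistent" tradeoff. Recovery only walks the journal, never the whole disk, which is why keeping the journal small keeps recovery fast.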

Journaling also made it easy for the Linux developers to maintain backwards compatibility with ext2--it's still possible to mount an ext3 filesystem as an ext2 filesystem by just ignoring the journal.

They note one interesting connection with NFS: NFS performance is often bound by the amount of time it takes the server to synchronously commit updates to disk. With some tuning, the journaling may help by turning most such commits into sequential writes to disk?