Sunday, April 25, 2010

Log2

It appears as if there will be a log2 filesystem fairly soon. The reason is compression, or rather some of the problems it created. In short, log2 will roughly double the random read performance for uncompressed data and speed up erases by a factor of 256 or thereabout.

Erases in normal filesystems are a fairly simple thing. Pseudocode would look roughly like this:
for each block in file {
    free bytes
}


Once you add compression, things get a little more complicated. You no longer know the blocksize:
for each block in file {
    figure out how many bytes the compressed block fills
    free those bytes
}


And with the current logfs format, the block size is part of a header prepended to each block. We only need two bytes of that header, but reading from the device always happens at a granularity of 4096 bytes, so effectively we have to read 4096 bytes per deleted block. And the result doesn't feel like a greased weasel on caffeine.

So the solution will be to add those two bytes, and a couple of other fields from the header, to the block pointer in the indirect block. The whole block pointer will be 16 bytes, so a single 4096-byte page read covers 256 of them - hence the 256x improvement.

The random read problem is - again - caused by the 4096-byte granularity. With compression, a data block will often span two 4096-byte pages. Uncompressed data will always do so, and will rarely span three if you include the header. So reading a random block from a file usually requires reading two pages from the device. Bummer.

The solution is simple: align the data. One problem with aligned data is where to find the header. But since we just solved that problem two paragraphs up, things should be relatively simple. We shall see.

So why create a new filesystem and not just change the existing logfs? Well, mainly to prevent bugs that new and intrusive code always brings from interfering with people's existing filesystems. The ext family has set an interesting precedent in this respect.

1 comment:

  1. Wow, it could be a nice thing. If real life shows that erasing indeed slows down logfs, then "logfs2" will indeed need to be done. Good call in forking to a new version!

    Compression, and in-place execution would definitely be good also.

    Thanks for the very good work! I'm testing logfs on my 16GB SDHC.
