[Gllug] Loopback mountable image file compression.

Phillip Lougher phillip.lougher at gmail.com
Wed Aug 26 05:13:00 UTC 2009


On Tue, Aug 25, 2009 at 1:44 PM, Richard Jones <rich at annexia.org> wrote:
>
> So you really are seriously suggesting using a squashfs filesystem
> containing a single file (the ext3 or NTFS image):
>

Yes I am.

>  /fs.img
>
> because squashfs can compress.  Just because squashfs can do this
> doesn't make it the right way to do it -- block devices should be
> compressed at the block device level, not wrapped in an unnecessary
> filesystem layer.
>

Why?  Aesthetic reasons?  The distinction is largely artificial.  In
practice you should choose whichever approach is most convenient and
gives the best performance.  Squashfs has many advantages over the
existing block device level compression schemes.

1. Performance.  Cloop performs worse than Squashfs even in this case
because of the mismatch between compression block size and page size.
Typical page sizes are 4K; typical compressed block sizes are 128K.
When a request for a 4K page is delivered to either cloop (via the
block device interface) or Squashfs (via the file readpage
interface), the much larger 128K block containing it has to be
decompressed.  The 4K page can be returned, but what happens to the
other 124K?  Cloop discards it; Squashfs pushes the extra data into
the page cache (making use of the associated file address space).
Subsequent accesses to these pages are satisfied from the page
cache with Squashfs, whereas cloop will often repeatedly
re-decompress the data.

2. Compressed size.  Even here Squashfs achieves better compression,
because Squashfs compresses the metadata and cloop does not.  The
extra filesystem overhead (a filename and attributes) is trivial by
comparison.

3. Sparse block handling.  Squashfs detects holes (zero-filled ranges)
and handles them specially.  Cloop compresses zero-filled blocks,
leading to worse performance and compression.  Filesystem images often
have very large ranges of zero-filled (unused) blocks.

4. Convenience.  Squashfs is mainlined and available in almost every
distribution; cloop and the other block device level compression
schemes are not.  There's a lot to be said for convenience.

So: performance, compression, sparse block handling and convenience.
If that doesn't make it the "right way to do it", what does?
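
For anyone who wants to try it, here's a minimal sketch of the
approach, assuming root, squashfs-tools and kernel loop/squashfs
support; the image name and mount points are just examples:

  # Create a mostly-empty ext3 image; the unused blocks read back as zeros.
  dd if=/dev/zero of=fs.img bs=1M seek=1024 count=0   # 1 GiB sparse file
  mkfs.ext3 -F fs.img

  # Wrap the single image file in a Squashfs filesystem.  Holes are
  # detected and the metadata is compressed, so the result is far
  # smaller than the raw image.
  mksquashfs fs.img fs.squashfs
  ls -ls fs.img fs.squashfs

  # Loop-mount the Squashfs, then loop-mount the image it contains
  # (read-only, since Squashfs itself is read-only).
  mkdir -p /mnt/outer /mnt/inner
  mount -o loop,ro fs.squashfs /mnt/outer
  mount -o loop,ro /mnt/outer/fs.img /mnt/inner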

> We just switched _from_ using squashfs _to_ using ISO9660 for the
> libguestfs test harness[1].  I'll outline the reasons why below.
>

I'm well aware that you used to use Squashfs in libguestfs and
recently switched to ISO9660 for storing your test cases - I've looked
at your git project tree.  I'm also aware that you had to do this
because of the incompatibility between Squashfs versions 3.x and 4.0.
Your annoyance and resultant antipathy towards Squashfs are obvious.

> I hope that these reasons can be resolved in the future or are being
> resolved now, because I'd really like to use squashfs.

I'm honoured.

>
> (1) Squashfs format changed incompatibly between version 3 & 4.
>
> This is a warning to the original poster: if you use squashfs, make
> sure you keep the tools you use to create and unpack squashfs around
> (as source and binaries) so that you can be sure you will be able to
> read images in future.
>

Yet more FUD.  I wrote Unsquashfs to guarantee that all Squashfs
filesystem versions will always be readable in the future,
irrespective of kernel support.  Unsquashfs supports every filesystem
version (1.x, 2.x, 3.x, 4.0) and will always do so.
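
For example (the file name and destination directory are only
illustrative):

  # Print the superblock, including which filesystem version created it.
  unsquashfs -s old-filesystem.squashfs

  # Extract the contents into ./extracted, whatever version the image is.
  unsquashfs -d extracted old-filesystem.squashfs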

You seem to be blaming me for the 3.x/4.0 version incompatibility
mess.  If you'd done your homework, you'd know that the layout changes
were forced on me as the price of getting Squashfs mainlined.  My
out-of-mainline patches always had backwards compatibility with
earlier filesystem versions; that backwards compatibility was also
refused for mainline.

Your other points are TODOs.  Squashfs is a self-financed project;
improvements invariably take time and have to be prioritised.
Squashfs is, unfortunately, the only mainstream filesystem without
company backing.

You obviously failed to notice the irony in complaining about version
incompatibility and then asking for filesystem improvements which
often require incompatible format changes.  I have walked a fine
line between keeping the filesystem format stable and adding new
features when necessary.  This has worked well for the last couple of
years; it is only recently that the 3.x/4.0 issue has arisen, and, as
I said above, that wasn't my choice.

Cheers

Phillip