[Gllug] Reiserfs faster boot

Nick Hill nickhill at email.com
Mon Oct 8 12:14:36 UTC 2001


On 07 Oct 2001 00:15:34 +0100
"Darran D. Rimron-Molloy" <ddrm at digital-science.net> wrote:

> When you create a file that contains the same character repeated (e.g.
> cat /dev/zero into a file - say a couple of megs' worth of 00000's), it
> won't take a couple of meg, it takes only a few bytes. That's a hole in
> the file-system.
> 
> But cat'ing your newly created file still returns multi-millions of 0's
> - or whatever.
> 
> Then again, how often do you create a hole in a file-system?
> 
> Or have I got this wrong - at least that's what I thought a hole in the
> file-system was...
I think you are assuming that filesystems compress the data they store on the drive. Most filesystems do not compress data: they simply write it out, organised under filenames and directory trees, one whole block at a time.
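Even a one-byte file takes up a whole block. Assuming GNU ls and du are to hand, a quick sketch:

echo -n x > onebyte
ls -l onebyte
(reports a length of 1 byte)
du -k onebyte
(reports a whole block allocated - typically 1 or 4 K, depending on the filesystem's block size)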

You can test that no compression is going on as follows:

df
(note the number of free blocks)
dd if=/dev/urandom of=randomfile bs=1M count=200
df
(note the number of free blocks again)
dd if=/dev/zero of=zerofile bs=1M count=200
df
(note the number of free blocks once more)
ls -l
(notice that both files are the same size: 209,715,200 bytes)

You will notice that the same amount of filesystem space has been taken whether the file was written with all zeros or with random data - in my case, 205,004 blocks.
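You can double-check the allocation with du, which reports the blocks a file actually occupies rather than its nominal length (a sketch, assuming GNU du):

du -k randomfile zerofile
(both should report roughly 204,800 1K blocks, plus a little filesystem
overhead - neither file contains a hole, so both are fully allocated)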

However, if you compress the files:

gzip randomfile
gzip zerofile
ls -l

you will notice that the file of zeros has been compressed to about 1/1000th of its original size.

File sizes:
randomfile.gz: 209,749,311
zerofile.gz:       203,560

As you can see, the 'compressed' random file is actually larger than the uncompressed random file, by about 34K! Random data is incompressible, so gzip can only add its own header and framing overhead.

The file containing zeros, on the other hand, has been compressed by a factor of roughly 1,000 (209,715,200 / 203,560 is about 1,030).
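You can see the same effect on a smaller scale without writing anything to disk - a quick sketch, assuming GNU dd and gzip:

dd if=/dev/zero bs=1k count=1000 2>/dev/null | gzip -c | wc -c
(prints a figure around a kilobyte - long runs of zeros compress
extremely well)
dd if=/dev/urandom bs=1k count=1000 2>/dev/null | gzip -c | wc -c
(prints a figure slightly over the 1,024,000 bytes that went in -
random data is incompressible, so gzip can only add overhead)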

AFAIK a hole in the filesystem is simply an area of unallocated space. It is irrelevant what that space happens to contain - zeros, random data or whatever - because the filesystem describes it as free for use. If part of a file larger than the hole is allocated to it, the file becomes fragmented: the rest of the file has to be written elsewhere, and reading it becomes slower because the drive head has to seek across the disk surface to find the remaining pieces.
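Incidentally, a hole in an individual file (a sparse file) can be created deliberately by seeking past the end of the file without writing any data - a sketch, assuming GNU dd and du:

dd if=/dev/zero of=sparsefile bs=1M seek=200 count=0
ls -l sparsefile
(reports the nominal length: 209,715,200 bytes)
du -k sparsefile
(reports next to nothing actually allocated - the whole 200MB region
is a hole, and reads from it simply return zeros)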

Regards

Nick.

-- 
Gllug mailing list  -  Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug



