[Gllug] lots of little files
Peter Grandi
pg_gllug at gllug.for.sabi.co.UK
Sun Oct 16 10:48:23 UTC 2005
>>> On Sat, 15 Oct 2005 09:23:20 +0100, Richard Jones
>>> <rich at annexia.org> said:
rich> On Sat, Oct 15, 2005 at 01:46:40AM +0100, Minty wrote:
Minty> I have a little script, the job of which is to create a
Minty> lot of very small files (~1 million files, typically
Minty> ~50-100bytes each). [ ... ]
rich> I too would be seriously tempted to change to using a
rich> database.
To support this rather wise point, since some people have in all
seriousness been offering ''helpful'' suggestions on how to
''optimize'' the performance of a filesystem subtree with
1,000,000 very small files, perhaps it is worth detailing a bit
why sensible people would be «seriously tempted» by a
database...
First, I have appended two little Perl scripts: one creates a
Berkeley DB database of K records of random length varying
between I and J bytes, and the second does N accesses at random
in that database.
I have a 1.6GHz Athlon XP with 512MB of memory and a fairly
standard 80GB 7200RPM disc. The database is created on a 70%
full 8GB JFS filesystem that was made fairly recently:
----------------------------------------------------------------
$ time perl megamake.pl /var/tmp/db 1000000 50 100
real 6m28.947s
user 0m35.860s
sys 0m45.530s
----------------------------------------------------------------
$ ls -sd /var/tmp/db*
130604 /var/tmp/db
----------------------------------------------------------------
Now, after an interval but without a cold start (for good
reasons), 100,000 random fetches:
----------------------------------------------------------------
$ time perl megafetch.pl /var/tmp/db 1000000 100000
average length: 75.00628
real 3m3.491s
user 0m2.870s
sys 0m2.800s
----------------------------------------------------------------
So we got 130MiB of disc space used in a single file, >2500
records per second sustained on insertion over six and a half
minutes, and >500 records per second sustained on fetching over
three minutes. Now, the scripts and the database are just
quick'n'dirty (e.g. the needless creation of a string for every
record inserted), but it looks like a good start to me, with
nice tidy numbers.
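For the record, those rates are simply the record counts divided
by the elapsed times reported by 'time' above:
  1,000,000 inserts / ~389s   = ~2570 inserts per second
    100,000 fetches / ~183.5s =  ~545 fetches per second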
Well, it would be great to compare this with the 1,000,000 small
files scheme, but today I am feeling lazy, so I will offer only
some arithmetic on that:
* The size of the tree will be around 1M filesystem blocks on
most filesystems, whose block size usually defaults to 4KiB,
for a total of around 4GiB, or can be set as low as 512B, for
a total of around 0.5GiB.
* With 1,000,000 files and a fanout of 50, we need 20,000
directories above them, 400 above those and 8 above those.
So 3 directory opens/reads every time a file has to be
accessed, in addition to opening and reading the file.
* Each file access will therefore involve four inode accesses
and four filesystem block accesses, probably rather widely
scattered. Depending on the size of the filesystem block and
whether the inode is contiguous to the body of the file, this
can involve anything between 2KiB and 32KiB of logical IO per
file access.
* It is likely that the logical IOs relating to the two top
levels of the subtree (the 8 and 400 directories) will be
avoided by caching between 200KiB and 1.6MiB of them, but the
other two levels, the 20,000 bottom directories and the
1,000,000 leaf files, are unlikely to be cached.
If the reader does the math it is pretty easy to see how that
compares, on paper, with 130MiB of space in a single file opened
once, a >2500/s insert rate and a >500/s fetch rate...
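To save a bit of mental arithmetic, here is a tiny Perl sketch of
my own back-of-the-envelope figures for the small-files scheme;
it only encodes the assumptions already made above (fanout 50,
512B or 4KiB blocks, four inode plus four block accesses per file
access), so treat it as an illustration rather than a measurement:
----------------------------------------------------------------
use strict;
use warnings;

# Back-of-the-envelope figures for the 1,000,000-small-files scheme.
my $files  = 1_000_000;
my $fanout = 50;

my $l1 = $files / $fanout;      # 20,000 bottom directories
my $l2 = $l1    / $fanout;      #    400 directories above those
my $l3 = $l2    / $fanout;      #      8 directories above those
print "directories: $l1 + $l2 + $l3\n";

for my $block (512,4096)
{
  my $tree_mib  = $files * $block / 2**20;      # leaf file bodies alone
  my $io_best   = 4 * $block / 2**10;           # inode contiguous with the file body
  my $io_worst  = 4 * 2 * $block / 2**10;       # separate inode and data blocks
  my $cache_kib = ($l2 + $l3) * $block / 2**10; # caching the two top levels

  printf "block %4dB: tree ~%.0f MiB, IO per access %d-%d KiB, top levels ~%.0f KiB\n",
    $block,$tree_mib,$io_best,$io_worst,$cache_kib;
}
----------------------------------------------------------------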
----------------------------------------------------------------
use strict;
use warnings;
# This is just a rough test, not a proper script.
package main;
use Fcntl;
use DB_File;

my $type = 'DB_File';

# Arguments: database path, number of records, minimum and maximum
# record size in bytes.
my ($name,$entries,$minsize,$maxsize) = @ARGV;
my $upper = $maxsize - $minsize + 1;

# Hint the expected number of elements to the hash database.
$DB_HASH->{'nelem'} = $entries;

my %db;
tie (%db,$type,$name,(O_CREAT|O_RDWR),0666,$DB_HASH)
  || die "$!: Cannot tie '$name' of type '$type'";

# Insert $entries records of random length between $minsize and
# $maxsize bytes, keyed by the record number in hex.
while ($entries > 0)
{
  my $size  = int ($minsize + rand $upper);
  my $key   = sprintf "%05x",--$entries;
  my $entry = "@" x $size;
  $db{$key} = $entry;
  undef $entry;
  undef $key;
}

untie %db;
----------------------------------------------------------------
use strict;
use warnings;
# This is just a rough test, not a proper script.
package main;
use Fcntl;
use DB_File;

my $type = 'DB_File';

# Arguments: database path, number of records in it, number of
# random fetches to perform.
my ($name,$entries,$fetches) = @ARGV;

$DB_HASH->{'nelem'} = $entries;

my %db;
tie (%db,$type,$name,O_RDONLY,0666,$DB_HASH)
  || die "$!: Cannot tie '$name' of type '$type'";

# Fetch $fetches records chosen at random and accumulate their
# lengths, to report the average at the end.
my $count  = 0;
my $length = 0;
while ($count < $fetches)
{
  my $entryno = int rand $entries;
  my $key     = sprintf "%05x",$entryno;
  my $entry   = $db{$key};
  $length += length $entry;
  undef $key;
  $count++;
}

$length /= $count;
print "average length: $length\n";

untie %db;
----------------------------------------------------------------