[Gllug] lots of little files

Peter Grandi pg_gllug at gllug.for.sabi.co.UK
Sun Oct 16 13:39:55 UTC 2005


>>> On Sun, 16 Oct 2005 11:48:23 +0100,
>>> pg_gllug at gllug.for.sabi.co.UK (Peter Grandi) said:

>>> On Sat, 15 Oct 2005 09:23:20 +0100, Richard Jones
>>> <rich at annexia.org> said:

rich> On Sat, Oct 15, 2005 at 01:46:40AM +0100, Minty wrote:

Minty> I have a little script, the job of which is to create a
Minty> lot of very small files (~1 million files, typically
Minty> ~50-100bytes each). [ ... ]

rich> I too would be seriously tempted to change to using a
rich> database.

> To support this rather wise point, [ ... ] First, I have
> appended two little Perl scripts (each rather small), one
> creates a Berkeley DB database of K records of random length
> varying between I and J bytes, the second does N accesses at
> random in that database.
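
For readers without those scripts to hand, here is a rough sketch of
the same idea. It is not the Perl code from the earlier message. It is
written in Python, and it uses the standard library's 'dbm.dumb' module
as a stand-in, which is a slow pure-Python key/value store and not
Berkeley DB. The function names 'build_db' and 'random_reads' are made
up for illustration; the parameters k, i, j and n correspond to K, I, J
and N above.

```python
import dbm.dumb
import os
import random
import tempfile


def build_db(path, k, i, j, seed=0):
    """Create a fresh database of k records, each between i and j bytes long."""
    rng = random.Random(seed)
    # Flag "n" always creates a new, empty database.
    with dbm.dumb.open(path, "n") as db:
        for n in range(k):
            length = rng.randint(i, j)  # inclusive on both ends
            value = bytes(rng.getrandbits(8) for _ in range(length))
            db[str(n).encode()] = value


def random_reads(path, k, n, seed=1):
    """Do n lookups of randomly chosen keys; return the total bytes read."""
    rng = random.Random(seed)
    total = 0
    with dbm.dumb.open(path, "r") as db:
        for _ in range(n):
            key = str(rng.randrange(k)).encode()
            total += len(db[key])
    return total


if __name__ == "__main__":
    # Assumption: a throwaway temporary directory is good enough for a demo.
    path = os.path.join(tempfile.mkdtemp(), "records")
    build_db(path, k=1000, i=50, j=100)
    print(random_reads(path, k=1000, n=5000))
```

The point is only that a million small records can live in a handful of
files rather than in a million inodes. Timings from 'dbm.dumb' say
nothing about what Berkeley DB itself would do.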

Uhm, rereading all this: while one or two tiny Perl scripts may
suffice, I now realize that the problem ''mostly read-only access
to many or very many small records'' is pretty much what LDAP
was designed for. It has some convenient aspects that might
help here, and just about any language out there has an LDAP
client library.

Perhaps a look at OpenLDAP might help here: it uses Berkeley DB
too (and a more recent version than 'DB_File' uses), so it is
likely to be nearly as fast, though with some extra overhead.

[ ... ]

-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



