[Gllug] Email programs and mbox format

Daniel P. Berrange dan at berrange.com
Mon Mar 15 11:45:58 UTC 2004


On Mon, Mar 15, 2004 at 11:26:03AM +0000, Sharon Kimble wrote:
> I am currently using Sylpheed-Claws as my main email program and it is
> holding a total of 53,682 emails [which I need to keep for research
> purposes] in mbox format, although the number is constantly growing. The
> only problem is that although I imported one mbox into SC for each
> separate folder, SC then dissected it into lots of little mboxes, which is
> currently taking up about 6.5 GB of space. When there is just one
> mbox for each folder they only take up about 700 MB of space. Quite a
> difference!
> 
> I'm now looking for another email program that can import one mbox for
> each folder and will use far less space on my hard drive. I've
> previously used 'Evolution' but am extremely wary of using it again
> because last time its database became corrupted and I ended up losing
> about a month's worth of emails.
> 
> Can anyone advise me in getting another program that can import and
> export mbox files, or assist in getting more of my hard drive reclaimed?

I've come to the conclusion that mail programs and filesystem folders are
not well suited to extracting information from huge volumes of email.
So something that may be of interest is a Perl program I've been
writing to import email into a PostgreSQL database. The primary schema
is fairly highly normalized, but I've added a couple
of denormalizations for convenience. I'm also using the TSearch
v2 data type for full-text searching of body text & headers.

  http://berrange.com/~dan/Mail-Archive.tar.gz
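
To give a flavour of the approach, here is a minimal sketch in Python
rather than Perl, using SQLite from the standard library in place of
PostgreSQL so that it runs without a server. It is not the code in the
tarball: the table names (address, message, recipient) and the helpers
address_id, plain_body and import_mbox are made up for illustration,
and the TSearch full-text index is left out entirely.

```python
import mailbox
import sqlite3
from email.utils import getaddresses

# Hypothetical normalized schema: each address is stored once and
# referenced by id from the message and recipient tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS address (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS message (
    id         INTEGER PRIMARY KEY,
    message_id TEXT UNIQUE,
    sender_id  INTEGER REFERENCES address(id),
    subject    TEXT,
    date       TEXT,
    body       TEXT
);
CREATE TABLE IF NOT EXISTS recipient (
    message_id INTEGER NOT NULL REFERENCES message(id),
    address_id INTEGER NOT NULL REFERENCES address(id),
    kind       TEXT NOT NULL,
    PRIMARY KEY (message_id, address_id, kind)
);
"""


def address_id(db, email_addr):
    """Return the row id for an address, inserting it if it is new."""
    email_addr = email_addr.strip().lower()
    db.execute("INSERT OR IGNORE INTO address (email) VALUES (?)", (email_addr,))
    row = db.execute("SELECT id FROM address WHERE email = ?", (email_addr,))
    return row.fetchone()[0]


def plain_body(msg):
    """Concatenate the text/plain parts of a message into one string."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset name in the header
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def import_mbox(db, path):
    """Load every message from one mbox file; return how many were added."""
    db.executescript(SCHEMA)
    box = mailbox.mbox(path, create=False)
    imported = 0
    try:
        for msg in box:
            senders = getaddresses([str(msg.get("From", ""))])
            sender = senders[0][1] if senders and senders[0][1] else "unknown"
            mid = msg.get("Message-ID")
            cur = db.execute(
                "INSERT OR IGNORE INTO message "
                "(message_id, sender_id, subject, date, body) "
                "VALUES (?, ?, ?, ?, ?)",
                (
                    str(mid) if mid is not None else None,
                    address_id(db, sender),
                    str(msg.get("Subject", "")),
                    str(msg.get("Date", "")),
                    plain_body(msg),
                ),
            )
            if cur.rowcount == 0:
                continue  # Message-ID already stored, skip the duplicate
            row_id = cur.lastrowid
            for kind in ("To", "Cc"):
                headers = [str(h) for h in msg.get_all(kind, [])]
                for _name, addr in getaddresses(headers):
                    if addr:
                        db.execute(
                            "INSERT OR IGNORE INTO recipient VALUES (?, ?, ?)",
                            (row_id, address_id(db, addr), kind.lower()),
                        )
            imported += 1
    finally:
        box.close()
    db.commit()
    return imported
```

Something like import_mbox(sqlite3.connect("mail.db"), "folder.mbox")
(both file names are placeholders) would load one folder. Once the mail
is in tables, questions like "who wrote to whom, and how often" become
plain SQL joins rather than grepping through folders.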

Dan.
-- 
|=-               http://www.berrange.com/~dan/gpgkey.txt             -=|
|=-   berrange at redhat.com  -  Daniel Berrange  -  dan at berrange.com    -=|