[Malvern] Document Storage
Andrew Morris
zaglabod at btinternet.com
Mon Sep 17 11:24:35 BST 2007
Ian,
From experience, starting from scratch with a pile of past documents is
time-consuming ... you really don't realise just how much time it takes
to scan in a multi-page document. Start with current documents, as they
appear, and add in a bunch of back documents to each session. With time,
you will clear the backlog, and build a library.
Any scanned document makes for a much bigger file than an original print
file. Take any bill PDF from BT or equivalent, print it and scan the
result. The scan is easily 10-20 times the file size. OCR doesn't cope
easily with unusual or decorative fonts. Plus, OCR'ed stuff cannot be
used in legal circumstances, but scanned stuff can be; it's how some
evidence has been stored for years (used to be microfilm, now it's
high-res digital).
I scanned everything into PDF on an HP PSC 1210 all-in-one scanner and
printer, which generally produced about 500 KB to 1 MB per page. That
sometimes compressed a bit in 7zip, but not always. This continued until
the scanner software hit some sort of timeout (after about 3 years), at
which point all the HP Director stuff refused to start, independent of
the platform it was loaded on (98, 98SE, ME, XP, Vista). Now I can only
do JPG from the scanner button. I don't know why; HP refused to talk to
me once I had asked the question and given the age of the scanner. It
works fine with XSane on Linux, but only up to 600 dpi. Windows would
alias up to 2400.
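
To put the 500 KB to 1 MB per page figure in context, here is a rough
sketch of how many scanned pages fit on one disc. The disc capacities
(700 MB for a CD, 4.7 GB for a single-layer DVD) are nominal values I am
assuming, and filesystem overhead is ignored, so treat the results as
upper bounds.

```python
# Nominal disc capacities in bytes (assumed, not from the post above).
CD_BYTES = 700 * 1000**2
DVD_BYTES = 4700 * 1000**2

def pages_per_disc(disc_bytes, page_bytes):
    """Whole pages that fit on a disc, ignoring filesystem overhead."""
    return disc_bytes // page_bytes

for name, capacity in (("CD", CD_BYTES), ("DVD", DVD_BYTES)):
    fewest = pages_per_disc(capacity, 1_000_000)  # 1 MB per page
    most = pages_per_disc(capacity, 500_000)      # 500 KB per page
    print(f"{name}: roughly {fewest} to {most} pages")
```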
You can cut the file size by scanning at a resolution that suits how you
will use the document. If you want to be able to reprint, then you need
300 dpi; if you only want to read on a screen, then 150 dpi is adequate,
even allowing for future resolution upgrades on displays.
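
The reason resolution matters so much is that the pixel count grows with
the square of the dpi, so halving the resolution cuts the raw image to
about a quarter. A minimal sketch, assuming an A4 page scanned in 8-bit
greyscale (both are my assumptions; the final PDF or JPG will be smaller
once compressed):

```python
# Assumed page size: A4 in inches.
A4_WIDTH_IN = 8.27
A4_HEIGHT_IN = 11.69

def raw_page_bytes(dpi, bytes_per_pixel=1):
    """Uncompressed size of one scanned page at the given resolution."""
    width_px = round(A4_WIDTH_IN * dpi)
    height_px = round(A4_HEIGHT_IN * dpi)
    return width_px * height_px * bytes_per_pixel

for dpi in (150, 300, 600):
    size_mb = raw_page_bytes(dpi) / 1_000_000
    print(f"{dpi} dpi: {size_mb:.1f} MB uncompressed")
```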
I double-store everything current (up to 24 months) on HD and CD/DVD.
Burn the CD/DVD at low speed for long-term reliability (as you would do
for any archival stuff). Prune the HD at 24 months.
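
The 24-month prune can be sketched as a script that lists the candidates
rather than deleting them. This assumes the file's modification time is a
fair stand-in for the document's age and uses an average month length;
the function names are made up for illustration.

```python
import time
from pathlib import Path

# Average month length in seconds (an approximation).
SECONDS_PER_MONTH = 30.44 * 24 * 3600

def is_stale(mtime, now, months=24):
    """True if a timestamp is more than `months` months before `now`."""
    return mtime < now - months * SECONDS_PER_MONTH

def files_older_than(root, months=24, now=None):
    """List files under `root` last modified more than `months` ago."""
    now = time.time() if now is None else now
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_stale(p.stat().st_mtime, now, months)
    )
```

Calling files_older_than() on the archive directory returns the files to
check against the CD/DVD copy before removing them from the HD.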
Andy
Ian Pascoe wrote:
> Morning Folks
>
> In the list's opinion which is the best way to store documents?
>
> In particular, as my own filing system is, well, non-existent, I was thinking
> about scanning all necessary documents and then storing them either to HD or
> CD / DVD.
>
> I've been trying to work out in my own mind what would be the better way to
> store these scanned documents that will maintain the clarity and be of
> minimal size.
>
> So far it's looking like storing them as a tiff image, but I'm not sure
> whether it's worth the time to push them through an OCR tool and into an
> appropriate document format. Either raw or compressed through something
> like 7Zip into a self extracting file, or such like
>
> It is not necessary for the stored images to be "legal" copies but are
> merely there for my own reference.
>
> Thoughts, apart from sorting out my paper filing system!
>
> E