[Sussex] John's Stupid question of the week!

Tony tony at gigaday.com
Sun Oct 12 07:13:09 UTC 2003


John

From my experience you have a couple of quite serious hills to climb with
this; firstly you need to think about concepts.

If you are going to scan all this stuff, that could be a fairly major
exercise in itself.  Scanners are quite slow, and if you want any sort of
throughput you will need a document feeder.  One of my customers has a
fairly handy gadget with a document feeder that scans straight to CD or
hard disk; I can't remember what it cost, but it must have been at least
£2,000.  Were you thinking of spending this sort of money?

Your next consideration is how to index it all.  You could try OCRing it
but my experience is that this will not work well enough to be useful.  An
alternative would be a keyword system, where you have to invent the
keywords yourself.  I would opt for the keywords, but this may not be such
a minor task either as you will have to at least speed-read the documents
to decide on keywords.

Unless you plan to do this on an industrial scale, I would forget about
scanning and use a filing cabinet and concentrate on devising a keyword
index that helps you to find the paper documents.  Then you can keep your
index on the computer and search for keywords with grep.
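
To make the keyword index idea a bit more concrete, here is a minimal
sketch in Python.  Everything in it is an assumption for illustration: the
"location | title | keywords" line format, the sample entries and the
search_index function are all made up, not an established tool.

```python
# Hypothetical index format (an assumption, not a standard):
# one line per paper document, "location | title | keyword, keyword, ..."
INDEX_TEXT = """\
Drawer 1, folder A | Fractions work sheet | maths, fractions, year 5
Drawer 1, folder B | Magazine article on volcanoes | geography, volcanoes
Drawer 2, folder A | Decimals work sheet | maths, decimals, year 6
"""


def search_index(index_text, keyword):
    """Return (location, title) pairs whose keyword list contains keyword.

    Matching ignores case and surrounding spaces.
    """
    wanted = keyword.strip().lower()
    matches = []
    for line in index_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        location, title, keyword_field = fields
        keywords = [k.strip().lower() for k in keyword_field.split(",")]
        if wanted in keywords:
            matches.append((location, title))
    return matches


if __name__ == "__main__":
    # Show where the paper copy of each matching document lives.
    for location, title in search_index(INDEX_TEXT, "maths"):
        print(f"{title}: {location}")
```

If the same lines were kept in a plain text file (say index.txt, again a
made-up name), "grep -i maths index.txt" would do much the same job with no
programming at all.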

Tony

> Hi List,
>
> It so happens that my Clare, being a teacher, has quite a lot of
> "resources" to assist her in her work. These are mainly paper based, i.e.
> magazine articles and the like, "work sheets" and that sort of thing.
>
> It has crossed my mind that it would be a useful thing to have these on
> "disc", and possibly searchable. Now, even with my "pea sized brain" I
> understand the words data and base, but that's about where it stops.
>
> So, could someone offer advice as to how I would go about compiling a
> database?
>
> In that, I mean, what tools/software would I need to look into? How do
> these things actually work, as in do I need to have "software a" that
> saves a document or image in "format b" at "location c"?
>
> What is the best format to save things in, bearing in mind that I imagine
> the best way to get the "stuff" on disc is going to be scanning, but it
> would need to be in a format that could be edited and/or OCR'd later.
>
> Regards
>
> John D.
>
>




