[Wylug-help] PS generated file -> pdftotext -layout and images

Gary Stainburn gary.stainburn at ringways.co.uk
Wed Oct 7 09:40:18 UTC 2015


I've not been able to get anywhere with pdftotext or PDF OCR software, so I'm 
now looking at OMR (optical mark recognition) using image comparison.

So far, I have:
- converted each page to PPM (pdftoppm)
- cropped each line of each page into a separate PNG file
- made two reference images for each field: one containing the label and a 
  tick, the other containing the label and a cross

Is there an easy way to detect if the label+tick PNG image is contained in the 
line PNG? (it doesn't have to be a PNG)
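
To make the question concrete, here is a rough sketch of the kind of check I 
mean: slide the reference image over the line image and take the position with 
the smallest mean absolute pixel difference. It is pure Python and assumes the 
images are 8-bit greyscale binary PGM (which pdftoppm -gray can produce) rather 
than PNG. The function names and the max_diff threshold of 10 are made up for 
illustration and would need tuning against real pages. Tools such as OpenCV's 
matchTemplate or ImageMagick's compare -subimage-search do the same sort of 
search much faster.

```python
def parse_pgm(data: bytes):
    """Parse an 8-bit binary PGM (P5) into a list of pixel rows."""
    tokens = []
    pos = 0
    # Header is: magic, width, height, maxval; '#' starts a comment line.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates maxval from the pixels
    magic, width, height, maxval = (
        tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3]))
    if magic != b"P5" or maxval > 255:
        raise ValueError("this sketch only handles 8-bit binary PGM (P5)")
    pixels = data[pos:pos + width * height]
    return [list(pixels[r * width:(r + 1) * width]) for r in range(height)]


def best_match(image, template):
    """Slide template over image; return (mean_abs_diff, x, y) of best fit."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    if th > ih or tw > iw:
        raise ValueError("template is larger than the image")
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            total = 0
            for ty in range(th):
                row = image[y + ty]
                trow = template[ty]
                total += sum(abs(row[x + tx] - trow[tx]) for tx in range(tw))
            score = total / (th * tw)
            if best is None or score < best[0]:
                best = (score, x, y)
    return best


def classify(line, tick, cross, max_diff=10.0):
    """Return True for a tick, False for a cross, None if neither fits.

    max_diff is an assumed threshold on the mean absolute difference.
    """
    tick_score = best_match(line, tick)[0]
    cross_score = best_match(line, cross)[0]
    if min(tick_score, cross_score) > max_diff:
        return None
    return tick_score < cross_score
```

Usage would be something like classify(parse_pgm(open("line.pgm", "rb").read()), 
tick_pixels, cross_pixels), with the two reference images loaded the same way. 
The brute-force search is slow on large images, so cropping the line to the 
area where the field is expected would help.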

On Tuesday 06 October 2015 11:28:11 Gary Stainburn wrote:
> Hi folks,
>
> I've managed to get pseudo printers working with CUPS to allow Windows PCs
> to print a document which then gets converted to PDF, has stationery
> applied to it, and the result emailed back to the requesting user.
>
> I'm now on phase 2 of the project which uses
>
> pseudo printer -> ps2pdf -> pdftotext -layout
>
> to generate a text file which then is used to import data into my systems.
> This method is because the system that generates the report doesn't have an
> export facility.
>
> My only problem with this is that the generating system uses images of a
> tick and a cross to indicate some boolean values.
>
> Can anyone suggest how I can convert these ticks and crosses back to their
> boolean value?