[Gllug] Checking file lists and creating PDF

Richard Jones rich at annexia.org
Sat Jul 5 17:51:02 UTC 2008


On Sat, Jul 05, 2008 at 06:25:25PM +0100, Dylan wrote:
> I've got a directory which now contains around 4000 files. They are scans of 
> the pages of my notebooks, diaries and sketches. Each one is jpeg, they are 
> all the same size (dimensions) and are numbered sequentially from 0001.jpg to 
> 4265.jpg. I have two things I need to do with these files:
> 
> A - some files have "gone astray" in the processing, and I need to discover 
> which pages have to be re-acquired. Any suggestions as to how to do this 
> easily without trawling through the ls looking for missing numbers?

Easily (lines that diff marks with '+' are the missing pages):

  ls -1 > /tmp/actual
  seq -w 1 4265 | sed 's/$/.jpg/' > /tmp/expected
  diff -u /tmp/actual /tmp/expected
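
If you would rather get just the list of missing names without reading
diff output, the same check can be sketched as a small shell function.
This is only a sketch: the name missing_pages and its two arguments
(directory, last page number) are made up for illustration, and it
assumes the four-digit zero-padded naming described above.

```shell
# Print every expected page name (0001.jpg .. NNNN.jpg) that does not
# exist in the given directory. "missing_pages" is a hypothetical helper.
missing_pages() {
    dir=$1
    last=$2
    n=1
    while [ "$n" -le "$last" ]; do
        # Build the zero-padded name, e.g. 7 -> 0007.jpg
        name=$(printf '%04d.jpg' "$n")
        [ -e "$dir/$name" ] || echo "$name"
        n=$((n + 1))
    done
}
```

Run from the scan directory, it would be called as:

  missing_pages . 4265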

> B - Once the missing pages have been sorted out, I want to compile them into a 
> set of PDF docs - I think 100 pages per PDF for ease of navigating. What 
> packages should I look at to get this done?

I suspect the commands 'jpegtopnm', 'pnmtops' and 'ps2pdf' are what
you're looking for.  You can use these to assemble the JPEGs first
into a PostScript file, which is then easily converted to PDF.  You'll
want to read the pnmtops manpage in particular very carefully.
Note that PostScript files are just text, and can be concatenated
together with care.
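
As a rough sketch of how the 100-pages-per-PDF batching might look:
the function names batch_pages and batch_to_pdf and the output name
pages-NN.pdf are invented for illustration.  It assumes that your
version of pnmtops accepts several images on stdin and emits one page
per image, which avoids concatenating PostScript by hand; check the
manpage before relying on that.  No scaling or paper-size options are
given here, so the page layout will almost certainly need tuning.

```shell
# Print the page names in batch $1 (batch 1 = 0001-0100, batch 2 =
# 0101-0200, ...), never going past the last page number $2.
batch_pages() {
    batch=$1
    last=$2
    n=$(( (batch - 1) * 100 + 1 ))
    end=$(( batch * 100 ))
    if [ "$end" -gt "$last" ]; then
        end=$last
    fi
    while [ "$n" -le "$end" ]; do
        printf '%04d.jpg\n' "$n"
        n=$((n + 1))
    done
}

# Convert one batch into a single PDF named pages-NN.pdf.
# Assumption: pnmtops turns a multi-image stream into a multi-page
# PostScript file; ps2pdf reads that from stdin when given "-".
batch_to_pdf() {
    batch=$1
    last=$2
    for f in $(batch_pages "$batch" "$last"); do
        jpegtopnm "$f"
    done | pnmtops | ps2pdf - "$(printf 'pages-%02d.pdf' "$batch")"
}
```

For 4265 pages there are 43 batches, so the whole run would be
something like:

  for b in $(seq 1 43); do batch_to_pdf "$b" 4265; done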

Rich.

-- 
Richard Jones
Red Hat
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



