[Gllug] pdf conversion to HTML
Richard Cohen
richard at vmlinuz.org
Thu Jan 31 13:20:01 UTC 2002
On Thu, 31 Jan 2002, will wrote:
> I am sure there is one but I can't remember the name. Is there a
> command line tool for converting pdf files to HTML?
Um...
pdf2html? :-)
It's based on xpdf, but the version of pdf2html out there is built against
an old version of xpdf. I hacked it into the source of an up-to-date xpdf
because I wanted a) better PDF parsing and b) to turn off recognition of the
"don't copy" bit in the headers.[*]
I can't actually find a homepage for the pdf2html I've got, and my home
machine is turned off, so I can't look at it right now. There does appear
to be *another* pdf2html, which simply dumps the PDF to images (one image
per page) and generates HTML that displays them. That's *not* the one I've
got at home, which actually converts PDF text to HTML text, and does it
pretty well.
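
For illustration only, the image-per-page approach could be sketched roughly
like this. It is not the code of either pdf2html; the function name
build_page_viewer and the image file names are made up, and it assumes the
page images have already been produced by some other tool.

```python
from html import escape


def build_page_viewer(image_files, title="PDF pages"):
    """Return an HTML document that shows one image per PDF page, in order.

    Hypothetical helper: image_files is a list of already-rendered page
    images (one per page); nothing here reads or parses the PDF itself.
    """
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
    ]
    for number, name in enumerate(image_files, start=1):
        # Escape the file name so characters like & or " cannot break the markup.
        src = escape(name, quote=True)
        parts.append(f'<p><img src="{src}" alt="Page {number}"></p>')
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    # Example file names only; substitute whatever the image dump produced.
    print(build_page_viewer(["page-1.png", "page-2.png"]))
```

The result looks like the original page but contains no text that can be
searched or reflowed, which is why it is no use for reading on a small
screen.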
> Will.
Cheers
Richard
[*] I don't think there's anything wrong with buying a PDF and 'ripping' it
to HTML so I can view it on my palmtop. For personal use only...
--
Gllug mailing list - Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug