[Gllug] Acceptable HTML
tet at accucard.com
Fri Jun 14 07:37:03 UTC 2002
>Without downloading and installing a massive office suite, if some
>clueless soul has decided to "save as HTML" from Word 2000, what's the
>best method to get it into a state that HTML Tidy is happy to look at?
>
>This particular document is scattered with tables and diagrams, and so
>far neither Abiword nor KWord has been even vaguely pleased to see
>it....
Mozilla (and Mozilla's composer) will happily load it. Of course that
doesn't help you convert it to a usable format...
>ObLinux: I only ask here because I want to be able to edit it properly
>in emacs/vi on my Linux desktop machine. :)
You can get it to a usable state (at least one that can be further fixed
with HTML Tidy) by using:
sed 's,<o:p></o:p>,,g' file.mshtml > file.html
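If you need to do this more than once, the same idea can be wrapped in a small shell function. This is only a sketch: the name strip_office_tags is made up for illustration, and the second expression assumes the export also contains <o:p>&nbsp;</o:p> placeholders, which that document may or may not have. Both expressions carry the g flag so that every occurrence on a line is removed, not just the first.

```shell
# Hypothetical helper (name invented for this example, not from any tool).
# Reads Word-exported HTML on stdin, writes the cleaned markup to stdout.
strip_office_tags() {
    # First expression: drop every empty <o:p></o:p> pair on each line.
    # Second expression (assumption): also drop <o:p>&nbsp;</o:p> placeholders.
    sed -e 's,<o:p></o:p>,,g' \
        -e 's,<o:p>&nbsp;</o:p>,,g'
}

# Usage, with the same file names as above:
#   strip_office_tags < file.mshtml > file.html
```

The result should then be in a state that HTML Tidy can take over from; anything else Word left behind is still in the file.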
Tet
--
Gllug mailing list - Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug