[Lancaster] meeting, rtf to html & demos.
andy baxter
andy at earthsong.free-online.co.uk
Thu Apr 2 22:30:25 UTC 2009
max* wrote:
>> How about if I make the effort to write a script to turn the xhtml
>> produced by open office's export function into decent html? Would this
>> be useful to you, and do you have any requests for what it should do?
>>
> i think a lot of the world are waiting for something like this... as martin
> pointed out htmltidy is now being maintained again, so it may be possible to
> just string together something?
>
Not sure htmltidy would be right for this job. It looks useful for
other things, but OpenOffice already writes well-formed xhtml, and htmltidy
seems to be mainly for cleaning up badly-formed markup (e.g. a
<li> with no </li>).
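
To illustrate the distinction, here is a rough Python sketch that just checks whether a fragment is well-formed. The helper name is_well_formed is made up for this example, and it only uses the standard library's XML parser, not htmltidy itself. The point is that OpenOffice-style markup passes this kind of check already, so a tidy-up pass would have little to fix.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# The kind of thing a tidy-up tool is aimed at: an unclosed <li>.
print(is_well_formed("<ul><li>one</ul>"))
# Class-based markup is ugly but well-formed.
print(is_well_formed('<p><span class="T43">text</span></p>'))
```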
The problem with openoffice's xhtml is that, for example, instead of
using the standard <i> and <b> tags for italics and bold, it defines CSS
classes with italic and bold formatting, then applies these classes to
<p> and <span> tags. E.g. <span class="T43"> might mean 'format in
italic and bold (but not underline)'.
So even though it is perfectly valid xhtml, which any web browser should
interpret properly _for a stand-alone document_, it's not so good if you
want to display the document as part of an existing website using your
standard stylesheets, or if you want to apply global formatting changes
to the document.
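
As a very rough idea of what such a script might do, here is a minimal Python sketch. It is not a finished tool: the function names are invented for illustration, the CSS parsing is a crude regular expression that only looks for font-weight: bold and font-style: italic, and it assumes a fragment without the XML namespace that a real exported file would carry (which would need extra handling). It reads the class definitions, then rewrites each <span> whose class means bold and/or italic into plain <b> and <i> tags.

```python
import re
import xml.etree.ElementTree as ET


def parse_css_classes(css):
    """Map each class name to the plain tags its declarations imply."""
    mapping = {}
    for name, body in re.findall(r"\.([\w-]+)\s*\{([^}]*)\}", css):
        tags = []
        if re.search(r"font-weight\s*:\s*bold", body):
            tags.append("b")
        if re.search(r"font-style\s*:\s*italic", body):
            tags.append("i")
        mapping[name] = tags
    return mapping


def rewrite_span(span, tags):
    """Turn <span class="..."> into nested plain tags, e.g. <b><i>...</i></b>."""
    del span.attrib["class"]
    span.tag = tags[0]
    outer = span
    for tag in tags[1:]:
        # Move the text and children of the outer tag into a new inner tag.
        inner = ET.Element(tag)
        inner.text = outer.text
        outer.text = None
        for child in list(outer):
            outer.remove(child)
            inner.append(child)
        outer.append(inner)
        outer = inner


def simplify(fragment, css):
    """Replace class-based bold/italic spans with <b> and <i> (sketch only)."""
    mapping = parse_css_classes(css)
    root = ET.fromstring(fragment)
    # Collect the spans first, since the tree is modified while we go.
    for span in list(root.iter("span")):
        tags = mapping.get(span.get("class", ""), [])
        if tags:
            rewrite_span(span, tags)
    return ET.tostring(root, encoding="unicode")


css = ".T43 { font-style: italic; font-weight: bold; }"
fragment = '<p>Some <span class="T43">important</span> text</p>'
print(simplify(fragment, css))
# prints: <p>Some <b><i>important</i></b> text</p>
```

Spans whose class carries other formatting (fonts, sizes, underline) are left alone here; deciding what to do with those is probably where the real work is.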
> i'd be happy to test out whatever you write on my frequent conversions of the
> guides and briefings at seeds for change!
>
>
I've had another look at it and it looks possible, but probably harder
than I first thought. Will have another think about it next week.
>> Also, have you tried exporting to LaTeX then converting that using
>> latex2html? I had a go at this but latex export in my open office is
>> broken.
>>
> needs java, and i haven't sorted that out yet.
>
I have java and it's still broken, even though xhtml export works ok
(which also uses java.)
andy