[Lancaster] meeting, rtf to html & demos.

andy baxter andy at earthsong.free-online.co.uk
Thu Apr 2 22:30:25 UTC 2009


max* wrote:
>> How about if I make the effort to write a script to turn the xhtml
>> produced by open office's export function into decent html? Would this
>> be useful to you, and do you have any requests for what it should do?
>>     
> i think a lot of the world are waiting for something like this... as martin 
> pointed out htmltidy is now being maintained again, so it may be possible to 
> just string together something?
>   
Not sure if htmltidy would be right for this job. It looks useful for 
other things, but openoffice does write well-formed xhtml, and htmltidy 
seems to be mainly for cleaning up badly formed markup (e.g. an <li> 
with no </li>).

The problem with openoffice's xhtml is that, for example, instead of 
using the standard <i> and <b> tags for italics and bold, it defines CSS 
classes with italic and bold formatting, then applies these classes to 
<p> and <span> tags. E.g. <span class="T43"> might mean 'format in 
italic and bold (but not underline)'.

So even though it is perfectly valid html, which any web browser should 
interpret properly _for a stand-alone document_, it's not so good if you 
want to display the document as part of an existing website using your 
standard stylesheets, or if you want to apply global formatting changes 
to the document.
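
As a minimal sketch of that kind of conversion (not a finished script), 
the Python below reads the class rules out of a stylesheet and turns 
<span> tags whose class means italic and/or bold into plain <i> and <b> 
tags. It uses only the standard library. The function names 
parse_class_styles and classes_to_tags are made up for illustration, and 
the layout of the CSS rules is an assumption about what openoffice 
writes. It also assumes one class per span and ignores the XHTML 
namespace, which a real exported file would need handled.

```python
import re
import xml.etree.ElementTree as ET


def parse_class_styles(css):
    """Map each CSS class name to the plain tags its rule implies.

    Hypothetical helper: only understands simple rules of the form
    .T43 { font-style:italic; font-weight:bold; }
    """
    styles = {}
    for name, body in re.findall(r"\.([\w-]+)\s*\{([^}]*)\}", css):
        tags = []
        if re.search(r"font-style\s*:\s*italic", body):
            tags.append("i")
        if re.search(r"font-weight\s*:\s*bold", body):
            tags.append("b")
        styles[name] = tags
    return styles


def classes_to_tags(root, styles):
    """Rewrite <span class="..."> elements in place as <i>/<b> elements.

    Spans whose class implies neither italic nor bold are left alone.
    """
    # Collect the spans first so the tree is not modified while iterating.
    for span in list(root.iter("span")):
        tags = styles.get(span.get("class", ""), [])
        if not tags:
            continue
        span.attrib.pop("class", None)
        span.tag = tags[0]
        outer = span
        # For "italic and bold", nest the second tag inside the first.
        for extra in tags[1:]:
            inner = ET.Element(extra)
            inner.text = outer.text
            outer.text = None
            for child in list(outer):
                outer.remove(child)
                inner.append(child)
            outer.append(inner)
            outer = inner


if __name__ == "__main__":
    css = ".T43 { font-style:italic; font-weight:bold; } .T1 { font-style:italic; }"
    root = ET.fromstring(
        '<p class="P1">plain <span class="T43">both</span>'
        ' and <span class="T1">slanted</span> text</p>'
    )
    classes_to_tags(root, parse_class_styles(css))
    print(ET.tostring(root, encoding="unicode"))
    # prints: <p class="P1">plain <i><b>both</b></i> and <i>slanted</i> text</p>
```

The paragraph class (P1 here) is left untouched, so a site's own 
stylesheet could still decide what to do with it.
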
> i'd be happy to test out whatever you write on my frequent conversions of the 
> guides and briefings at seeds for change!
>
>   
I've had another look at it and it looks possible, but probably harder 
than I first thought. Will have another think about it next week.
>> Also, have you tried exporting to LaTeX then converting that using
>> latex2html? I had a go at this but latex export in my open office is
>> broken.
>>     
> needs java, and i haven't sorted that out yet.
>   
I have java and it's still broken, even though xhtml export (which also 
uses java) works ok.

andy


