[Sussex] Converting OpenOffice documents to XHTML

Geoff Teale gteale at cmedltd.com
Wed Nov 10 17:03:41 UTC 2004


Steve,

You might be better off writing some code in OO.org itself to do this -
iterating over the sections is going to be less hairy than trying to do
XSLT or some other XML-level transformation.
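
To give a rough idea of the "iterate and nest" logic, here is a minimal
sketch in Python rather than OO.org's own scripting. It assumes the headings
and paragraphs have already been pulled out of the document as (level, text)
pairs, with level 0 meaning body text; that input format, the build_manual
name and the <manual> wrapper element are my own assumptions for
illustration, and the section/subsection/title/text tags are taken from the
example in your message.

```python
import xml.etree.ElementTree as ET

# Tag used for each heading level, following the example XML in the
# quoted message. Deeper levels are not covered by this sketch.
LEVEL_TAGS = {1: "section", 2: "subsection"}


def build_manual(items):
    """Turn a flat list of (level, text) pairs into nested XML.

    level 0 is body text; levels 1 and 2 are headings.
    """
    root = ET.Element("manual")   # hypothetical wrapper element
    open_elems = {0: root}        # most recent element at each level
    current = root
    for level, text in items:
        if level == 0:
            # Body text belongs to the most recently opened heading.
            ET.SubElement(current, "text").text = text
            continue
        # A new heading closes everything at the same or a deeper level.
        open_elems = {l: e for l, e in open_elems.items() if l < level}
        parent = open_elems[max(open_elems)]
        current = ET.SubElement(parent, LEVEL_TAGS[level])
        ET.SubElement(current, "title").text = text
        open_elems[level] = current
    return root


if __name__ == "__main__":
    items = [
        (1, "Health and Safety"),
        (2, "First Aid Kits"),
        (0, "The first aid kits need to be checked weekly."),
    ]
    print(ET.tostring(build_manual(items), encoding="unicode"))
```

The same loop should carry over to a macro inside OO.org: walk the
paragraphs in order, treat the heading styles as levels, and write out the
nested elements as you go.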


On Wed, 2004-11-10 at 16:53 +0000, Steve Dobson wrote:
> Geoff
> 
> On Wed, Nov 10, 2004 at 04:34:50PM +0000, Geoff Teale wrote:
> > Steve,
> > 
> > More detail please .. I'm now very experienced in coding in
> > OpenOffice.org but I'm not quite clear about what you are trying to do
> > here.
> 
> They currently have a paper-based process backed up by a manual.  They are
> expanding the manual with lots of new material, and we have hit on the
> idea of converting it to a "web application".
> 
> The manual is going to be converted into a set of XHTML pages.
> Rather than have them cut & paste the text into the database in a slow,
> manual process, I would much prefer to extract the structure
> of the document (now in OpenOffice) and load that data into the database.
> 
> For example where the document reads:
> 
>   1.1   Health and Safety
> 
>   1.1.2.  First Aid Kits
> 
>         The first aid kits need to be checked weekly to ensure each is
>         stocked with the appropriate stuff.
> 
> 
> I would like to turn that into something like:
> 
>   <section>
>      <title>Health and Safety</title>
>      <subsection>
>         <title>First Aid Kits</title>
>         <text>
>             The first aid kits need to be checked weekly to ensure each is
>             stocked with the appropriate stuff.
>         </text>
>      </subsection>
>   </section>
> 
> That way I could then parse this new XML using PHP, extract the various
> bits and insert them into the appropriate rows and columns in the database.
> 
> I haven't yet designed the database, so I am flexible on how this is best
> done.  I'm looking here for the best way of doing this.  I've seen
> stuff on XML style conversion, but is this the way to go?
> 
> Steve 
> 
-- 
Geoff Teale <gteale at cmedltd.com>
Cmed Technology




