[Gllug] Data migration and electronic archaeology

mriscott at yahoo.co.uk mriscott at yahoo.co.uk
Tue May 21 16:50:10 UTC 2002


> Isn't this the sort of problem that XML is good for? SGML has been
> around for a while, so there's no reason to think that DTDs are going
> to become obsolete overnight, and anyone with a brain can work out
> most of what a DTD means given an example document and a basic
> understanding of regular expressions.
> 
> ASCII isn't going away in a hurry, and Unicode is a standard too, so
> the data should remain easily readable for quite a while. The only
> problem is the degradation and obsolescence of the storage medium if
> the data is saved as XML.
> 
> Of course, XML isn't a "one size fits all" solution, but it does
> help. This just seems to be paraphrasing the mantra "don't get locked
> into proprietary technology". Open, documented standards are good not
> just because of philosophical reasons....

Ah, well - ASCII/XML is easy!

The real fun is decoding obsolete binary formats.

I used to work for a company that specialized in getting old oil well
log data (usually in weird old formats) off old tapes, converting it
to something sensible, and putting it onto shiny new CDs.

We wrote a load of programs to convert the old formats to new ones.

And the real fun was decoding the ones that were written *nearly* to
the specification of a format - we had loads of "special cases" to
decode different companies' interpretations of the format!
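
A minimal sketch of what that kind of special-casing might look like, in
Python. The record layout, the variant names and the quirks are all invented
for illustration; they are not the real well-log formats, just the general
shape of "one spec, several interpretations".

```python
import struct

# Hypothetical record: a depth and a reading, each a 4-byte float.
# The (made-up) spec says big-endian, depth in metres.
QUIRKS = {
    "spec":      {"layout": ">ff", "depth_scale": 1.0},
    # Hypothetical company A wrote the same fields little-endian.
    "company_a": {"layout": "<ff", "depth_scale": 1.0},
    # Hypothetical company B followed the layout but logged depth in feet.
    "company_b": {"layout": ">ff", "depth_scale": 0.3048},
}

def decode_record(raw, variant="spec"):
    """Decode one 8-byte record, applying the quirks for the given variant."""
    quirk = QUIRKS[variant]
    depth, reading = struct.unpack(quirk["layout"], raw)
    return depth * quirk["depth_scale"], reading
```

Keeping the deviations in a table rather than scattered through the decoder
means a newly discovered interpretation is one more entry, not one more
program.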

<rant>
And don't get me started on XML!  We had a "business reason" to convert
some data to XML (i.e. it was a cool buzzword) and make it available
via the web.  XML just isn't up to storing large volumes of data - it
is way too bloated a format - binary data requires a binary format.
</rant>
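
To put a rough number on the bloat, here is a small Python sketch that stores
the same samples as packed 4-byte floats and as XML text. The sample values
and the <log>/<v> element names are assumptions made up for the comparison;
a real schema would usually carry more markup per value, not less.

```python
import struct

def as_binary(samples):
    """Pack samples as big-endian 4-byte floats: 4 bytes per value."""
    return struct.pack(f">{len(samples)}f", *samples)

def as_xml(samples):
    """Wrap each sample in a (hypothetical) <v> element inside <log>."""
    body = "".join(f"<v>{s}</v>" for s in samples)
    return f"<log>{body}</log>".encode("ascii")

# Made-up sample data: 1000 readings stepping by 0.25.
samples = [1234.5 + i * 0.25 for i in range(1000)]

binary = as_binary(samples)
xml = as_xml(samples)
print(len(binary), len(xml))
```

Even with a one-letter tag name, each value costs seven bytes of markup on
top of its digits, against four bytes in total for the packed form.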

The great thing about tapes is that left to themselves, they decay
- and if you have a system to read them to check whether they've decayed,
they decay quicker!  You can do a certain amount by reading them at a
slower speed, etc - but after a while the data is just gone.

Of course, in the "real world", people do insist on using strange binary
formats such as MS Word docs. I wonder how easy it is to read a Word 3 doc
these days? (Given that Word 97 couldn't read Word 6, I doubt it'll read
really old ones!)

Just my unconnected ramblings,

Ian

----
In the force if Yoda's so strong, construct a sentence with words in
the proper order then why can't he?







