[Glastonbury] next meeting

Martin Wheeler mwheeler at startext.co.uk
Wed Dec 1 17:21:25 GMT 2004


On Wed, 1 Dec 2004, Ian Dickinson wrote:

> The main difference is that HTML is not strictly
> conformant to the XML standard.

A quaint way of putting it (given that the first HTML DTD was devised in 
late 1991, and proposals for XML didn't appear on the table for a further 
four or five years -- after webfolks had sussed out what they really 
wanted from a hypertext markup language, and what had to be done to stop 
Microsoft from riding roughshod over international standards); but yes, 
XML was designed to tighten up some of the looseness inherent in Tim 
Berners-Lee's first attempt.  [A looseness inherited from the mother 
metalanguage SGML, btw.  E.g. optional closing tags. Etc.]


>  XML insists on a
> number of syntactic rules

Err . . no.  Not quite.  More exactly, //SGML// insists on correct syntax 
-- and XML is defined as a restricted subset of SGML, so as to be strongly 
conformant.  XML was, in part, a response to the ghastly mess created by the 
browser war between 
Microsoft and Netscape, where each outdid the other in trying to produce 
browsers that would read almost anything, marked up correctly or not.
Hundreds of thousands of web page editors have suffered as a result, 
having no clear idea at all of what is good hypertext markup, and what is 
syntactic garbage.  (See any local college course. Some of the trash being 
taught by people who have learned markup from reading other folks' [bad] 
HTML is almost beyond belief.)


>   For
> example, this would be valid HTML:
>
> <p>one paragraph

  . . . BUT ONLY IF THE DOCTYPE DECLARATION IN THE FIRST LINE (you *do* 
always include a DOCTYPE in your markup, don't you?) SPECIFIES AN HTML DTD 
WHICH ALLOWS THIS.
Some don't.  (Some of Softquad's DTDs don't, for example.  Others do :)
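
[To make the DOCTYPE point concrete, here is a minimal sketch, assuming 
Python 3 and only its standard library.  It reports which DOCTYPE, if any, 
a page declares.  The names DoctypeFinder and declared_doctype are mine, 
purely for illustration, and the HTML 4.01 Strict declaration is just one 
example of a DTD that makes the closing </p> optional.  The sketch only 
reads the declaration; it does not validate anything against the DTD.]

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Remember the first DOCTYPE declaration seen in the markup."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # handle_decl receives the text between "<!" and ">".
        if self.doctype is None and decl.upper().startswith("DOCTYPE"):
            self.doctype = decl


def declared_doctype(markup):
    """Return the DOCTYPE declaration text, or None if there is none."""
    finder = DoctypeFinder()
    finder.feed(markup)
    finder.close()
    return finder.doctype


with_dtd = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<p>one paragraph"
)
without_dtd = "<p>one paragraph"

print(declared_doctype(with_dtd))     # the declaration text
print(declared_doctype(without_dtd))  # None: no DTD, so "valid" means nothing
```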

> <p><b>bold <i>bold italic</B></i>

//Sorry, but I really don't know of *any* DTD that would permit the above 
crossover between tags.//

(I know of plenty of browsers that will read it without demur though.
You may even have a crap parser that will accept it without boaking.  That 
still doesn't make it valid code for any DTD that I know of, though.)
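
[A small sketch of that difference, again assuming Python 3 and its 
standard library only.  html.parser stands in for a forgiving browser: it 
reports the tags in the order it meets them and never complains.  
xml.etree.ElementTree stands in for a strict parser, though note that it 
checks XML well-formedness only, not validity against any DTD.  The names 
TagLogger and is_well_formed are illustrative, not anybody's real tool.]

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Lenient parser: records open/close events, never rejects input."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


def is_well_formed(fragment):
    """Strict check: True only if the fragment parses as XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


crossed = "<p><b>bold <i>bold italic</B></i>"
nested = "<p><b>bold <i>italic</i></b></p>"

lenient = TagLogger()
lenient.feed(crossed)
lenient.close()

print(lenient.events)           # tags accepted in crossed order, no error
print(is_well_formed(crossed))  # False
print(is_well_formed(nested))   # True
```

The lenient parser swallows the crossed tags without a murmur, which is 
exactly why so much of this markup survives in the wild.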


> But it breaks lots of XML rules, so to be XML
> conformant it would have to be re-written:
>
> <p>one paragraph</p>
> <p><b>bold <i>italic</i></b></p>

Ummm . . what XML DTD are you citing?
For DocBook XML markup that would have to be: <para> .. </para>
And: <emphasis role="strong"> .. </emphasis> .. etc.
Or are you thinking of W3C XHTML, not any specific XML DTD?
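
[To illustrate: XML on its own names no elements at all; the vocabulary 
comes from whichever DTD you pick.  A minimal sketch, assuming Python 3 and 
its standard library.  Both fragments below are equally well-formed XML; 
they simply belong to different vocabularies.  The DocBook fragment is my 
own rough equivalent built from the elements mentioned above, and 
element_names is a hypothetical helper.]

```python
import xml.etree.ElementTree as ET

xhtml_style = "<p><b>bold <i>italic</i></b></p>"
docbook_style = (
    '<para><emphasis role="strong">bold '
    "<emphasis>italic</emphasis></emphasis></para>"
)


def element_names(fragment):
    """List element names in document order; raises ParseError if ill-formed."""
    return [element.tag for element in ET.fromstring(fragment).iter()]


print(element_names(xhtml_style))    # ['p', 'b', 'i']
print(element_names(docbook_style))  # ['para', 'emphasis', 'emphasis']
```

Whether either fragment is *valid* depends entirely on the DTD it is 
checked against, which a well-formedness parser like this one never 
consults.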


> In fact XHTML goes further even than that, in that it
> has removed some HTML presentation elements and
> attributes.

Aaarrrggh!
[Look -- I have a heart condition, right?  If there's one thing I *cannot* 
stand, it's misrepresentation of fact in the history of HTML markup.]
So ...
In the full spirit of SGML, HTML markup -- before it was buggered up by 
clueless, ignorant users steeped in applemack arty-fartiness where the 
content/structure dichotomy simply doesn't exist and cannot even be 
comprehended -- NEVER purported to represent presentational information.
Got it?
IT HAS ALWAYS BEEN A MEANS OF STRUCTURAL REPRESENTATION.
RIGHT FROM DAY ONE.
Anything else was introduced as an attempt by clueless gits who didn't 
understand the first thing about structural markup, and thought that 18pt 
red braggadocio on an orange background was the best thing since ponytails 
and inch-wide red braces.
[Sheesh!  Quick, where's me pills?]


> So that the XHTML document defines the
> information structure and content, while the
> presentation (font style, size, colour, background
> colours, etc) is entirely managed through style
> sheets.

Oh, Buddha give me strength!
This was //ALWAYS// the case, in *ANY* SGML-derived markup language.  From 
the very first HTML DTD.
The concept was apparently just too fine for most users to grasp, however.
Particularly if they came from the applemack-designer camp.
Thus the myth that you faithfully -- but erroneously -- reproduce above 
was born.


> Historically, browser engines like IE and Netscape
> have been very tolerant of ill-formed HTML

Deliberately so.  In a blatant attempt to wreck an international standard, 
and wrest control of ownership of markup standards into commercial 
proprietary hands.
(In total despair, Tim Berners-Lee wrote a wonderful essay on the 
subject.)


>  But the W3C
> is encouraging all designers to abide by the much
> stricter XHTML rules (using a validator to check
> conformance as necessary), so that content *should* be
> presented consistently irrespective of the browser
> platform.

The W3C has *ALWAYS* encouraged markup editors to use conforming markup -- 
HTML or XHTML.  And it has always provided an online validator.


> Beyond XHTML, as Martin alluded to, there are further
> rules about using markup in a way that assists
> disabled users.  These rules now have the force of law
> in the USA - there's a deadline (I forget when it is)
> for certain classes of public information, e.g. on
> corporate and government sites, to be conformant to
> the rules for accessibility.

Andy -- would you care to comment on the *reality* of this?  (Particularly 
in the UK, where last year the Govt. gave Microsoft £18M to produce an 
e-government site that was only readable by IE.  //True.//  It took six 
months to change it.)

There is no legal requirement in the UK -- but I have a copy of the 
government's guidelines for local government and public service websites 
if anyone's interested.  They were used for the creation of the 
glastonbury.gov.uk site.


> In my view, it should be a pretty important criterion
> when selecting a CMS whether that tool generates good,
> clean, standards-conformant XHTML.  Your site will
> still be usable if the tool doesn't do that, but
> you're saving yourself a whole bunch of future trouble
> if it does.

Agreed absolutely.
Getting back to the original point -- phpWebSite produces markup 
conforming to W3C XHTML.  Maybe I should have specified this in the 
original posting.

I suspect Plone does too -- but nowhere do they claim so.
And the stuff Mambo produced a few months ago was a bit iffy, to say the 
least.

Cheers,
-- 
Martin Wheeler   -   StarTEXT / AVALONIX - Glastonbury - BA6 9PH - England
mwheeler at startext.co.uk                http://www.startext.co.uk/mwheeler/
GPG pub key : 01269BEB  6CAD BFFB DB11 653E B1B7 C62B  AC93 0ED8 0126 9BEB
       - Share your knowledge. It's a way of achieving immortality. -

