the xhtml 2 debate • 2003 Sep 17 • mahiwaga

I don’t know how I stumbled upon the debate raging over XHTML 2 on my web travels, but I found it really intriguing. Very soap-operatic.

Now, I am not a professional web developer. (Clearly not, if this page is any gauge.) But I have been playing around with markup and with code that reads markup for a while now (I’ve been messing around with Perl and XSLT for nearly three years), and I have come to appreciate the importance of separating content from style. Code doesn’t care about style at all; it just wants to find content, and it’s a lot easier to parse unique tags than it is to use XPath to pick out the right non-unique element by a particular style attribute. Plus, selecting by style is completely non-intuitive. Why are we trying to parse an element whose only unique characteristic is “font-family: sans”? If you had marked it up as <quote>, for example, it would at least give the programmer an idea of what they’re trying to parse. Also, I am a big believer in the idea that the user should get to decide how things look, not the developer. This is why I think the style attribute should die. If I don’t want that headline in Verdana 18 pt, then I should be able to override it, and that’s difficult if you’ve got style attributes lying around all over the place. More importantly, if I’ve got bad vision, I’d like to be able to change that 6 pt font that you insist I read your page in.
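
To make the parsing complaint concrete, here is a rough sketch in Python (not the Perl or XSLT I actually use, just something short). The document, its sample text, and the variable names are all made up for illustration. Selecting by style attribute drags in everything that happens to share the style, while selecting by tag name gets exactly what was asked for.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two paragraphs share a style attribute,
# but only one of them is meant as a quotation.
DOC = """<body>
  <p style="font-family: sans">A quotation marked only by its style.</p>
  <p style="font-family: sans">A caption that happens to share the style.</p>
  <quote>A quotation marked up by what it is.</quote>
</body>"""

root = ET.fromstring(DOC)

# Selecting by style: the predicate matches every element with that
# exact attribute value, whether or not it is a quotation.
by_style = [e.text for e in root.findall(".//*[@style='font-family: sans']")]

# Selecting by tag name: the markup itself says what the element is.
by_tag = [e.text for e in root.findall(".//quote")]

print(len(by_style), "elements matched by style")
print(len(by_tag), "element matched by tag")
```

The style-based query also breaks the moment someone writes the same declaration with different spacing, which is another reason I would rather match on the element name.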

Getting rid of the style attribute is especially important now that it is getting commonplace to browse the web with such things as cell phones and handheld computers, things which typically don’t have access to 3,000 fonts and don’t always have access to all the billions of colors that a desktop or notebook computer does (and don’t have a lot of bandwidth or memory to retrieve or store tons and tons of style declarations). Then there are the more esoteric but equally important (from an accessibility standpoint) cases such as reading markup out loud to the blind. In this case, the style attribute is just useless cruft that obscures what should be done to make a particular segment of text stand out.

The thing that I don’t get about the debate is why the people who want to keep all this cruft in XHTML 2 (most notably the line-break and the aforementioned style attribute) are screaming bloody murder at the people who want to get rid of it. Hell, if you don’t want to use XHTML 2, keep using XHTML 1.1 then. Or HTML 4. Or HTML 3.2. No one is stopping you, and these things will probably always be supported.

Anyway, I think Hixie makes sense. I’ve been dying to find some sensible way to mark up poetry that didn’t involve or, God-forbid, . I mean, maybe this is esoteric, but I think it would be completely legitimate to use an XML parser or XPath expression to search for, oh, let’s say, lines where Anchises is mentioned in The Aeneid. This is feasible if every line is enclosed in <l> elements. This is downright painful and ugly if you use . In fact, you couldn’t do this reasonably if all you had at your disposal was XSLT. And, while is also parseable, that’s a serious pain to have to type when compared to <l>. The same ideas apply with regard to code listings.
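
Here is roughly what I mean by the Anchises search, sketched in Python under some loud assumptions: the <poem> wrapper, the placeholder lines (not actual text from The Aeneid), and the function name lines_mentioning are all invented for illustration. With a full XPath 1.0 engine the query would be something like //l[contains(., 'Anchises')]; Python’s built-in ElementTree only supports a subset of XPath without contains(), so the filtering happens in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one <l> element per line of verse.
# The lines are placeholders, not quotations from the poem.
POEM = """<poem>
  <l>First placeholder line that names Anchises.</l>
  <l>Second placeholder line about something else.</l>
  <l>Third placeholder line, with Anchises again.</l>
</poem>"""

def lines_mentioning(xml_text, name):
    """Return the text of every <l> element that contains `name`."""
    root = ET.fromstring(xml_text)
    # iter("l") walks every <l> element in document order.
    return [l.text for l in root.iter("l") if l.text and name in l.text]

matches = lines_mentioning(POEM, "Anchises")
for line in matches:
    print(line)
```

Because each line of verse is its own element, the search never has to guess where one line ends and the next begins.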

Anyway. I am such a geek.