could it be that xml is going to be the focus of a backlash of wap-ish proportions? what follows is only a smattering of the growing list of articles that point out the perfectly obvious – that you’ve got to shove the semantics somewhere. first, clay shirky discovers that xml is no magic problem solver:

“A Magic Problem Solver is technology that non-technologists believe can dissolve stubborn problems on contact. Just sprinkle a little Java or ODBC or clustering onto your product or service, and, voila, problems evaporate. The downside to Magic Problem Solvers is that they never work as advertised. In fact, the unrealistic expectations created by asserting that a technology is a Magic Problem Solver may damage its real technological value: Java, for example, has succeeded far beyond any realistic expectations, but it hasn’t succeeded beyond the unrealistic expectations it spurred early on.”

and now, dm review is kind enough to point out that xml, in fact, is no silver bullet:

“XML makes it even more imperative than ever that an enterprise understand and resolve the different words and meanings that it uses to refer to things important to it. To illustrate, people may variously be called customers or clients or debtors – all terms used to refer to people or organizations that buy from the enterprise. These different terms indicate semantic differences. An organization must know the different meanings of its data (as meta data) that are used throughout the business. It must then define standard terminology – and establish agreed meaning – as integrated meta data to be used by the enterprise. Only then can these terminology differences be resolved so that semantic integrity is maintained.”
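to make the customers/clients/debtors point concrete, here is a minimal sketch in python. the `CANONICAL` table, the term "buyer", and the sample records are all made up for illustration; nothing here comes from the dm review article. the point is only that the agreement on meaning lives in a table people wrote, not in the xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical agreed vocabulary: each department's element name mapped to
# one standard term. Humans had to negotiate this table; XML did not supply it.
CANONICAL = {"customer": "buyer", "client": "buyer", "debtor": "buyer"}

def normalize(xml_text):
    """Return (standard_term, text) pairs for elements covered by the mapping."""
    root = ET.fromstring(xml_text)
    return [(CANONICAL[el.tag], el.text) for el in root.iter() if el.tag in CANONICAL]

# Two hypothetical departments describing the same organization differently.
sales = "<records><customer>Acme</customer></records>"
billing = "<records><debtor>Acme</debtor></records>"

print(normalize(sales))
print(normalize(billing))
```

without the table, the two records share nothing but the string "Acme".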

but you already knew that because you read the shortest and sweetest treatment of the subject – xml and semantic transparency:

“We may rehearse this fundamental axiom of descriptive markup in terms of a classical SGML polemic: the doubly-delimited information objects in an SGML/XML document are described by markup in a meaningful, self-documenting way through the use of names which are carefully selected by domain experts for element type names, attribute names, and attribute values. This is true of XML in 1998, was true of SGML in 1986, and was true of Brian Reid’s Scribe system in 1976. However, of itself, descriptive markup proves to be of limited relevance as a mechanism to enable information interchange at the level of the machine.

As enchanting as it is to contemplate the apparent ‘semantic’ clarity, flexibility, and extensibility of XML vis-à-vis HTML (e.g., how wonderfully perspicuous XML <bookTitle> seems when compared to HTML <i>), we must reckon with the cold fact that XML does not of itself enable blind interchange or information reuse. XML may help humans predict what information might lie “between the tags” in the case of <trunk> </trunk>, but XML can only help. For an XML processor, <trunk> and <i> and <booktitle> are all equally (and totally) meaningless. Yes, meaningless.

Just like its parent metalanguage (SGML), XML has no formal mechanism to support the declaration of semantic integrity constraints, and XML processors have no means of validating object semantics even if these are declared informally in an XML DTD. XML processors will have no inherent understanding of document object semantics because XML (meta-)markup languages have no predefined application-level processing semantics. XML thus formally governs syntax only – not semantics.”
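a small python sketch of the same point, using the standard library's `xml.etree.ElementTree`. the sample documents and the `MEANING` table are assumptions made up for illustration. the parser accepts every tag name with equal indifference; whatever "meaning" exists is whatever the application decides to attach.

```python
import xml.etree.ElementTree as ET

# Three well-formed fragments that differ only in the element name.
docs = [
    "<bookTitle>Moby-Dick</bookTitle>",
    "<trunk>Moby-Dick</trunk>",
    "<i>Moby-Dick</i>",
]

# The parser handles all three identically: each becomes a name plus some text.
parsed = [ET.fromstring(d) for d in docs]

# Hypothetical application-level table. This is where the semantics get shoved.
MEANING = {"bookTitle": "title of a book"}

def interpret(element):
    """Look up what this application takes an element to mean, if anything."""
    return MEANING.get(element.tag, "unknown")

for element in parsed:
    print(element.tag, "->", interpret(element))
```

note that `bookTitle` is understood here only because someone typed it into `MEANING`; rename the key and the parser will not notice or care.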

none of this is meant to imply that there isn’t a productive role for xml in the so-called next-generation web.
