XML vs JSON

It seems that there is a lot of inappropriate use of XML today. Over the last decade XML has become a huge buzzword, promising greater data interchangeability. Many have jumped on the bandwagon to serialize data with XML; however, it seems that most have missed the point of XML and used it in completely the wrong ways.

Let me just say right now: XML is not (or should not be) a data structure serialization format. It is markup (that is what the "M" stands for!), and markup is not intended to define data structures; it is intended to give semantic meaning to parts of documents. XML has inherent ambiguity in describing data structures. The prime example is that without a DTD (or schema) it is impossible to determine whether a given field is an array (capable of repeating values) if it only has a single value.

Let me give an example of appropriate XML usage:

<h1>XML vs JSON</h1>

<p>XML is <em>not</em> a data structure serialization format!</p>

This is a basic document in which the XML has given semantic meaning to the different texts. Compare this with the following example where I have constructed an object programmatically:

var page = new Object();

page.title = "XML vs JSON";

page.body = "XML is not a data structure serialization format!";

var relatedDocuments = new Array();

relatedDocuments.push("http://www.json.org");

page.relatedDocuments = relatedDocuments; // attach the array to the page object

Now if we serialize this object with XML, the simplest approach would probably look something like this:

<page>

<title>XML vs JSON</title>

<body>XML is not a data structure serialization format!</body>

<relatedDocuments>http://www.json.org</relatedDocuments>

</page>

However, if you read this XML data, there is no (precise) way to know that relatedDocuments is actually an array, other than your knowledge of English that suggests that it is plural so it might be an array. You could have written relatedDocuments like this:

<relatedDocuments>

<relatedDocument>http://www.json.org</relatedDocument>

</relatedDocuments>

But we are basically resorting to inventing a convention (one that is not defined in XML) to define our arrays. Our basic problem is that XML has a conceptually different purpose than defining data structures.
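To make the ambiguity concrete, here is a minimal sketch of the kind of naive conversion an XML-to-object mapper might perform. The helper name `childrenToObject`, the pre-parsed `[tagName, text]` pairs (standing in for the output of a real XML parser), and the second URL are all assumptions made up for illustration.

```javascript
// Hypothetical helper: converts already-parsed child elements, given as
// [tagName, text] pairs, into a plain object. No real XML parser is involved.
function childrenToObject(children) {
  var result = {};
  children.forEach(function (child) {
    var name = child[0];
    var text = child[1];
    if (!Object.prototype.hasOwnProperty.call(result, name)) {
      // First occurrence: nothing says whether this is a scalar or a list,
      // so the mapper stores a plain string.
      result[name] = text;
    } else if (Array.isArray(result[name])) {
      result[name].push(text);
    } else {
      // Second occurrence: only now can the mapper guess "array".
      result[name] = [result[name], text];
    }
  });
  return result;
}

var oneLink = childrenToObject([
  ["relatedDocuments", "http://www.json.org"]
]);
var twoLinks = childrenToObject([
  ["relatedDocuments", "http://www.json.org"],
  ["relatedDocuments", "http://example.org/second"] // made-up second link
]);

console.log(Array.isArray(oneLink.relatedDocuments));  // false
console.log(Array.isArray(twoLinks.relatedDocuments)); // true
```

The same field comes back as a string in one case and an array in the other, so the consumer has to handle both shapes unless a schema or an out-of-band convention settles the question.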

JSON, on the other hand, is very adept at defining data structures. The previous object can be easily serialized with no ambiguities, in a very readable and more compact fashion:

{"title":"XML vs JSON",

 "body":"XML is not a data structure serialization format!",

 "relatedDocuments":["http://www.json.org"]}

This clearly shows that relatedDocuments is an array, even though it holds only a single value.
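One way to check this claim is a round trip through the standard `JSON.stringify` and `JSON.parse` functions. This is only a sketch, assuming a JavaScript engine that provides the built-in `JSON` object; the variable names `text` and `restored` are introduced here for illustration.

```javascript
// The page object from the article, written as a literal for brevity.
var page = {
  title: "XML vs JSON",
  body: "XML is not a data structure serialization format!",
  relatedDocuments: ["http://www.json.org"]
};

// Serialize to JSON text, then parse the text back into an object.
var text = JSON.stringify(page);
var restored = JSON.parse(text);

// The single-element array survives the round trip as an array.
console.log(Array.isArray(restored.relatedDocuments)); // true
console.log(restored.relatedDocuments.length);         // 1
```

No schema or naming convention is needed: the square brackets in the serialized text carry the array information by themselves.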

I would say that by far the most significant, useful, and important form of XML is (X)HTML. The allure of XML is data interchange. Unfortunately, while most XML may allow the data provider and consumer to agree on the validity of the data, actually understanding the XML is an entirely different matter. Until you understand the meaning of the tags used by the provider, XML is fairly useless. HTML, on the other hand, provides an incredibly broadly agreed-upon set of semantics. Almost anyone who has dealt with HTML understands that <h1> through <h6> are headings of diminishing significance, <p> is a paragraph, and <em> indicates emphasis. An enormous amount of code also understands the semantics of HTML. HTML is a brilliant example of XML at its very best, fulfilling its promise of well-understood exchange of documents with agreed-upon semantics; it is one of the only forms of XML that really delivers. Too bad most of the HTML in the world really isn't valid XML, but syntactical strictness really isn't the primary benefit of XML.
