Is XML the Silver Bullet for Data Exchange?
I have been in technology long enough (since the early 1980s) to see a number of “Silver Bullets” that were going to change the world. The question I’d like to examine a bit more closely here is whether or not XML is the silver bullet for data exchange.
Early “Silver Bullets”
My first experience with silver bullets was the object oriented database! These databases were the new wave that was going to leave relational databases in the dust. Does anyone remember or use any of them? For me the best thing that came out of object oriented databases was that Dale was on the same project, which is where we met before going on to found Safe Software! Thank you OODB!
I also remember very well when Java was introduced. C++ was said to be on its way to obsolescence, so everyone was told to start learning Java or they would find themselves relegated to legacy software maintenance tasks. It goes without saying that Java was and is very successful; however, it didn’t by any means kill C++. C++ is still widely used and isn’t going away any time soon. Here at Safe we use C++ for the FME Engine, Workbench, and other desktop technology. Meanwhile, we use Java for many of the FME Server components. We would not be able to get the same high performance out of our FME Engine, which powers both FME Desktop and FME Server, if it were written in Java.
In short, neither the object oriented database nor Java was a silver bullet. One remains a specialty product and the other, while widely successful, was not suitable for all programming tasks. Indeed the number of languages continues to grow, with Python being just one example of another popular and growing language.
What about XML? Is it the Silver Bullet for Data Exchange?
With my love of XML, the question I’ve been pondering is whether or not XML is the silver bullet for data exchange. XML has a lot going for it, and many industries are using it for their data or are moving towards using it.
There are also lots of tools for working with XML. For exchanging data between disparate systems it seems like it is the perfect solution. So, what are the weaknesses of XML?
XML and Performance
At Safe when it comes to building product we are obsessed with three things: quality, usability and performance. If we can improve these with each release then we are doing a lot of things right. I have never met a user who said that they do not care about any of the above.
Aside from XML, I also spend a great deal of my time working alongside the FME Server team. As mentioned above, FME Server consists of a number of Java components, which exchange data with each other using XML messages. With FME Server one of the performance measures we watch is the number of jobs per second the core passes off to FME Engines. We want this number to be as high as possible and to increase with each release of FME Server.
It turns out that while beautiful (which XML always is!), using XML for this data exchange is not cheap. During FME Server stress-testing we found that much of the time was being spent on creating and parsing these XML messages! Doing more research, we discovered that if we replaced the XML messages with Google Protocol Buffer technology, we could process these messages 20 to 100 times faster.
Using Google Protocol Buffer technology also meant that the messages themselves are 3 to 10 times smaller (higher information density) meaning we could exchange more data in the same time. In comparison, XML has a very low information density. To get a feeling for this simply grab an XML/GML file and compress it.
Switching to Protocol Buffers makes a huge difference in the number of jobs/second that our FME Server core can handle, increasing the maximum number of jobs that we can process from hundreds per second to thousands per second.
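To make the cost difference a little more concrete, here is a minimal sketch in Python. It does not use Protocol Buffers or anything from FME Server: the job fields, the function names and the fixed binary layout built with the standard `struct` module are all assumptions made up for illustration, and the binary layout is only a stand-in for a real binary serialization format. It encodes the same small message both ways, then compares size and parse time.

```python
import struct
import timeit
import xml.etree.ElementTree as ET

# Hypothetical job message; the field names are illustrative only.
JOB = {"job_id": 12345, "priority": 3, "workspace": "roads_to_gml.fmw"}


def encode_xml(job):
    """Serialize the job as a small XML document (bytes)."""
    root = ET.Element("job")
    for key, value in job.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root)


def decode_xml(data):
    """Parse the XML document back into a dict with typed values."""
    root = ET.fromstring(data)
    return {
        "job_id": int(root.findtext("job_id")),
        "priority": int(root.findtext("priority")),
        "workspace": root.findtext("workspace"),
    }


# Assumed binary layout (NOT the Protocol Buffers wire format):
# 32-bit unsigned id, 8-bit priority, 16-bit name length, then the UTF-8 name.
HEADER = struct.Struct("<IBH")


def encode_binary(job):
    name = job["workspace"].encode("utf-8")
    return HEADER.pack(job["job_id"], job["priority"], len(name)) + name


def decode_binary(data):
    job_id, priority, length = HEADER.unpack_from(data)
    name = data[HEADER.size:HEADER.size + length].decode("utf-8")
    return {"job_id": job_id, "priority": priority, "workspace": name}


xml_bytes = encode_xml(JOB)
bin_bytes = encode_binary(JOB)

if __name__ == "__main__":
    print("XML size:   ", len(xml_bytes), "bytes")
    print("binary size:", len(bin_bytes), "bytes")
    runs = 2000
    xml_time = timeit.timeit(lambda: decode_xml(xml_bytes), number=runs)
    bin_time = timeit.timeit(lambda: decode_binary(bin_bytes), number=runs)
    print(f"parse time for {runs} messages: XML {xml_time:.4f}s, binary {bin_time:.4f}s")
```

The exact numbers depend on the machine and the message, so treat the output as a rough feel for the gap rather than a benchmark of any real system.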
Anyone who has worked with XML knows that it is many things, but it wasn’t designed to be compact or quick to parse.
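If you would like to try the compression experiment without hunting for a file, here is a minimal Python sketch. The sample document is synthetic and the element names are invented for illustration; it is not real GML, and the ratio you see on your own data will differ.

```python
import gzip


def make_sample_xml(n_features=1000):
    """Build a synthetic, repetitive XML document (made-up element names)."""
    rows = [
        f'<feature id="{i}"><name>Road {i}</name>'
        f"<x>{i * 0.5:.1f}</x><y>{i * 0.25:.2f}</y></feature>"
        for i in range(n_features)
    ]
    return ("<features>" + "".join(rows) + "</features>").encode("utf-8")


def compression_ratio(data):
    """Return the original size divided by the gzip-compressed size."""
    return len(data) / len(gzip.compress(data))


sample = make_sample_xml()
ratio = compression_ratio(sample)
print(f"{len(sample)} bytes of XML shrink by a factor of {ratio:.1f} when gzipped")
```

To run it against a real XML or GML file, read the file's bytes with `open(path, "rb").read()` and pass them to `compression_ratio` instead of the synthetic sample.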
What Does this Mean?
Like anything, it means that you should really evaluate what you are trying to accomplish. If your primary concern is to share data in an easy to understand structure with many other loosely coupled systems, then XML is a great choice.
However, it should be recognized that nothing is free and that you are giving up performance, both in terms of cost to process XML data messages and the size of the XML data that is sent.
If your goal is to build distributed systems in which moderate to large amounts of data are to be passed around with tight time constraints, then the choice of XML really needs to be questioned. Personally, I get really nervous when folks talk about moving huge amounts of XML with very tight time constraints in interactive systems.
Even though XML continues to grow, and its future looks bright, you should always first define the goals of your system and only then select the best technology to support those goals. Selecting technology first and then defining the goals of the system always has been and always will be a very dangerous approach to building successful solutions.
As someone who loves XML, my feeling is that when it comes to the silver bullet question, XML is more of a Java than an object oriented database. Just as with Java, there are places where XML is not the right choice.
So while XML is not the silver bullet for all data exchange, that hasn’t dampened my enthusiasm for it. If you have any XML that you’re having problems working with, please do send it to me at xml@safe.com. I’d also be curious to hear about your experiences with any other silver bullets.