Working with XML … and Loving It!
We have spent a great deal of effort working with XML. Historically, our feelings towards XML were best described as love-hate.
We loved it because it was good for business and gave us lots of technical challenges. We hated it because it was hard: while we made progress on the reading side, the writing side remained elusive.
The power of XML is, of course, its ability to encode and model arbitrarily complex entities and the relationships between them. Its domain neutrality means that XML is used to encode information across virtually all industries. The difficulty in dealing with XML documents is that, in many cases, the XML data models are designed with no regard for the software engineer who may have to work with or use the data. Note that I am saying “work with” or “use” the data. Reading XML data is easy; using it can be difficult.
The thing that has made XML hard for us is that we tried to treat XML like relational data, and it is of course not relational. Once we approach XML in a much more free-flowing way, it becomes way less hard.
Take metadata, for example. James Fee recently wrote about metadata and the challenge of making it accessible. When asked about metadata in the past, we often joked that “We have never met a data we haven’t liked!” while at the same time admitting that “We have never met a metadata that we knew what to do with!”. The problem was that we thought of metadata as just another type of data. So far so good; it is data about data. But we then tried to make it fit into a relational workflow, which does not work. In fact, metadata varies significantly from site to site even when it follows the same standard, whether FGDC or ISO. The conclusion was the same: because XML (in this case containing metadata) doesn’t fit the relational paradigm at all, we shouldn’t be attacking it in that manner. Yet this is precisely what we had been doing!
Things we have learned:
- XML is not relational, and trying to approach it in a relational manner is doomed to failure.
- The amount of effort required to get data into an XML schema is directly related to the complexity of the schema. The more complex the schema, the more effort is required to structure the data to fit it.
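To make the first point a little more concrete, here is a minimal Python sketch. It is only an illustration, not how FME does it. The two records and their element names are invented for this example, and `walk` and `flatten` are hypothetical helpers. Instead of forcing each record into a fixed set of columns, it walks the tree and takes whatever it finds, at whatever depth it finds it.

```python
import xml.etree.ElementTree as ET

# Two made-up metadata records that nominally follow the same layout
# but do not carry the same elements (names invented for illustration).
SITE_A = """<metadata>
  <idinfo>
    <citation><title>Roads</title><origin>City A</origin></citation>
  </idinfo>
</metadata>"""

SITE_B = """<metadata>
  <idinfo>
    <citation><title>Parcels</title></citation>
    <keywords><theme>cadastre</theme><theme>land</theme></keywords>
  </idinfo>
</metadata>"""


def walk(element, path=""):
    """Yield (path, text) for every element that carries text, at any depth."""
    here = f"{path}/{element.tag}"
    text = (element.text or "").strip()
    if text:
        yield here, text
    for child in element:
        yield from walk(child, here)


def flatten(xml_text):
    """Parse one document and return its (path, text) pairs in document order."""
    return list(walk(ET.fromstring(xml_text)))


if __name__ == "__main__":
    for label, doc in (("A", SITE_A), ("B", SITE_B)):
        for path, text in flatten(doc):
            print(label, path, text)
```

Site A has an `origin` that Site B lacks, and Site B has a repeating `theme` that Site A lacks. A fixed one-row-per-record table has to either drop those or grow a column for every variation, while the tree walk just reports what is there.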
What is really needed to process XML is an environment that makes it really easy for people to build up complex objects in a gradual and understandable fashion. As luck would have it, Workbench fits this paradigm perfectly and is a great environment for this, but in the past it lacked the necessary supporting transformers for building XML documents. Enter FME 2010 (the FME 2011 beta is even better) and the XML Templater, and now you can Get Smart with XML.
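For readers who don't have Workbench in front of them, the "build it up gradually" idea can be sketched in plain Python. To be clear, this is not the XML Templater or any FME API. It is a generic illustration using the standard library, and the element names and the `build_*` helpers are made up. Each small function builds one piece, and bigger pieces are assembled from smaller ones.

```python
import xml.etree.ElementTree as ET


def build_contact(name, email):
    """Smallest piece: one <contact> element (hypothetical structure)."""
    contact = ET.Element("contact")
    ET.SubElement(contact, "name").text = name
    ET.SubElement(contact, "email").text = email
    return contact


def build_dataset(title, contacts):
    """Middle piece: a <dataset> that embeds any number of contacts."""
    dataset = ET.Element("dataset")
    ET.SubElement(dataset, "title").text = title
    for c in contacts:
        dataset.append(build_contact(c["name"], c["email"]))
    return dataset


def build_catalog(datasets):
    """Top piece: a <catalog> assembled from already-built datasets."""
    catalog = ET.Element("catalog")
    for d in datasets:
        catalog.append(build_dataset(d["title"], d["contacts"]))
    return catalog


doc = build_catalog([
    {"title": "Roads",
     "contacts": [{"name": "Ann", "email": "ann@example.com"}]},
])
xml_text = ET.tostring(doc, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

No single step has to understand the whole schema. Each piece only needs to know its own shape and which smaller pieces it contains.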
So how do YOU feel about working with XML now? Me? I’m using XML… (take me out Don Adams)