Choosing the Right Data Format
As Dale pointed out, with every new version of FME, we add support for new formats. Take some time to read Dale’s post; it’s a very good read. As Dale concludes, like it or not, the shapefile is here to stay, and new formats will inevitably continue to be invented.
So with all these formats to choose from, how do you find the right one? Shapefiles have become ubiquitous in the GIS industry, but does that mean a shapefile is always the best way to distribute our data? I know what to do with a shapefile, and so do GIS professionals. But imagine knowing nothing about GIS, making a request for data, and being directed to an FTP site with a zip file containing .shp, .prj, .dbf, and other files. If you are not a GIS user, you may have no idea what to make of this collection of files. Chances are that non-GIS people do not have a shapefile reader on their computer. These users are effectively stuck behind a wall between them and the data they need. In this case, getting over the wall is not too hard, but it is still a wall that does not need to be there.
This morning I watched an excellent video from the TED conference by Sir Tim Berners-Lee. Tim is best known for inventing the web 20 years ago while an engineer at CERN. (TED is a conference aimed at bringing some of the best minds in Technology, Entertainment, and Design together to give 15-minute talks to each other.) His talk focuses on where he sees the web going in the very near future. He envisions a world where the web is not about documents, but about information that can be queried and mashed up into whatever you like. He shows an example of a very difficult query he entered into Google, which returned thousands of documents that he would then have to pore over to get the information he needed, and a query against “linked data” that returns the exact information he is looking for. And yes, for all the geo-nerds, he does mention geo data as well, naming OpenStreetMap as a great example of a source of data.
What I found particularly interesting about his talk were his reasons for inventing the web:
“Why did I do it? Well, it was basically frustration… I was working as a software engineer in this huge very exciting lab, with lots of people coming from all over the world, they brought all sorts of different computers with them, they had all sorts of different data formats, all sorts of kinds of documentation systems. So in that diversity, if I wanted to figure out how to build something out of a bit of this and a bit of this, everything I looked into, I had to connect to some new machine, I had to learn to run some new program, I would find there may be some information I wanted in some new data format and these were all incompatible.”
Where do I know that story from?