One of the benefits of storing spatial data in database systems is the ability to efficiently query data by location. For example, “where are all the fire hydrants within a certain distance of a given point?” Typically, a variety of spatial interactions are supported, meaning that you can find all the geometries that interact with a query geometry, or only the ones that are within, or touch, or are equal to it. A great deal of research has gone into defining different spatial interactions, resulting in something called the DE-9IM model, and eventually a set of standard predicates endorsed by the OGC®.
There are a few ways one might want to leverage this spatial querying ability. One common use is to select an area of interest when extracting data from a large spatial table. Another possibility is to do a spatial join, e.g. “find all of the parks that are within a certain distance of a school”. Some spatial ETL (extract, transform and load) tools, like FME® can already accomplish this.
We are also interested in a third case, spatial querying during an ETL process. What spatial ETL does is read spatial data from somewhere, transform it into the required format and data model, and then write it somewhere else; in this context, querying can help with the transformation step. While we’ve done this already with Oracle® and ArcSDETM, we’re excited about providing this capability for the other spatial databases we support.
I’ll motivate this with an example. Imagine that a user wants to load a small number of new features from a MapInfo® file into an Oracle database, but wants to ensure that the new features are not already there. They could design an ETL process that reads features from the input file, and for each feature, queries the database to determine if there are any nearby features with similar attributes. If features were found, the potential duplicate could be flagged for further investigation.