Round Earth Data in Oracle, PostGIS, and SQL Server
Today, most spatial databases know how to deal with a round earth. For example, they can tell you the distance between New York and London, or find the area within 1000 km of a point — tasks not normally possible with a flat map. So, I thought it would be interesting to explore a few spatial operations with three of the major databases: SQL Server 2012, Oracle 11g R2, and PostGIS 2.0. (DB2 and Informix also have round-earth extensions, but I won’t cover them here.)
I was focused on two questions:
- Do they all produce essentially the same output? (This gives me confidence about their quality without having to independently determine the “right” answers.)
- As noted previously, only SQL Server models round-earth curves. Is that a benefit?
Experiment 1:
How far is it from Vancouver, Canada to Tokyo, Japan?
I loaded a two-point round-earth line into each system and asked for its length. Here are the (uncontroversial) results:
Oracle | 7572654.086m | |
PostGIS | 7572654.091m | |
SQL Server | 7572658.675m |
Experiment 2:
What is the shortest path from Vancouver to Tokyo?
Here, I’m interested in getting a densified line which allows for (a) comparison and (b) visualization in round-earth-unaware applications.
This is trickier; in each system the round-earth spatial functions have evolved from flat-map ones, and it doesn’t make sense to densify a line on a flat map, so there isn’t a call to do it. My first attempt was to intersect the Vancouver-Tokyo line with a large number of north/south lines and plot the intersection points. This worked well for Oracle and SQL Server, but less so for PostGIS:
Figure 1: Densifying the Vancouver -> Tokyo line via intersections.The issue? For many functions, including intersections and buffers, PostGIS uses a short-cut for round-earth calculations: It looks for a good flat-map coordinate system that covers the area of interest and uses that. There isn’t one big enough for this case, and so it falls back on World Mercator, with the results above.
Motivated to find an answer, I tried an approximation: Cover the earth with a grid at 1×1 degree intervals, find the distance between each point and the line, keep the ones within 200 km, and buffer these together. This produced a consistent result:
Figure 2: 1×1 degree cells within 200km of the Vancouver -> Tokyo line (PostGIS, blue) vs. Oracle / SQL Server result (red)
Experiment 3:
What area is within 1 km of the shortest path from Vancouver to Tokyo?
I asked each database to buffer the line (from Experiments 1 and 2) by 1 km. Only SQL Server produced the correct result; both Oracle and PostGIS followed the path PostGIS took in Figure 1.
Further, if I used SQL Server’s BufferWithCurves() method, it was able to produce a compact result consisting of five arcs.
Experiment 4:
What area is within 100 km of Vancouver’s airport?
I asked each database to buffer a point representing the airport by 10, 100, and 1000 km. The answers in each case were consistent and reasonable.
The majority of the differences between the three database’s answers came down to the density of the output (i.e., how many vertices made up the boundary). Oracle returned more vertices when the tolerance value was decreased (as expected), SQL Server appeared to ignore the tolerance value (unless I buffered as arcs, and then used CurveToLineWithTolerance), and PostGIS didn’t provide an option.
Figure 3: 100 km buffer from YVR; Oracle, SQL Server, and PostGIS produce similar results.
Figure 4: Zoomed in. PostGIS stroking not configurable (green, right). Here, I’ve set the SQL Server and Oracle arc stroking density very high (red/blue, left).With the 100 km buffer, SQL Server’s result was about 4 m inside Oracle’s. With the 1000 km buffer, SQL Server’s result was consistently 4 km inside Oracle’s. Due to the low arc density (and presumably the buffering approximation discussed above), PostGIS’ results varied a bit more.
Experiment 5:
Are there any surprises when using round-earth curves for small objects?
The previous examples demonstrate SQL Server’s ability to use round-earth arcs to represent large areas on the earth (e.g., 100 km around an airport, 1 km around a flight path). Given that SQL Server’s round-earth arc representation is new, it makes sense to ask if it behaves intuitively for small objects.
I digitized a cul-de-sac near the Safe Software office using three-point arcs. Then I wrote the resulting (curve) polygon to SQL Server. I also wrote a stroked version of the polygon (replacing the curved segments with shorter straight lines) to Oracle, PostGIS, and (again) SQL Server. To test that the curved polygon behaved similarly to its stroked equivalent, I asked each database for the polygon’s area:
Oracle (stroked polygon) | 1615.100 m2 | |
PostGIS (stroked polygon) | 1615.100 m2 | |
SQL Server (curved polygon) | 1614.938 m2 | |
SQL Server (stroked polygon) | 1615.100 m2 |
This suggests curved data may be loaded into round-earth columns and processed normally.
Conclusion
Returning to my two opening questions:
> Do they all produce essentially the same output?
Yes. These databases generally return consistent results for round-earth queries. There are minor variations, and not everything works in all cases, so care is required.
> Only SQL Server models round-earth curves. Is that a benefit?
Yes. SQL Server’s round-earth curves allow geometry to be stored more compactly without loss of detail. Perhaps more importantly, curved data can be loaded into round-earth columns without degrading the data (i.e., by converting the curves into lines).
During this exercise, I noticed an error in my previous post. I had thought SQL Server had introduced the buffering-with-curves idea. Not so: For example, Oracle has had this ability for flat-map buffering for a very long time.
Spatial databases can efficiently and effectively store, query, and transform data, all while honoring the curvature of our round world. How are you taking advantage of these capabilities?