Knowing the terminology for location data helps you understand what goes on behind a simple online search for your friend’s new address or, more importantly, when you’re working with data such as a CAD drawing that needs to be tied to a location for further analysis. What is geocoding? What is georeferencing? This blog post explains what these terms mean and highlights the difference between them. I will also explain how they are used and how it all fits together in the context of FME.
Geocoding: Converting Places to Coordinates
Geocoding is a process that converts an address or place name into an X,Y point coordinate location on the Earth.
A Google Search for ‘Safe Software Inc’ and the resulting pinpoint on the map on the bottom.
A screenshot of FME Workbench’s Visual Previewer outlining the inputs and outputs of the Geocoder transformer.
It’s a translation from something a human would recognize, like a street address or a well-known building name, into coordinates that can be referenced in a coordinate system and plotted as a point on the map. That means that your last Google, Bing, or other map search for a location found its result by using a geocoding service.
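To make the idea concrete, here is a minimal Python sketch of geocoding as a lookup. It is only an illustration: the one-entry `GAZETTEER`, the `normalize` helper, and the `geocode` function are hypothetical names, and the single address and coordinate pair is borrowed from the reverse geocoding example in this post. Real geocoding services do far more, such as parsing, fuzzy matching, and interpolating along streets.

```python
from typing import Optional, Tuple

# Hypothetical one-entry gazetteer (assumption for illustration only).
GAZETTEER = {
    "9639 137a street, surrey bc": (49.17800358728969, -122.8422786350665),
}

def normalize(address: str) -> str:
    """Lower-case and collapse whitespace so small typing differences still match."""
    return " ".join(address.lower().split())

def geocode(address: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a known address, or None if unmatched."""
    return GAZETTEER.get(normalize(address))

print(geocode("9639  137A Street, Surrey BC"))  # matched despite extra space and case
print(geocode("1 Nowhere Road"))                # unmatched addresses return None
```

An address that isn’t in the table simply comes back unmatched, which is why the quality of the input data matters so much.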
Who uses geocoding?
When someone talks about geocoding data, they are likely in or related to the field of Geographic Information Systems or Sciences (GIS). A GIS person might work with tabular data from an Excel spreadsheet, CSV, or JSON format where they may have many addresses associated with their data and need to bring this into a GIS to be analyzed and visualized on a map.
While geocoding is related to the field of GIS, anyone can geocode addresses using services such as Google, Bing, ArcGIS, or HERE among many other applications and programming libraries. These services often provide an API (Application Programming Interface), or a method of connecting to these services, so that you can access their functionality from outside of their application platform. You can make use of these APIs either through programming languages or programs like FME that allow you to connect without coding.
What are some obstacles to geocoding?
Common obstacles to geocoding usually come down to data quality. Incorrect or inconsistent data entry, duplicate records, and incomplete addresses can all leave addresses unmatched, with no point location returned. Because of this, it’s common practice to clean up and validate data before geocoding for best results. We’ll cover how to do this below.
Reverse Geocoding: Converting Coordinates to Addresses
Reverse geocoding takes an X,Y coordinate in the world and returns the address information for that location. Simply put, it is geocoding in the reverse direction. For example, we might input the latitude and longitude 49.17800358728969, -122.8422786350665 into the Geocoder transformer in FME and the closest result might be 9639 137A Street, Surrey BC.
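The “closest result” idea can be sketched in Python as a nearest-neighbour search. This is a simplified illustration, not how the Geocoder transformer or any particular service works: `KNOWN_ADDRESSES` is a hypothetical two-entry table (the second entry is an approximate, made-up point added for contrast), and distance is measured with the haversine formula on a spherical Earth.

```python
import math

# Hypothetical reference table (assumption): label, latitude, longitude.
KNOWN_ADDRESSES = [
    ("9639 137A Street, Surrey BC", 49.1780, -122.8423),
    ("Downtown Vancouver (approximate)", 49.2827, -123.1207),
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def reverse_geocode(lat, lon):
    """Return the label of the closest known address."""
    nearest = min(KNOWN_ADDRESSES, key=lambda rec: haversine_km(lat, lon, rec[1], rec[2]))
    return nearest[0]

print(reverse_geocode(49.17800358728969, -122.8422786350665))
```

With a real service the reference table is a full address database, but the principle of returning the nearest match is the same.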
A screenshot of FME Workbench’s Visual Previewer outlining the inputs and outputs of the Geocoder transformer when set to Reverse Geocoding.
Georeferencing: Overlaying Data on a Map
Georeferencing is the process of taking raster imagery such as satellite images, or vector data such as CAD drawings, and placing them in the correct place on the Earth using a geographic coordinate system, with the correct scale and orientation to north applied.
A georeferenced raster orthoimage of the False Creek area in Vancouver against a background map.
What kind of data can be georeferenced?
Both raster data (like pictures that you can zoom in on until you see pixels), and vector data (like points, lines, and polygon geometry) can be data that needs to be georeferenced. Bringing spatial data into the correct location on a map can add value to data by providing real-world context. Try imagining how useful a dataset of bus stops would be without any roads!
Georeferenced CAD parcels against a background map to show that it’s in the correct location.
How can data be georeferenced?
How to georeference a file depends on what kind of data you are working with, and what kind of coordinate system information you might be starting with. It might be that the file already has a location but needs some adjustments for accuracy, or perhaps the file has no geographic spatial awareness at all, such as a floor plan, and you need to first pull it to the correct location.
You can think of georeferencing like sewing a drawing onto the correct place on the map. Let’s say this drawing is a floor plan of a building. If we know a control point, say the bottom left corner of the building, both in the drawing and as coordinates in a geographic coordinate system, we can essentially push a needle through the drawing and the map at that point and pull the drawing into a known location. What’s left is to check that the rest of the drawing is aligned: that the building is rotated to the correct orientation relative to north and that the scale makes sense. This is just one possible way to georeference a dataset.
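One way to picture the pinning, rotating, and scaling is with a small Python sketch. It is a variation on the single-needle idea: it assumes two control points are known, which is enough to work out a shift, a rotation, and a uniform scale in one go. The function names and all coordinates are hypothetical, and points are treated as complex numbers purely to keep the arithmetic short.

```python
import cmath
import math

def fit_similarity(src_a, src_b, dst_a, dst_b):
    """Fit z' = m*z + t from two control points (uniform scale, no shear)."""
    za, zb = complex(*src_a), complex(*src_b)
    wa, wb = complex(*dst_a), complex(*dst_b)
    m = (wb - wa) / (zb - za)  # encodes scale (magnitude) and rotation (angle)
    t = wa - m * za            # shift that pins the first control point in place
    return m, t

def apply_similarity(m, t, point):
    """Move a drawing point (x, y) into map coordinates."""
    w = m * complex(*point) + t
    return (w.real, w.imag)

# Hypothetical floor plan in drawing units, pinned to made-up map coordinates in metres.
m, t = fit_similarity((0, 0), (10, 0), (500000, 5450000), (500000, 5450020))
print("scale:", abs(m))
print("rotation (degrees, counter-clockwise):", math.degrees(cmath.phase(m)))
print("moved point:", apply_similarity(m, t, (10, 5)))
```

Once the transform is fitted, every other point in the drawing can be moved with `apply_similarity`, and checking a third known point is a quick way to confirm the alignment.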
GIS applications like ArcGIS or QGIS, and complete spatial data integration solutions like FME, offer lots of tools for adjusting data to the correct location and orientation in a geographic coordinate system. Other methods could include working with existing surveyed points (control points where the location in the drawing matches a corresponding location in a geographic coordinate system), or warping data based on control points through a process called rubbersheeting. Find out more about georeferencing data in our blog post: How to Defeat Common Barriers to Georeferencing.
How do I know if a file is georeferenced?
Depending on the file format and the method, coordinate system information can be saved directly in the file or in a companion file such as a projection file (.prj) or a world file (.wld). These are also sometimes referred to as sidecar files. If there aren’t any sidecar files present, another way is to open the file in its native program, provided that it’s spatially enabled, or to inspect the file in FME’s Data Inspector or Visual Preview and see whether a coordinate system is listed in the Feature Information Window.
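To give a feel for what a sidecar file can hold, here is a short Python sketch that reads a raster world file. It assumes the common six-line raster layout (the kind used by .tfw or .jgw files), where the lines hold the coefficients A, D, B, E, C, F; a CAD .wld file may be laid out differently. The sample values and function names are hypothetical.

```python
def parse_world_file(text):
    """Read the six coefficients A, D, B, E, C, F, one per line."""
    a, d, b, e, c, f = (float(value) for value in text.split())
    return a, d, b, e, c, f

def pixel_to_map(coeffs, col, row):
    """Convert a pixel's (column, row) to the map coordinates of its centre."""
    a, d, b, e, c, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)

# Hypothetical world file: 0.5-unit pixels, no rotation, north-up image.
sample = """0.5
0.0
0.0
-0.5
491000.25
5458999.75
"""

coeffs = parse_world_file(sample)
print(pixel_to_map(coeffs, 0, 0))      # centre of the upper-left pixel
print(pixel_to_map(coeffs, 100, 200))  # 100 columns right, 200 rows down
```

Note that a world file like this only stores position, pixel size, and rotation. It does not say which coordinate system the numbers belong to, which is where a projection file comes in.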
How to check whether your data has a coordinate system in the Feature Information Window of FME’s Visual Preview. Step 1 is to select the Feature Information icon in the Visual Preview, which opens a dialog that shows the Coordinate System.
What’s the difference between Geocoding and Georeferencing?
- The type of data used:
Geocoding typically creates point data from names or addresses, while georeferencing works with data that is spatial to begin with (satellite images, CAD drawings, etc.) and needs to be oriented on a map.
- The level of complexity:
The average Google Maps user is likely to run into geocoding without knowing it, and the process of geocoding is much like using a lookup table to match place names to geographic coordinates.
Georeferencing can be more complicated. The average Google Maps user may only use georeferenced data as opposed to doing the georeferencing themselves. Unlike geocoding, georeferencing can be done using a few different methods, and the method depends on the data being georeferenced and what information is available.
Term | Definition | Example Scenario |
Geocoding | The process of converting addresses and location names to X, Y point locations on the Earth | Translating a list of addresses from an Excel spreadsheet and getting back a point dataset of all the locations. |
Reverse Geocoding | The process of converting X,Y points in the real world into addresses | Taking a point dataset and converting the coordinates into their nearest corresponding address. |
Georeferencing | The process of taking raster imagery such as satellite images, or vector data such as CAD drawings, and placing them in the correct place on the Earth using a geographic coordinate system, with the correct scale and orientation to north applied. | Taking a satellite image or vector dataset and putting it in the correct location based on known control points between the dataset and a geographic coordinate system. The result is data correctly located on a map. |
Ground Control Points | Known points in the source dataset that match corresponding points in a geographic coordinate system | A surveyor might take the GPS location of a point of interest and add the coordinates into a drawing that will later be drafted into a CAD file. This point in the drawing then corresponds to one in the real world. This is a ground control point. |
How can FME solve my Geocoding or Georeferencing problems?
To geocode data, FME has a transformer called the Geocoder that allows you to leverage different services to convert your addresses into latitude and longitude coordinates. Check out this step-by-step tutorial on Geocoding Addresses to see this transformer in action.
Geocoder transformer in FME Workbench.
Some Geocoding Service options in the Geocoder Parameters.
FME can also play a huge role in cleaning up and validating the data before geocoding. These processes might look like filtering out missing entries or duplicates, checking for consistency in naming, or checking that each address is complete with a street number and street name, and then correcting any problems before geocoding for a better result. Learn more about how you can clean up your attribute data in the tutorial: Validate your Data’s Attributes with the AttributeValidator Transformer.
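If you’re curious what those checks amount to, here is a rough Python sketch of the same ideas. It is not how the AttributeValidator works; the `clean_addresses` function, the rejection labels, and the simple “starts with a number, then a name” rule are all assumptions chosen to keep the example small.

```python
import re

def clean_addresses(rows):
    """Split raw address strings into (valid, rejected) lists."""
    seen, valid, rejected = set(), [], []
    for raw in rows:
        text = " ".join((raw or "").split())         # trim and collapse whitespace
        key = text.lower()                           # compare without regard to case
        if not text:
            rejected.append((raw, "missing"))
        elif key in seen:
            rejected.append((raw, "duplicate"))
        elif not re.match(r"^\d+\w*\s+\S+", text):   # needs a street number, then a name
            rejected.append((raw, "incomplete"))
        else:
            seen.add(key)
            valid.append(text)
    return valid, rejected

rows = [
    "9639 137A Street, Surrey BC",
    " 9639  137a street, surrey bc ",  # same address, typed inconsistently
    "",
    None,
    "Main Street",                      # no street number
    "100 Example Avenue",               # made-up address
]
valid, rejected = clean_addresses(rows)
print(valid)
print(rejected)
```

Keeping the rejected rows together with a reason makes it easy to send them back for correction instead of silently dropping them.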
To georeference data, FME has a few different transformers for different scenarios. Below are a few examples of transformers that you can use in FME when you have:
- Raster data: RasterGeoreferencer
- Source data with no spatial information: LocalCoordinateSystemSetter
- Source data that is approximately in the right area but needs adjustment: Affiner, Offsetter, Scaler, Rotator
- Source data with a few known ground control points: Affiner, Rubbersheeter
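For the last scenario, the maths behind an affine adjustment can be sketched in a few lines of Python. This is a generic illustration rather than a description of how the Affiner is implemented: it assumes exactly three control points that are not on one line, fits x' = a*x + b*y + c and y' = d*x + e*y + f, and uses made-up coordinates and hypothetical function names.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[i]] + row[col + 1:] for i, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(src, dst):
    """Return [a, b, c, d, e, f] fitted from three (source, destination) point pairs."""
    matrix = [[x, y, 1.0] for x, y in src]
    x_coeffs = solve3(matrix, [p[0] for p in dst])
    y_coeffs = solve3(matrix, [p[1] for p in dst])
    return x_coeffs + y_coeffs

def apply_affine(coeffs, point):
    """Move a source point (x, y) to its adjusted location."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical control points: drawing coordinates and their matching map coordinates.
src = [(0, 0), (10, 0), (0, 10)]
dst = [(100, 200), (120, 200), (100, 230)]
coeffs = fit_affine(src, dst)
print(coeffs)
print(apply_affine(coeffs, (5, 5)))
```

With more than three control points the fit becomes a best-fit problem, and when the distortion varies across the data, rubbersheeting is the better match.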
I hope this clarifies the difference between geocoding and georeferencing and helps you get started with some useful FME transformers. To try these out, get started today!