Active Metadata Platforms for Spatial Data and XML
Metadata for spatial data and XML-based formats includes useful information like what kind of structures, geometry, and attributes are being stored, as well as the coordinate system, extents, modification date, quality, ownership, and more. This can help at both an admin level and an end-user level, and can answer questions like:
- What does this data represent?
- Who created this data and why?
- What is the quality and accuracy of this dataset?
- Is this data out of date?
- When is the best time to run a data integration workflow?
- What applications is this data useful for?
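To make these questions concrete, here is a minimal Python sketch that reads a small, made-up XML metadata record with the standard library and checks whether it is out of date. The element names (`title`, `owner`, `crs`, `extent`, `modified`) and the 365-day threshold are assumptions for illustration only; they do not follow a particular metadata standard and are not specific to FME.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical metadata record; the element names are illustrative only.
SAMPLE_XML = """
<metadata>
  <title>City Parcels</title>
  <owner>GIS Department</owner>
  <crs>EPSG:4326</crs>
  <extent minx="-123.3" miny="49.0" maxx="-122.5" maxy="49.4"/>
  <modified>2021-03-15</modified>
</metadata>
"""

def parse_metadata(xml_text):
    """Return a flat dict of the fields found in the sample record."""
    root = ET.fromstring(xml_text.strip())
    extent = root.find("extent")
    return {
        "title": root.findtext("title"),
        "owner": root.findtext("owner"),
        "crs": root.findtext("crs"),
        # Bounding box as (minx, miny, maxx, maxy)
        "extent": tuple(
            float(extent.get(key)) for key in ("minx", "miny", "maxx", "maxy")
        ),
        "modified": date.fromisoformat(root.findtext("modified")),
    }

def is_out_of_date(record, today, max_age_days=365):
    """Answer 'Is this data out of date?' using an assumed age threshold."""
    return (today - record["modified"]).days > max_age_days

record = parse_metadata(SAMPLE_XML)
```

Calling `is_out_of_date(record, date.today())` then answers the "Is this data out of date?" question for this one record; a real catalog would apply the same check across every dataset.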
Managing metadata is traditionally a process of storing and cataloging information about a company’s datasets. Data governance teams and software help manage the people and processes around a company’s data.
Here’s where the landscape is changing.
On July 27, Gartner released a Market Guide for Active Metadata Management, reflecting the view that the traditional approach described above is no longer enough. On the same day, Forbes published an article arguing that metadata automation is essential because manual metadata management and catalogs are no longer sufficient.
What is active metadata and why do I need it?
Compared to a traditional metadata inventory, active metadata is about creating an automated platform that lets you gather insight and make business decisions. It’s a richer way of managing metadata that supports analysis and insight generation, which saves time and delivers more value than a basic metadata catalog.
For end users, better metadata means better data quality, as well as more descriptive information surrounding the data that can help drive decisions. Imagine maintaining a data repository that empowers end users to create complex, sophisticated applications—for example, a disaster response workflow that can dynamically get the latest data in a specific location with high accuracy.
An active metadata platform involves a few components:
- Automation. Collect, store, analyze, and report on the metadata continuously.
- Intelligent analysis. Create workflows to analyze and gather insights from the metadata. This can include applying machine learning to it to help manage it, as well as using it to help train a machine learning algorithm.
- Quality control. Automatically identify and fix metadata quality issues.
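The quality control component can be pictured as a set of rules applied to every metadata record. The sketch below is plain Python rather than an FME workflow, and everything in it is an assumption for illustration: the required fields, the `EPSG:<number>` pattern, the ISO date pattern, and the automatic fixes (trimming whitespace and upper-casing the CRS code).

```python
import re

# Assumed rule set; a real catalog would define its own.
REQUIRED_FIELDS = ("title", "owner", "crs", "modified")
CRS_PATTERN = re.compile(r"^EPSG:\d+$")
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def check_record(record):
    """Return (fixed_record, issues) for a dict of string-valued fields."""
    # Automatic fix: trim stray whitespace from every string value.
    fixed = {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }
    issues = []

    # Rule 1: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not fixed.get(field):
            issues.append(f"missing {field}")

    # Rule 2: normalize the CRS code, then check its shape.
    crs = fixed.get("crs")
    if crs:
        crs = crs.upper()
        fixed["crs"] = crs
        if not CRS_PATTERN.match(crs):
            issues.append(f"unrecognized crs: {crs}")

    # Rule 3: the modification date must look like YYYY-MM-DD.
    modified = fixed.get("modified")
    if modified and not DATE_PATTERN.match(modified):
        issues.append(f"bad date: {modified}")

    return fixed, issues
```

Issues that can be repaired safely are fixed in place, while the rest are reported so that a person or a downstream step can decide what to do with them.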
How do I make metadata “active”?
FME is the data integration platform with the best support for spatial data, and it is where you can create workflows to read, write, update, harvest, validate, integrate, and automate metadata. Its visual workflow environment and breadth of format support make it well suited for managing metadata in a sophisticated, automated way. Workflows can connect to sources like CAD, GIS, BIM, rasters, XML-based formats, and many other data types.
FME is great at reading, writing, and processing XML, which makes working with XML metadata easy. Learn more in the XMLTemplater and the XSD XML Reader/Writer documentation.
Data integration workflows make it possible to manage large volumes of metadata without tedious manual intervention. This is especially useful when a lot of data is collected or when it is updated frequently. For example, the Barnsley Metropolitan Borough Council used FME to migrate 10+ years of data and complex metadata into Microsoft SharePoint. The Alabama Power Company uses FME for UAV/drone workflows: UAVs generate a lot of metadata, and FME automatically extracts and consolidates it during the post-processing phase.
UAVs collect a lot of data and metadata. Here, FME is used to georeference photos and display all of the collected data in Google Earth.
Using data integration, you can also perform quality checks and connect to machine learning algorithms or other platforms for further analysis. Bad metadata can be identified and fixed using transformers like the AttributeValidator. For example, Environment and Climate Change Canada (ECCC) uses FME to harvest, transform, and publish open data, which includes a metadata catalog. The workflow includes quality control steps and runs automatically so that thousands of datasets stay up to date.
Metadata can be handled using powerful transformers and functionality related to schema, like the SchemaScanner.
In FME Server, Automations are used to keep metadata up to date by running workspaces on a schedule or in response to an event, such as a database update or the arrival of new information. FME Server also has its own metadata that can be used to optimize when and how jobs are run. See Analyzing Job Statistics in FME Server to learn more.
This Automation triggers an FME Workspace and sends an email on a schedule.
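The decision behind an event-driven or scheduled refresh can be sketched in a few lines. The following is generic Python, not the FME Server API: the function name `datasets_needing_refresh` and the two timestamp dictionaries are hypothetical, and the rule (refresh when the source changed after the last harvest, or was never harvested) is one assumed policy among many.

```python
from datetime import datetime

def datasets_needing_refresh(source_modified, last_harvested):
    """Return names of datasets whose metadata should be re-harvested.

    source_modified: dict of dataset name -> when the source data last changed
    last_harvested:  dict of dataset name -> when its metadata was last collected
    """
    stale = []
    for name, modified in source_modified.items():
        harvested = last_harvested.get(name)
        # Refresh if never harvested, or if the source changed since then.
        if harvested is None or modified > harvested:
            stale.append(name)
    return sorted(stale)

# Example run with made-up timestamps.
source = {
    "parcels": datetime(2023, 5, 2),
    "roads": datetime(2023, 4, 1),
    "zoning": datetime(2023, 1, 1),
}
harvested = {
    "parcels": datetime(2023, 5, 1),
    "roads": datetime(2023, 4, 1),
}
to_refresh = datasets_needing_refresh(source, harvested)
```

A scheduled job would run this kind of check periodically, while an event-driven trigger would skip the comparison and refresh the affected dataset as soon as the change notification arrives.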
With an active metadata platform in place, metadata can be automatically gathered, analyzed, kept up to date, and checked for quality. This frees your team to make decisions and take action, and results in better quality data for both your enterprise and end users.
To learn more about how to manage metadata using FME, sign up for our Metadata webinar. Get started with your active metadata platform by downloading a free trial of FME.