With the help of AI, tedious and manual data processing tasks have become more efficient and scalable than ever. At this year’s Peak of Data & AI conference, we had two standout presentations about applying AI to automate key parts of data workflows. Watch the webinar Peak of Data & AI Encore: AI for Metadata and Smarter Workflows to see these sessions, and read on for an overview.
Generating Synthetic Metadata with AI
Metadata has become more important than ever, acting as the semantic layer AI agents use to navigate organizational data environments. Without high-quality metadata, even the most capable AI models risk misinterpreting datasets—especially when faced with poor naming conventions, incomplete descriptions, or missing context.
In her talk, Grace Cai, Innovation Lead at Shell, discussed the persistent challenge of metadata maintenance. As data volumes grow and semantic requirements increase, maintaining clean, rich metadata has become a critical task, yet one that’s often tedious and quickly outdated.
Using FME and Azure, Grace and her team designed an automated workflow to generate metadata for ArcGIS Online feature services. The pipeline included:
- Data Sampling: Reading sample records from each feature service.
- Prompted AI Descriptions: Feeding attributes and sample values to GPT to generate short field descriptions.
- Semantic Tagging: Using LLMs to classify fields (e.g., zip code, boolean, coordinate).
- Table Summarization: Creating full dataset summaries, including where, what, and who details.
- JSON Output: Structured results formatted for easy integration in FME.
They split the AI tasks into two roles: the “Data Detective” Agent for generating raw metadata, and the “Editor” Agent for refining output and incorporating company-specific acronyms.
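To make the sampling, prompting, and JSON steps concrete, here is a minimal Python sketch of what the "Data Detective" side of such a pipeline might look like. It is not the workflow from the talk, which was built with FME transformers and Azure. The function names, the prompt wording, and the JSON shape are all assumptions made for illustration, and the call to the model itself is left out.

```python
import json
from collections import defaultdict


def sample_values(records, max_per_field=3):
    """Collect up to max_per_field distinct, non-empty sample values per attribute.

    Fields that only ever hold None or "" are omitted from the result.
    """
    samples = defaultdict(list)
    for record in records:
        for field, value in record.items():
            if value is None or value == "":
                continue
            text = str(value)
            if text not in samples[field] and len(samples[field]) < max_per_field:
                samples[field].append(text)
    return dict(samples)


def build_detective_prompt(layer_name, samples):
    """Assemble a prompt asking for field descriptions and semantic tags as JSON.

    The wording and the JSON shape are hypothetical, not taken from the talk.
    """
    lines = [
        f"Dataset: {layer_name}",
        "For each field, give a one-sentence description and a semantic tag",
        "(for example: zip code, boolean, coordinate).",
        'Reply with JSON only: {"fields": [{"name": ..., "description": ..., "tag": ...}]}',
        "Fields and sample values:",
    ]
    for field, values in samples.items():
        lines.append(f"- {field}: {', '.join(values)}")
    return "\n".join(lines)


def parse_field_metadata(reply_text):
    """Turn the model's JSON reply into a dict keyed by field name."""
    data = json.loads(reply_text)
    return {item["name"]: item for item in data["fields"]}
```

In a two-agent setup like the one described, the parsed output of this step would be handed to a second prompt (the "Editor") for refinement. Keeping the reply as structured JSON is what makes that hand-off, and the later integration in FME, straightforward.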
Grace shared several helpful tips and takeaways, including:
- Adding a single line to allow GPT to say “I don’t know” dramatically reduced hallucinations.
- Crowdsourced evaluations were necessary because good human-written metadata often relied on external context that AI didn’t have.
- Balancing token usage, model context windows, and prompt structure was essential to produce affordable, high-quality output.
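The first and last tips can be illustrated with a short Python sketch. The exact line Grace's team added to their prompts is not given here, so the wording below is an assumption, as are the helper names. Character count is used as a rough stand-in for tokens; a real budget would use the tokenizer of the model in question.

```python
UNKNOWN = "I don't know"

# Hypothetical wording for the "escape hatch" line; the talk's exact text is not reproduced.
ESCAPE_HATCH = (
    f'If the samples do not give enough evidence, answer exactly "{UNKNOWN}" '
    "instead of guessing."
)


def add_escape_hatch(prompt):
    """Append the escape-hatch instruction as the last line of a prompt."""
    return prompt.rstrip() + "\n" + ESCAPE_HATCH


def is_unknown(reply):
    """True if the reply is the escape-hatch answer, ignoring case, quotes, and a final period."""
    return reply.strip().strip('."').lower() == UNKNOWN.lower()


def trim_to_budget(values, max_chars):
    """Keep leading sample values while their combined length stays within max_chars."""
    kept, used = [], 0
    for value in values:
        if used + len(value) > max_chars:
            break
        kept.append(value)
        used += len(value)
    return kept
```

Fields for which `is_unknown` returns true can then be routed to a human reviewer rather than published with a guessed description.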
Grace’s presentation demonstrated that AI workflows can offer immense value when it comes to business use cases like metadata generation. Be sure to check out her session for a wealth of practical tips, including which FME transformers she used and how she crafted her AI prompts.
FME AI Checker: FME Workspace quality checks
Alexandre Bijaye from Veremes shared his project, the FME AI Checker, an AI-powered tool for reviewing and annotating FME Workspaces. His goal was to provide FME users with automated quality feedback and optimization suggestions.
This tool was built using basic OpenAI GPT access and FME’s built-in HTTPCaller and JSON tools. It reads the FME Workspace’s XML structure and sends groups of transformer definitions to GPT using structured prompts. It has two modes of operation: Error Checking Mode identifies critical misconfigurations and missing parameters, while Annotation Mode adds human-readable comments to transformers, explaining their function.
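The batching idea can be sketched in a few lines of Python. This is not the FME AI Checker itself, which runs inside FME. The `SAMPLE` document is a made-up, heavily simplified stand-in for a workspace file, and the element name, attributes, mode keys, and instruction wording are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a workspace file; real workspaces carry far more detail.
SAMPLE = """<WORKSPACE>
  <TRANSFORMER TYPE="Tester" IDENTIFIER="1"/>
  <TRANSFORMER TYPE="Reprojector" IDENTIFIER="2"/>
  <TRANSFORMER TYPE="JSONExtractor" IDENTIFIER="3"/>
</WORKSPACE>"""

# Assumed instruction text for the two modes described above.
MODE_INSTRUCTIONS = {
    "error_check": "List critical misconfigurations and missing parameters.",
    "annotate": "Write a one-line comment explaining what each transformer does.",
}


def read_transformers(workspace_xml):
    """Return each TRANSFORMER element serialized back to an XML string."""
    root = ET.fromstring(workspace_xml)
    return [ET.tostring(el, encoding="unicode").strip() for el in root.iter("TRANSFORMER")]


def batches(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_prompts(workspace_xml, mode, batch_size=5):
    """Build one prompt per group of transformers for the chosen mode."""
    instruction = MODE_INSTRUCTIONS[mode]
    groups = batches(read_transformers(workspace_xml), batch_size)
    return [instruction + "\n\n" + "\n".join(group) for group in groups]
```

Calling `build_prompts(SAMPLE, "annotate", batch_size=2)` yields two prompts, the first covering two transformers and the second covering the remaining one. Grouping keeps each request within the model's context window while still giving it neighbouring transformers to reason about.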
The AI is able to perform checks like spotting reversed logic, detecting malformed JSON paths, and flagging geometry mismatches between transformers. It has difficulty in some areas, like domain-specific transformers (e.g., Voronoi, raster) and special configuration parameters, and Alex discussed potential future improvements such as including readers, writers, and parameters in the AI checks.
“Accuracy in AI is not just about getting the right answer; it’s about understanding the confidence and limits of that answer.” – Dr. Fei-Fei Li
Check out Alexandre’s presentation for a deep dive into his project, including the tool’s strengths and limitations, as well as technical tips and takeaways.
Both presentations highlighted the transformative potential of AI and FME. Whether you’re looking to automate metadata generation or improve FME Workspace quality, these tools show that AI can reduce manual work, increase accuracy, and help your data pipelines stay scalable and future-proof. While challenges remain, these projects show that with thoughtful prompt engineering and integration, AI is already delivering real value in the FME ecosystem.