Do you remember the game of telephone from your childhood? Maybe you recall it as the pass-the-message game. It’s where one person starts by whispering a message to a second person, who then whispers the message to a third person, and so on. When the message reaches the final person, they say it out loud to the group, and it’s usually a message that’s drastically different from the original message.
Why does the original message change as it travels from person to person? It could be because it’s difficult to clearly understand someone who’s whispering, or maybe it’s just difficult to perfectly recall the entire message you heard when repeating it to the next person. The point is that when many people are involved in a process, unintentional changes can happen.
Like messages in the telephone game, data can also become altered when passed through the hands of different people. But don’t worry. By prioritizing data integrity and good data management, data can be made accessible, accurate, and usable.
What is Data Integrity?
Organizations spend a lot of money on data management, but data is only valuable if it’s being used to inform decision-making. This is why prioritizing data integrity is so important.
Data integrity is the practice of keeping data accurate, reliable, and consistent throughout its lifecycle. This means keeping data authoritative, useful, and accessible. Data integrity is important for anyone who handles data but especially for those who work with large amounts of changing data.
When data integrity is practiced, you’ll see things like:
- Consistency across all systems using the same datasets
- Users being able to access data when and where they need to
- Datasets being integrated and transformed to make business decisions
Data integrity is also important from a compliance and policy perspective. Many organizations abide by data protection policies such as the General Data Protection Regulation (GDPR), where failure to practice certain standards in data management processes could result in heavy financial penalties.
It’s important to note that data integrity is not the same everywhere and it differs from organization to organization. For example, GIS departments maintain data integrity by ensuring their spatial data is properly stored, projected, and integrated with their spatial software. For those using data for business intelligence, maintaining data integrity will mean validating, analyzing, and visualizing data in ways that are most effective for making business decisions.
Mitigating Risks & Data Corruption
The opposite of data integrity is data corruption, something no one ever wants to experience. When your data is corrupted, any decisions or conclusions you draw from it will be inaccurate and, in many cases, harmful.
To mitigate risks that may lead to data corruption, it’s important to be aware of the types of errors or problems that can occur when it comes to your data.
Hardware failures occur when physical servers or computers stop operating as needed, compromising physical integrity. These failures may stem from causes like overheating, building damage, or break-ins, but they can often be avoided: store hardware in climate-controlled environments, in buildings that are up to code, and take proper security precautions.
Software oversights can include using multiple software products that are incompatible with each other, failing to update software (leaving it prone to bugs and security mishaps), or using software in ways it isn’t intended to be used. Save yourself from potential data errors and increase your logical integrity by using tools that integrate your data and platforms, and by modernizing your data management systems.
Human blunders are going to happen anywhere you go. These errors are caused by small mistakes people make, whether it be a typo, a lack of familiarity with a tool, or miscommunication. Avoid these errors by having well-documented guidelines and checking in with anyone using or collecting data.
With all this in mind, data integrity can be summarized as preventing unintentional changes to data.
Here are some of the best ways to implement and maintain data integrity in your organization.
Identify What You’re Working With
Data comes in many forms, and there’s usually lots of it, especially when you’re using various datasets for different purposes across multiple platforms. Before addressing data integrity as a whole, start by asking yourself these key questions:
- What kind of data formats are you working with?
- What software do you have access to?
- What teams need to be able to access your different datasets?
- Can you easily move data between apps, web services, databases, and file formats?
Specify Data Validation Rules
You want to validate and clean your data early on to set yourself up for success. Otherwise, you may be using inaccurate information leading you towards inaccurate conclusions. “Garbage in, garbage out” as they say.
Once you’ve identified what parameters you’ll need to keep your data standardized and validated, select tools that can help you achieve these outputs automatically. With a data integration platform like FME, you can build data validation workflows that automatically run as soon as new data is added to your systems. It’s quick, simple, and effective.
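To make the idea of validation rules concrete outside of a no-code platform, here is a minimal sketch in Python. The rules and field names (`id`, `email`, `age`) are hypothetical, chosen only to illustrate automatically screening incoming records before they reach downstream systems:

```python
# Minimal validation sketch: check each record against simple rules.
# Field names and rules are hypothetical examples.

RULES = {
    "id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str) and "@" in v,
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
}

def validate(record):
    """Return a list of (field, value) pairs that fail their rule."""
    errors = []
    for field, rule in RULES.items():
        if field not in record or not rule(record[field]):
            errors.append((field, record.get(field)))
    return errors

records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": -5, "email": "not-an-email", "age": 34},
]

clean = [r for r in records if not validate(r)]
rejected = [r for r in records if validate(r)]
```

The key design point is that the rules live in one declarative place, so adding or tightening a rule doesn’t require touching the workflow that applies them.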
Solidify a Data Integration Plan
To keep your data top-notch, you’ll need to create a data integration plan.
Data integration is the process of bringing together data from multiple sources and removing data silos. It’s essential whenever barriers stand between your datasets, whether that’s incompatible formats or data that needs to be transformed to meet requirements.
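At its core, integration means reconciling records about the same entity from different systems into one consistent view. The sketch below assumes two hypothetical sources (a CRM and a billing system) keyed by customer ID, purely to illustrate the merge step:

```python
# Tiny integration sketch: merge records about the same customers
# from two hypothetical sources into one consistent view, keyed by ID.

crm = {101: {"name": "Ana", "city": "Lisbon"}}
billing = {101: {"plan": "pro"}, 102: {"plan": "free", "name": "Ben"}}

def integrate(*sources):
    """Combine any number of keyed sources; later sources fill in
    or overwrite fields from earlier ones."""
    merged = {}
    for source in sources:
        for key, fields in source.items():
            merged.setdefault(key, {}).update(fields)
    return merged

combined = integrate(crm, billing)
```

A real integration plan also has to decide which source wins when fields conflict; here the last source listed takes precedence, which is the simplest possible policy.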
FME is a great platform to use within your data integration plan. With it, you’ll be able to create custom workflows that convert and transform your data, and you can connect to any number of applications or web services. Best of all, it’s a no-code interface.
While everything that’s been mentioned is incredibly useful for data integrity, it’s even better when you can trust systems to maintain data integrity for you. That means automating your processes.
Use FME to automate processes as soon as data is entered, changed, or requested. Or run workflows on a schedule to check for errors. This is a great way to mitigate human error, and it will save you a lot of time.
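The scheduled-check pattern can be sketched with nothing but Python’s standard library. In practice a platform like FME, cron, or another job scheduler would trigger the workflow; this only illustrates the idea of a recurring integrity check, with a placeholder standing in for the real validation:

```python
# Hedged sketch of schedule-driven checking using only the standard
# library. The check itself is a placeholder for a real workflow.

import sched
import time

runs = []  # records each time the check fires

def check_for_errors():
    # Placeholder for a real validation workflow.
    runs.append(time.time())

scheduler = sched.scheduler(time.time, time.sleep)

def run_periodically(interval_s, runs_left):
    check_for_errors()
    if runs_left > 1:
        scheduler.enter(interval_s, 1, run_periodically,
                        (interval_s, runs_left - 1))

# Fire the check three times, a short interval apart.
scheduler.enter(0, 1, run_periodically, (0.01, 3))
scheduler.run()
```

The point is that once the check runs on a trigger instead of a person’s memory, human error is taken out of the loop entirely.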
Creating and maintaining data integrity is an ongoing process, and that’s okay! It’s best to view it as more of a mindset than a task that needs to be accomplished.
For this reason, it’s important to ensure your tools can stay flexible, too. Since FME is a no-code platform, it’s easy to make changes to your workflow to adapt to change.
Keep on Learning
Here are some additional resources that can help you out on your data integrity journey: