Chapter 9: Eliminate the hidden data factory

*Data Quality is For Everyone*

In this chapter, I’ll introduce you to a concept I call the hidden data factory. It will change the way you view your job.

I’ve already noted in Chapter 6, and you already knew, that all of us need data to do our work. Marketeers need data to evaluate how their campaigns are going, financial analysts to evaluate performance, and managers to make the best possible decisions. And none of us ever say, “give me average data.” No, we all want the best! But the unfortunate reality is that most data is laden with errors. The good news is you can make improvements, as discussed in Chapter 5.

When we find an error, we usually simply correct it and move on without much thought. The problem is that errors recur, again and again. And they will do so until we do something about it.

When you get right down to it, most of our jobs have two components: the job and dealing with data issues so we can do the job. Thus, a geologist in an oil company may spend half of her time correcting data errors so she can do her real job, which is finding oil. Or a nurse in a clinic may spend a third of his day tracking patients down because the reach number is wrong. The head of risk management in a bank recently told me that he reckons his staff spends three quarters of their time dealing with data issues and a quarter actually managing risk.

I call this work of dealing with data issues the “hidden data factory,” and every company, department, and job is loaded with such factories.

Here’s why this matters to you and what you have to do. First, recognize those hidden data factories for what they are: non-value-added work that wastes your time and the company’s money.

Next, make it your mission to find and eliminate them. Once you start to look, they’re not so hard to find.

To make them go away, you have to attack them proactively. The secret is finding out where the data was created, working out why the error occurred, and then eliminating the root cause.

This approach works because eliminating a single root cause can eliminate thousands of future errors. And many root causes are stunningly simple. I find that the people who create the data often don’t know they are creating errors. Simply explaining your requirements empowers them. And you get better numbers. Similarly, many automated measurement devices have never been calibrated! Aim to attack these sorts of issues first.

This approach is easier and more powerful than what you’re doing now. But it does require a bit of out-of-the-box thinking. Plenty of people have grown intolerant of hidden data factories. And so should you. It will change your job forever!


“the Data Doc,” helps organizations chart courses to data-driven futures, with special emphasis on quality and data science.