
When the Possible Trumps the Likely

By Jon Miller Updated on November 14th, 2016


About a decade ago I was invited to help mentor the productivity and quality improvement teams at a large consumer electronics manufacturer. During the first visit there were three things that immediately struck me about this company.

First, they were very proud of the quality and reliability of their product. Prior to exploring lean manufacturing, this company had invested heavily in Six Sigma training, certification, and projects.

The second thing that stood out was that these same proud engineers were nowhere to be found on the factory floor, unless they were with me on a gemba walk.

Third, for a company full of smart people who were by no means ignorant of one-piece flow, there were large quantities of work in process inventory in a lot of places. “Kanban,” I was informed.

The quality was inspected in, not built in, at nearly every process. The WIP existed because of a lack of flow. Various inspection points, several of which relied on automated and expensive equipment, were shared. One example was the x-ray system used to look inside the box for any missing items. The root causes of any mistakes found at that point were impossible to trace back through the batches of stock to the originating processes. Standing and observing the processes, I found the failure modes for missing components obvious.

The engineers suffered from an over-reliance on data. From their bird's-eye view, they couldn't imagine their thoroughly designed, professionally engineered, data-supported processes failing. From my ground-level view, I couldn't imagine how they expected to build good products. Even when I invited them to join me on the gemba to confirm or deny their assumptions, we spent too much time looking at computer screens with real-time data and process behavior charts.

This approach had always worked for the company in the past. It had allowed them to become the industry leader over the previous eight years. But when it stopped working, they didn't have the habits and practices to lead them to the sources of problems that were invisible in the data but blazingly obvious on the factory floor. The behavior that had worked for them was hiding behind the data, seeing what they wanted to see, and avoiding visits to the front lines. They were not market leaders by virtue of excellence. They had been lucky, thanks to a lack of strong competition. Sadly for them, that was not to last.

I was reminded of this experience this past week while reflecting on the surprise results of the U.S. presidential election. Neither candidate's polling data predicted the winner. Polling involves collecting data by surveying a sample of people. This data is used to predict the behavior of a larger population. As with all sampling, the larger the sample, the better it represents the larger population, and the lower the statistical error generally is. There is also an important assumption that the data collected comes from a random sample that represents the larger population. Data doesn't lie, but there is always human bias in data selection.
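The relationship between sample size and statistical error can be made concrete. A minimal sketch, using the standard formula for the 95% margin of error of a sampled proportion: the error shrinks with the square root of the sample size, so quadrupling the sample only halves the error.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sampled proportion p
    with n respondents, assuming a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling n halves the margin of error: sqrt(n) in the denominator.
for n in (100, 400, 1600, 6400):
    print(f"n={n:5d}  error = +/-{margin_of_error(0.5, n):.3f}")
```

Note that this formula holds only under the random-sample assumption mentioned above; no amount of extra respondents reduces the error introduced by a biased selection of who gets surveyed.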

“In a poll of likely voters…” is a phrase that is loaded with room for bias and error. One thing we learned from this election is that there was more possibility in the “possible” than likelihood in the “likely”. This is counterintuitive because “likely” means more probable than not, while “possible” includes all levels of probability above zero. By definition, something that is likely is more probable than something that is possible. One of the factors in this election was a combination of likely past voters staying home and unlikely first-time voters making their voices heard. The way pollsters define “likely” may have been revealed to be subjective and open to bias and error this past week.

Popular statistician and author Nate Silver wrote, “Distinguishing the signal from the noise requires both scientific knowledge and self-knowledge.” Even he got the outcome of this election wrong. As with the pollsters, it was not because of a lack of scientific knowledge. Perhaps it was due to a failure of imagination, a failure of empathy, or of understanding the realities on the front lines of data collection – the rural counties and small towns – and how these affected voter behavior. Data stripped bare of an understanding of how things work is useless. This is true of pollsters who fail to recognize their own biases or fail to see unlikely changes in voters, and true of sophisticated process-controlled factories that ignore the error-compounding effects of batch-and-queue.
