Dealing with Non-Normal Data

By Ron Pereira Updated on May 18th, 2026

Originally published on June 15, 2007

Robin, over on the iSixSigma blog, had an interesting post regarding hypothesis testing. Specifically, the question posed was how to deal with non-normal data.

Typically, Six Sigma practitioners are taught to use “non-parametric” tests, such as Mood’s Median Test, when dealing with non-normal data, rather than “parametric” tests such as ANOVA and the two-sample t-test. I wanted to touch on this as I have some opinions to share.

The Technical Issue

The main question here is whether using parametric tests (ANOVA, etc.) with non-normal data will lead us in the wrong direction. One underlying issue (not the only one) concerns the use of the mean and standard deviation in the calculations. If, for example, our data is skewed, the median is usually recommended over the mean as the measure of center. That, in turn, calls into question the measure of dispersion we use (i.e., the standard deviation), since it is calculated from the mean. Why? Let me use an example to explain.

Rich Dad Messes Things Up

In most neighborhoods, the super-wealthy folks at the end of the subdivision are not representative of the rest of us regular Joes.  So their $1.5 million homes can really skew the neighborhood’s mean (average) home price. This, in turn, may influence the standard deviation calculation (which uses the mean in its formula) in a misleading manner. And since the mean and standard deviation are used in most parametric tests, the issues begin to really compound.  Statistics are sort of like dominoes, I suppose.
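To put rough numbers on the domino effect, here is a minimal sketch using only Python's standard library. The home prices are made up for illustration (they are an assumption, not real data); the point is simply to compare the mean, median, and standard deviation with and without the one expensive home.

```python
import statistics

# Hypothetical home prices in dollars (illustrative values, not real data).
typical_homes = [210_000, 225_000, 240_000, 250_000, 260_000, 275_000, 290_000]
with_mansion = typical_homes + [1_500_000]  # add the one $1.5 million home

def summarize(prices):
    """Return the mean, median, and sample standard deviation of the prices."""
    return {
        "mean": statistics.mean(prices),
        "median": statistics.median(prices),
        "stdev": statistics.stdev(prices),
    }

before = summarize(typical_homes)
after = summarize(with_mansion)

for label, stats in (("without mansion", before), ("with mansion", after)):
    print(
        f"{label:>16}: mean={stats['mean']:,.0f}  "
        f"median={stats['median']:,.0f}  stdev={stats['stdev']:,.0f}"
    )
```

With these made-up prices, the median barely moves when the mansion is added, while the mean and the standard deviation both jump sharply.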

What to do?

So what, you may ask, is a person to do when faced with non-normal data?  My personal approach is to study the data using both parametric and non-parametric tests. The funny thing is that, in most cases, the test results tell the same story.

Rather than debating or poring over the mind-numbing statistics books on my desk, I choose to be as speedy as possible while still ensuring I am confident in my conclusion. The extra 45 seconds it takes me to run both tests is a much better use of time than wondering what to do.
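As a rough illustration of running both tests side by side, here is a sketch in plain Python. Several things here are assumptions on my part: the two samples are invented, right-skewed cycle times; the parametric side is Welch's t statistic with a p-value taken from the normal approximation (a real statistics package would use the t distribution); and the non-parametric side is a two-group Mood's median test without a continuity correction. The function names are hypothetical helpers, not part of any library.

```python
import math
import statistics

def welch_t_normal_approx(a, b):
    """Welch's t statistic; two-sided p-value from the normal approximation.

    The normal approximation is an assumption made to stay in the standard
    library; it is only reasonable for moderately large samples.
    """
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, p

def moods_median_test(a, b):
    """Two-group Mood's median test (no continuity correction).

    Builds a 2x2 table of counts above / not above the pooled median and
    computes a chi-square statistic with 1 degree of freedom. Assumes both
    rows of the table are non-empty.
    """
    groups = (a, b)
    grand_median = statistics.median(a + b)
    above = [sum(x > grand_median for x in g) for g in groups]
    not_above = [len(g) - k for g, k in zip(groups, above)]
    n = len(a) + len(b)
    chi2 = 0.0
    for row in (above, not_above):
        for observed, g in zip(row, groups):
            expected = sum(row) * len(g) / n
            chi2 += (observed - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return chi2, p

# Hypothetical right-skewed cycle times in minutes (illustrative only).
process_a = [12, 14, 15, 15, 16, 17, 18, 19, 21, 22, 25, 48]
process_b = [18, 20, 22, 23, 24, 26, 27, 29, 31, 33, 38, 60]

t_stat, t_p = welch_t_normal_approx(process_a, process_b)
chi2, mood_p = moods_median_test(process_a, process_b)

alpha = 0.05
print(f"Welch t (normal approx.): t={t_stat:.3f}, p={t_p:.4f}, reject={t_p < alpha}")
print(f"Mood's median test:       chi2={chi2:.3f}, p={mood_p:.4f}, reject={mood_p < alpha}")
```

For this invented data set both tests point the same way at the 0.05 level. When they disagree, that is the signal to slow down and look harder at the data.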

Don’t get carried away

Six Sigma is often criticized for its “analysis paralysis” approach to problem-solving. Hypothesis testing is powerful and should be used by all continuous improvement practitioners, lean and Six Sigma alike. But with that said, it’s the long, drawn-out debates, such as which test to use in a given situation, that frustrate people. So my advice is to stop debating and do both tests… then get back to the gemba, the place where the work is done, and make something else better!


  1. Rob

    June 16, 2007 - 1:23 am

    This article [http://tinyurl.com/yrprzg] suggests five “solutions” to handling non-normal data:
    * Subgroup averaging
    * Segmenting data
    * Transforming data
    * Using different distributions
    * Non-parametric statistics

    I tend to fit the distribution first, then try to transform the data, and finally turn to non-parametric statistics. However, I agree with you: sometimes I just run the numbers and use practical understanding of the process as well.
