Page 169 - Six Sigma Advanced Tools for Black Belts and Master Black Belts

        Goodness-of-Fit Tests for Normality
        statistical models can range from theoretical parametric distributional models to more
        empirical nonparametric or distribution-free models. Apart from the nature of the
        processes which generate the data, the ‘appropriateness’ of a statistical model is
        inadvertently also a function of the data collection and analysis processes. The decision
        on whether a statistical model is appropriate for a particular data set is typically
        underpinned by the three fundamental considerations of alignment with theoretical
        process assumptions, robustness to departures from these assumptions, and
        downstream data-analysis procedures. On top of these considerations, the model has to be
        judged on how well it represents the actual data. In order to achieve this, an entire
        class of statistical techniques known as ‘goodness-of-fit’ (GOF) tests has been
        developed. Some of the more popular GOF techniques for assessing the adequacy of the
        normal distribution in representing the data are reviewed in this chapter. Such GOF
        tests of normality are commonly encountered in Six Sigma applications as many Six
        Sigma statistical techniques rely on the normality assumption.
          The fundamental statistical hypothesis testing concepts underlying GOF tests are
        discussed in Section 11.2 as a precursor to setting the correct framework for
        appropriate applications of these tests. This is followed by a discussion of several popular
        GOF tests. The basic concepts are presented together with the pros and cons
        associated with each of these tests. In order to aid understanding, the application procedure
        is discussed through practical examples for all these tests.



           11.2 UNDERLYING PRINCIPLES OF GOODNESS-OF-FIT TESTS

        GOF tests were developed primarily from fundamental concepts in statistical
        hypothesis testing attributed to Neyman and Pearson.³ In statistical hypothesis testing, there
        is always a statement of a ‘null’ hypothesis and an ‘alternative’ hypothesis which are
        sets of mutually exclusive possibilities in a sample space. For a typical statistical GOF
        test these are defined as follows:

          $H_0: F(x) = F_0(x)$  vs.  $H_1: F(x) \neq F_0(x)$

        where $F_0(x)$ is some hypothesized distribution function. In Six Sigma applications
        this is usually taken to be the normal distribution. It must be stressed
        that the basic intent here is to limit the risk of a severe departure from normality.
        Generally, the primary aim is not to claim that a particular hypothesized model,
        represented by the distribution function $F_0(x)$, is proven to be representative of the
        real data, but to warn of a significant departure from $F_0(x)$. It should also be noted
        that, as ‘all models are wrong’, any preconceived $F_0(x)$ is always open to rejection as
        more information becomes available (i.e. as the sample size increases).
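
          The sample-size remark can be illustrated with a minimal sketch. It is not taken
        from the chapter: it assumes a moment-based statistic (the Jarque-Bera form) as the
        test, its asymptotic chi-square reference distribution with 2 degrees of freedom, and
        a hypothetical helper, idealised_lognormal_sample, that returns evenly spaced
        quantiles of a mildly skewed lognormal distribution rather than real process data.
        The asymptotic p-value is known to be rough for small samples, so the output should
        be read qualitatively only.

```python
import math
from statistics import NormalDist


def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for H0: the data are normal."""
    n = len(xs)
    mean = sum(xs) / n
    # Central sample moments with divisor n.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Assumption: chi-square(2) reference distribution, whose survival
    # function is exactly exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


def idealised_lognormal_sample(n, sigma=0.2):
    """Hypothetical helper: n evenly spaced quantiles of exp(sigma * Z)."""
    z = NormalDist()
    return [math.exp(sigma * z.inv_cdf((i + 0.5) / n)) for i in range(n)]


def reject_normality(xs, alpha=0.05):
    """Decision rule: reject H0 when the p-value falls below alpha."""
    _, p = jarque_bera(xs)
    return p < alpha


if __name__ == "__main__":
    # Same mildly non-normal shape, increasing sample size.
    for n in (20, 200, 5000):
        jb, p = jarque_bera(idealised_lognormal_sample(n))
        print(f"n={n:5d}  JB={jb:9.3f}  p={p:.4g}  reject H0: {p < 0.05}")
```

          The shape of the data is the same at every sample size; only the amount of
        information changes. A failure to reject at small n is therefore not evidence that
        $F_0(x)$ is the true model.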
          GOF tests fall naturally into four broad categories: (1) methods based on discrete
        classification of data (Pearson $\chi^2_P$); (2) empirical distribution function (EDF) based
        methods; (3) regression based methods; and (4) methods based on sample moments.
        Tests of each type are discussed in Sections 11.3–11.7. An example is used to
        demonstrate the application of each test. Finally, the power of these tests for the data set in the
        example is compared.
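
          As a rough orientation to the four categories, the sketch below computes one
        illustrative statistic of each type against a normal distribution fitted by the
        sample mean and standard deviation. The particular choices are assumptions made
        here, not necessarily the tests presented in the later sections: equiprobable cells
        for the Pearson statistic, the Kolmogorov-Smirnov distance for the EDF category,
        a probability-plot correlation with plotting positions (i - 0.5)/n for the
        regression category, and sample skewness and kurtosis for the moment category.
        Only the statistics are computed; critical values are omitted because they depend
        on the test and on the parameters having been estimated from the data.

```python
import math
from statistics import NormalDist, fmean, stdev


def fitted_normal(xs):
    """Normal distribution with mean and standard deviation estimated from xs."""
    return NormalDist(fmean(xs), stdev(xs))


def pearson_chi_square(xs, k=5):
    """(1) Discrete classification: k equiprobable cells under the fitted normal."""
    n = len(xs)
    dist = fitted_normal(xs)
    counts = [0] * k
    for x in xs:
        cell = min(int(dist.cdf(x) * k), k - 1)
        counts[cell] += 1
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)


def ks_distance(xs):
    """(2) EDF-based: largest gap between the EDF and the fitted normal CDF."""
    n = len(xs)
    dist = fitted_normal(xs)
    d = 0.0
    for i, x in enumerate(sorted(xs)):
        f = dist.cdf(x)
        # The EDF jumps from i/n to (i+1)/n at the i-th ordered value.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def probability_plot_correlation(xs):
    """(3) Regression-based: correlation of ordered data with normal scores."""
    n = len(xs)
    z = NormalDist()
    scores = [z.inv_cdf((i + 0.5) / n) for i in range(n)]
    ordered = sorted(xs)
    mx, ms = fmean(ordered), fmean(scores)
    sxy = sum((x - mx) * (s - ms) for x, s in zip(ordered, scores))
    sxx = sum((x - mx) ** 2 for x in ordered)
    sss = sum((s - ms) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * sss)


def skewness_kurtosis(xs):
    """(4) Moment-based: sample skewness and kurtosis (0 and 3 for a normal)."""
    n = len(xs)
    mean = fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    # Hypothetical data: evenly spaced quantiles of an exponential distribution.
    data = [-math.log(1.0 - (i + 0.5) / 50) for i in range(50)]
    print("chi-square:", pearson_chi_square(data))
    print("KS distance:", ks_distance(data))
    print("plot correlation:", probability_plot_correlation(data))
    print("skewness, kurtosis:", skewness_kurtosis(data))
```

          Large values of the first two statistics, a correlation well below 1, or
        skewness and kurtosis far from 0 and 3 each point towards a departure from
        normality, but each is sensitive to a different kind of departure.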