Page 179 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
        Goodness-of-Fit Tests for Normality
        commonly available Microsoft Excel platform. The fundamental concepts underlying
        this transformation are discussed first, followed by a worked example using Minitab
        and Microsoft Excel to demonstrate the usefulness of these techniques.

        11.5.1.1  Fundamental concepts in probability plotting
        Probability plots are essentially plots of the quantile values against the corresponding
        ranked observations, $x_{(i)}$. Hence, in general, they can be represented in the following
        functional form:

          $$x_{(i)} = F^{-1}[p(x_{(i)}; n)].$$

        Here, the probability value $p(x_{(i)}; n)$ used to evaluate the quantile values can be
        estimated from

          $$p(x_{(i)}; n) \approx \frac{i - 0.5}{n}. \tag{11.7}$$
          The underlying principles of probability plotting are based on the expected value
        of an ordered observation, $E(x_{(i)}; n)$.⁷ Each ordered random observation in a sample
        of size n corresponds to one such expected value. In a single sample of size n, each
        ordered random observation $x_{(i)}$ is a one-sample estimate of $E(x_{(i)}; n)$. Hence, when
        each of these ordered observations is plotted against its expected value, the points
        should lie approximately along a straight line through the origin with slope 1.
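
        The principle above can be checked numerically. The following is a minimal sketch, not
        the authors' procedure: it assumes a standard normal distribution, estimates each
        $E(x_{(i)}; n)$ by averaging ordered observations over many simulated samples, and
        compares the result with the approximation $F^{-1}((i - 0.5)/n)$. The function names,
        the sample size n = 10, the number of replications and the seed are all illustrative
        assumptions.

```python
import random
from statistics import NormalDist


def simulated_expected_order_stats(n, replications, seed=0):
    """Monte Carlo estimate of E(x_(i); n) for standard normal samples."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    totals = [0.0] * n
    for _ in range(replications):
        ordered = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        for k, value in enumerate(ordered):
            totals[k] += value
    return [total / replications for total in totals]


def approx_expected_order_stats(n):
    """Approximation F^{-1}((i - 0.5) / n), with F the standard normal CDF."""
    std_normal = NormalDist(0.0, 1.0)
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def slope_through_origin(xs, ys):
    """Least-squares slope of ys on xs for a line forced through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


n = 10
simulated = simulated_expected_order_stats(n, replications=5000)
approx = approx_expected_order_stats(n)
slope = slope_through_origin(approx, simulated)

for i, (a, s) in enumerate(zip(approx, simulated), start=1):
    print(f"i={i:2d}  approx={a:+.3f}  simulated={s:+.3f}")
print(f"slope through origin: {slope:.3f}")
```

        Under these assumptions the fitted slope should come out close to, but somewhat below,
        1, because the value c = 0.5 slightly overstates the extreme expected values for the
        normal distribution.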
          The expected value of an ordered observation in a sample is distribution-dependent.
        For most distributions, the expected value of the ith observation can be estimated from


          $$E(x_{(i)}; n) \approx F^{-1}\left(\frac{i - c}{n - 2c + 1}\right), \qquad c \in [0, 0.5].$$
        This is essentially the $((i - c)/(n - 2c + 1))$th quantile of the distribution evaluated at
        $x_{(i)}$. The constant c is a function of both the hypothesized distribution and the sample
        size. A value of c = 0.5 is generally acceptable for a wide variety of distributions
        and sample sizes; since n − 2(0.5) + 1 = n, it gives the estimate shown in equation (11.7).
        For the uniform distribution, c is taken to be zero and the expected value is given by

          $$E(x_{(i)}; n) \approx F^{-1}\left(\frac{i}{n + 1}\right).$$
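
        The two special cases of the constant c can be confirmed with a few lines of code.
        This is a minimal sketch: the function name `plotting_position` and the sample size
        n = 5 are illustrative assumptions, not taken from the text.

```python
def plotting_position(i, n, c=0.5):
    """Probability value (i - c) / (n - 2c + 1) for the i-th ordered observation."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("c must lie in [0, 0.5]")
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    return (i - c) / (n - 2.0 * c + 1.0)


n = 5  # illustrative sample size

# c = 0.5 reduces to (i - 0.5) / n, the form of equation (11.7).
midpoint = [plotting_position(i, n, c=0.5) for i in range(1, n + 1)]

# c = 0 reduces to i / (n + 1), the form used for the uniform distribution.
uniform = [plotting_position(i, n, c=0.0) for i in range(1, n + 1)]

for i, (p_mid, p_uni) in enumerate(zip(midpoint, uniform), start=1):
    print(f"i={i}  c=0.5: {p_mid:.4f}  c=0: {p_uni:.4f}")
```

        Passing these probability values through the quantile function $F^{-1}$ of the
        hypothesized distribution gives the approximate expected values of the ordered
        observations.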



        11.5.1.2  Linearizing the CDF
        In the absence of convenient probability plotting paper or statistical software offering
        facilities for automated plotting, ordinary (linear) graph paper can be used in conjunction
        with linearization of the hypothesized CDF. The CDFs of many common distributions
        can be linearized by taking advantage of the structure of the quantile function. Such
        linearization transforms the data so that they can be plotted against the cumulative
        percentage of observations, or CDF ($F_0$), on ordinary graph paper. If the resulting
        points fall roughly on a straight line, the same assessment can be made as with a
        probability plot: the data can be adequately described by the normal distribution.
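
        For the normal case, one way to carry out such a linearization is to use the quantile
        relation $x = \mu + \sigma z$ with $z = \Phi^{-1}(F_0)$, so that the ordered observations
        plot as a straight line against z. The sketch below illustrates this under stated
        assumptions: the data values are hypothetical, $F_0$ is estimated with equation (11.7),
        and the least-squares line and $R^2$ are added here only as a numerical stand-in for
        judging straightness by eye. The function name `linearized_normal_fit` is illustrative.

```python
from statistics import NormalDist


def linearized_normal_fit(data):
    """Return (z, ordered, intercept, slope, r_squared) for a linearized normal CDF.

    z[i] is the standard normal quantile of the estimated CDF value
    (i - 0.5) / n; a straight-line relation between ordered and z
    supports the normal distribution.
    """
    n = len(data)
    ordered = sorted(data)
    std_normal = NormalDist()
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

    z_mean = sum(z) / n
    x_mean = sum(ordered) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_xx = sum((xi - x_mean) ** 2 for xi in ordered)
    s_xz = sum((zi - z_mean) * (xi - x_mean) for zi, xi in zip(z, ordered))

    slope = s_xz / s_zz                      # rough estimate of sigma
    intercept = x_mean - slope * z_mean      # rough estimate of mu
    r_squared = s_xz ** 2 / (s_zz * s_xx)    # closeness to a straight line
    return z, ordered, intercept, slope, r_squared


# Hypothetical measurements, used only to exercise the function.
sample = [9.8, 10.4, 9.1, 10.9, 10.1, 9.5, 10.6, 10.0, 9.3, 11.2]
z, ordered, intercept, slope, r_squared = linearized_normal_fit(sample)

for zi, xi in zip(z, ordered):
    print(f"z={zi:+.3f}  x={xi:.1f}")
print(f"intercept={intercept:.3f}  slope={slope:.3f}  R^2={r_squared:.3f}")
```

        The (z, x) pairs are the coordinates that would be plotted by hand on ordinary graph
        paper; the intercept and slope of the line give rough estimates of the mean and
        standard deviation.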