Page 180 - Six Sigma Advanced Tools for Black Belts and Master Black Belts

                               Regression-based Approaches                   165
        The CDF of a normally distributed random variable $x$ ($x \sim N(\mu, \sigma)$), in terms of the
      standard normal CDF, $\Phi(\cdot)$, is given by

        $$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right).$$
      Hence, the quantile function, $x_{(i)}$, for the $[F(x_{(i)})]$th quantile corresponding to the $i$th
      ordered observation ($x_{(i)}$), in terms of the standard normal CDF, $\Phi(\cdot)$, is given by

        $$x_{(i)} = \mu + \Phi^{-1}[F(x_{(i)})]\,\sigma.$$
      In addition, consequent to the rank-ordering of sample observations, the expected
      value of ordered observations at each $[F(x_{(i)})]$th quantile can be approximated by

        $$E(x_{(i)}; n) \approx \Phi^{-1}\left(\frac{i - 0.5}{n}\right).$$
      Each ordered observation can then be plotted against the expected value or $\Phi^{-1}(\cdot)$.
      This should give a straight line with intercept μ and slope σ if the normal distribution
      appears adequate for representing the data.
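
        The construction described above can be sketched in a few lines of Python. This is
      a minimal illustration only, using the standard library; the function names and the
      sample values are hypothetical and are not taken from the text. It orders the data,
      maps the plotting positions (i − 0.5)/n through the inverse standard normal CDF, and
      fits a least-squares line whose intercept and slope serve as rough estimates of the
      mean and standard deviation.

```python
from statistics import NormalDist


def probability_plot_points(data):
    """Return (theoretical quantiles, ordered observations) for a normal plot."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n for i = 1..n, passed through the inverse CDF.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return quantiles, ordered


def fit_line(z, x):
    """Ordinary least-squares fit of x = intercept + slope * z."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    return intercept, slope


# Hypothetical sample, used only to exercise the functions.
sample = [9.8, 10.4, 10.1, 9.5, 10.9, 10.0, 9.7, 10.3, 10.6, 9.9]
z, x = probability_plot_points(sample)
mu_hat, sigma_hat = fit_line(z, x)
print(f"intercept (estimate of mean): {mu_hat:.3f}")
print(f"slope (estimate of standard deviation): {sigma_hat:.3f}")
```

        Because the plotting positions are symmetric about 0.5, the theoretical quantiles
      average to zero, so the fitted intercept coincides with the sample mean. Plotting x
      against z and judging how closely the points follow the fitted line is the visual
      check described above.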
        Here, MINITAB and Microsoft Excel are used to generate the normal probability
      plots. However, such plots can also be generated easily with normal probability plotting
      paper or standard graph paper, as already discussed.

      11.5.1.3 Using MINITAB and Excel
      To conclude this section, MINITAB and Excel were used to produce probability plots
      for the data set in Table 11.1. Figure 11.3 shows the probability plot generated by MINITAB
      together with the Anderson--Darling statistic generated for this set of residuals; also
      plotted are the corresponding confidence limits. As none of the residuals lies outside
      the 95% confidence interval bands, the null hypothesis that the residuals come from a
      normal distribution can be retained at the 5% level of significance. In fact, based
      on the $A^2$ statistic, the level of significance can be as high as 25%, as shown by the
      p-value in Figure 11.3.
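
        For readers without MINITAB, the $A^2$ statistic itself can be sketched as follows.
      This is an illustrative computation under stated assumptions, not MINITAB's
      implementation: it uses the standard computational form
      $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\,[\ln F(x_{(i)}) + \ln(1 - F(x_{(n+1-i)}))]$,
      with $F$ taken as the normal CDF fitted from the sample mean and sample standard
      deviation. The residual values are hypothetical (they are not the data of Table 11.1),
      and no p-value is computed.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Anderson-Darling A^2 for normality, parameters estimated from the data."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumption: fit the normal distribution with the sample mean and the
    # sample standard deviation (n - 1 divisor).
    fitted = NormalDist(mean(ordered), stdev(ordered))
    cdf = [fitted.cdf(v) for v in ordered]
    total = 0.0
    for i in range(1, n + 1):
        # cdf[i - 1] is F(x_(i)); cdf[n - i] is F(x_(n+1-i)).
        total += (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
    return -n - total / n


# Hypothetical residuals, used only to exercise the function.
residuals = [-1.2, 0.4, 0.9, -0.3, 1.5, -0.8, 0.1, 0.6, -0.5, -0.7]
a2 = anderson_darling_normal(residuals)
print(f"A^2 = {a2:.4f}")
```

        Small values of $A^2$ indicate close agreement between the ordered data and the
      fitted normal distribution. Because the parameters are estimated from the sample, the
      statistic is unchanged when the data are shifted or rescaled.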
        In Excel, the inverse of the standard normal CDF, $\Phi^{-1}(\cdot)$, can be easily evaluated
      using the NORMSINV function. A probability plot can then be generated as shown
      in Figure 11.4. A best-fit trendline can be drawn through the data points using Excel
      graphing tools. The intercept of this linear fit provides an estimate of the mean and
      its slope provides an estimate of the standard deviation.


      11.5.2 Shapiro--Wilk Test
      Given the above discussion on the importance of linearity in probability plots, statistics
      have been developed to measure such linearity. The Shapiro--Wilk statistic is one
      of these. It is commonly used in GOF tests for normality and lognormality as its
      underlying distribution is based on the normal distribution.
        The Shapiro--Wilk GOF test is a relatively powerful GOF test for normality and is
      usually recommended for cases with limited sample data. The following test statistic
      is computed:

        $$W = \frac{b^2}{S^2}. \tag{11.8}$$