Page 180 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
Regression-based Approaches 165
The CDF of a normally distributed random variable $x$ ($x \sim N(\mu, \sigma)$), in terms of the
standard normal CDF, $\Phi(\cdot)$, is given by
\[ F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right). \]
Hence, the quantile function, $x_{(i)}$, for the $[F(x_{(i)})]$th quantile corresponding to the $i$th
ordered observation ($x_{(i)}$), in terms of the standard normal CDF, $\Phi(\cdot)$, is given by
\[ x_{(i)} = \mu + \Phi^{-1}[F(x_{(i)})]\,\sigma. \]
In addition, as a consequence of the rank-ordering of sample observations, the expected
value of the ordered observations at each $[F(x_{(i)})]$th quantile can be approximated by
\[ E(x_{(i)}; n) \approx \Phi^{-1}\left(\frac{i - 0.5}{n}\right). \]
Each ordered observation can then be plotted against the expected value or $\Phi^{-1}(\cdot)$.
This should give a straight line with intercept μ and slope σ if the normal distribution
appears adequate for representing the data.
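
The plotting procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used by any particular package: the function names and the sample values are hypothetical, and the line is fitted by ordinary least squares, which is one reasonable choice among several.

```python
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Return (zs, xs): zs[i-1] = Phi^{-1}((i - 0.5)/n), xs[i-1] = i-th ordered observation."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return zs, xs


def fit_line(zs, xs):
    """Least-squares fit of x = intercept + slope * z."""
    z_bar, x_bar = mean(zs), mean(xs)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    return intercept, slope


# Hypothetical sample, used only to exercise the functions.
sample = [9.1, 10.4, 9.8, 11.2, 10.0, 8.7, 10.9, 9.5]
zs, xs = normal_plot_points(sample)
mu_hat, sigma_hat = fit_line(zs, xs)  # intercept estimates mu, slope estimates sigma
print(f"estimated mean = {mu_hat:.3f}, estimated sigma = {sigma_hat:.3f}")
```

Plotting `xs` against `zs` gives the normal probability plot; points that fall close to the fitted line suggest that the normal distribution is adequate for the data.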
Here, MINITAB and Microsoft Excel are used to generate the normal probability
plots. However, such plots can also be generated easily by hand on normal probability
plotting paper or standard graph paper, as already discussed.
11.5.1.3 Using MINITAB and Excel
To conclude this section, MINITAB and Excel were used to produce probability plots
for the data set in Table 11.1. Figure 11.3 shows the probability plot generated by MINITAB
together with the Anderson–Darling statistic computed for this set of residuals; the
corresponding confidence limits are also plotted. As none of the residuals lies outside
the 95 % confidence bands, the null hypothesis that the residuals come from a
normal distribution can be retained at the 5 % level of significance. In fact, based
on the $A^2$ statistic, the null hypothesis would be retained at significance levels as high
as 25 %, as shown by the p-value in Figure 11.3.
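
For readers who wish to see how an Anderson–Darling statistic can be computed, the following Python sketch uses the standard textbook form of $A^2$ for a normal distribution whose mean and standard deviation are estimated from the sample. It is not a description of MINITAB's internal computation, the function name and sample values are hypothetical, and the small-sample adjustment shown is the commonly quoted Stephens factor. No p-value is computed here.

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (A2, A2_adjusted) for a normal fit with estimated mean and sigma."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # parameters estimated from the sample
    cdf = [fitted.cdf(x) for x in xs]
    # A2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    total = sum(
        (2 * i - 1) * (log(cdf[i - 1]) + log(1.0 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    # Small-sample adjustment commonly used when both parameters are estimated.
    a2_adjusted = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_adjusted


# Hypothetical residuals, used only to exercise the function.
residuals = [-1.2, 0.4, 0.1, 1.5, -0.3, -0.8, 0.9, 0.2, -0.5, 0.6]
a2, a2_adj = anderson_darling_normal(residuals)
print(f"A2 = {a2:.4f}, adjusted A2 = {a2_adj:.4f}")
```

Smaller values of the statistic indicate closer agreement between the empirical and fitted CDFs; the statistic is then compared with tabulated critical values or converted to a p-value by the software.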
In Excel, the inverse of the standard normal CDF, $\Phi^{-1}(\cdot)$, can be easily evaluated
using the NORMSINV function. A probability plot can then be generated as shown
in Figure 11.4. A best-fit trendline can be drawn through the data points using Excel
graphing tools. The intercept of this linear fit provides an estimate of the mean and
its slope provides an estimate of the standard deviation.
11.5.2 Shapiro–Wilk Test
Given the above discussion on the importance of linearity in probability plots, statistics
have been developed to measure such linearity. The Shapiro–Wilk statistic is one
of these. It is commonly used in GOF tests for normality and lognormality, as its
underlying distribution is based on the normal distribution.
The Shapiro–Wilk GOF test is a relatively powerful test for normality and is
usually recommended for cases with limited sample data. The following test statistic
is computed:
\[ W = \frac{b^2}{S^2}. \tag{11.8} \]