Page 179 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
August 31, 2006
Goodness-of-Fit Tests for Normality
commonly available Microsoft Excel platform. The fundamental concepts underlying
this transformation are discussed first, followed by a worked example using Minitab
and Microsoft Excel to demonstrate the usefulness of these techniques.
11.5.1.1 Fundamental concepts in probability plotting
Probability plots are essentially plots of the quantile values against the corresponding
ranked observations $x_{(i)}$. Hence, in general, they can be represented in the following
functional form:
\[ x_{(i)} = F^{-1}\left[ p(x_{(i)}; n) \right]. \]
Here, the probability value $p(x_{(i)}; n)$ for evaluating the quantile values can be
estimated from
\[ p(x_{(i)}; n) \approx \frac{i - 0.5}{n}. \tag{11.7} \]
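As an illustration of equation (11.7), the following minimal sketch computes the plotting positions and the matching standard normal quantiles for a small sample. The sample values and the function names `plotting_positions` and `normal_quantiles` are hypothetical, chosen only for this example; the normal distribution is assumed as the hypothesized $F$.

```python
from statistics import NormalDist


def plotting_positions(n):
    """Return p_i = (i - 0.5) / n for i = 1..n, as in equation (11.7)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def normal_quantiles(n):
    """Return the standard normal quantiles F^{-1}(p_i) for i = 1..n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p) for p in plotting_positions(n)]


# Hypothetical sample, used only to show how the plot coordinates are paired.
sample = [4.1, 5.3, 4.8, 6.0, 5.1]
ordered = sorted(sample)  # ranked observations x_(1) <= ... <= x_(n)

# Each pair is (theoretical quantile, ranked observation): one plotted point.
points = list(zip(normal_quantiles(len(ordered)), ordered))
for quantile, observation in points:
    print(f"{quantile:8.4f}  {observation:6.2f}")
```

Plotting the second column against the first gives the normal probability plot for this sample.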
The underlying principles of probability plotting are based on the expected value$^{7}$
of an ordered observation, $E(x_{(i)}; n)$. Each ordered random observation in a sample
of size $n$ corresponds to one such expected value. In a single sample of size $n$, each
ordered random observation $x_{(i)}$ is a one-sample estimate of $E(x_{(i)}; n)$. Hence, when
each of these ordered random observations is plotted against its expected value, the points
should lie approximately along a straight line through the origin with slope 1.
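The idea that each ranked observation estimates its own expected value can be checked numerically. The sketch below is not from the text: it assumes a standard normal distribution and approximates $E(x_{(i)}; n)$ by averaging the ranked values over many simulated samples. The function name, repetition count and seed are arbitrary choices for illustration.

```python
import random


def simulated_expected_order_stats(n, reps=20000, seed=0):
    """Approximate E(x_(i); n) for a standard normal by Monte Carlo averaging."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [0.0] * n
    for _ in range(reps):
        ordered = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        for k, value in enumerate(ordered):
            totals[k] += value  # accumulate the k-th smallest value
    return [total / reps for total in totals]


expected = simulated_expected_order_stats(5)
print([round(value, 2) for value in expected])
```

Ranked observations from any one normal sample of size 5, plotted against these averages, should scatter about the line through the origin with slope 1.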
The expected value of an ordered observation in a sample is distribution-dependent.
For most distributions, the expected value of the $i$th observation can be estimated from
\[ E(x_{(i)}; n) \approx F^{-1}\left( \frac{i - c}{n - 2c + 1} \right), \quad c \in [0, 0.5]. \]
This is essentially the $((i - c)/(n - 2c + 1))$th quantile of the distribution evaluated at
$x_{(i)}$. The constant $c$ is a function of both the hypothesized distribution and the sample
size. A value of $c = 0.5$ is generally acceptable for a wide variety of distributions
and sample sizes; substituting it gives $(i - 0.5)/n$, the estimate shown in equation (11.7).
For the uniform distribution, $c$ is taken to be zero and the expected value is given by
\[ E(x_{(i)}; n) \approx F^{-1}\left( \frac{i}{n + 1} \right). \]
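The two special cases can be verified directly from the general expression. The sketch below uses a hypothetical helper named `plotting_position`; it evaluates only the probability argument $(i - c)/(n - 2c + 1)$, not the quantile function itself.

```python
def plotting_position(i, n, c=0.5):
    """Return (i - c) / (n - 2c + 1), the probability passed to F^{-1}."""
    if not 0.0 <= c <= 0.5:
        raise ValueError("c is expected to lie in [0, 0.5]")
    if not 1 <= i <= n:
        raise ValueError("i must be a rank between 1 and n")
    return (i - c) / (n - 2 * c + 1)


n = 10
# c = 0.5 reduces to (i - 0.5) / n, the form of equation (11.7).
default_positions = [plotting_position(i, n) for i in range(1, n + 1)]
# c = 0 reduces to i / (n + 1), the form used for the uniform distribution.
uniform_positions = [plotting_position(i, n, c=0.0) for i in range(1, n + 1)]

print(default_positions[:3])
print(uniform_positions[:3])
```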
11.5.1.2 Linearizing the CDF
In the absence of convenient probability plotting paper or statistical software offering
facilities for automated plotting, ordinary (linear) graph paper can be used in conjunction with
linearization of the hypothesized CDF. The CDFs of many common distributions
can be linearized by taking advantage of the structure of the quantile function. Such
linearization transforms the data so that they can be plotted against the cumulative
percentage of observations, or CDF ($F_0$), on ordinary graph paper. If the
plotted points fall roughly on a straight line, one can conclude, just as with a probability
plot, that the data are adequately described by the normal distribution.
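For the normal case, one way to read this linearization is that $F_0(x) = \Phi((x - \mu)/\sigma)$ implies $x = \mu + \sigma\,\Phi^{-1}(F_0)$, so the ranked observations plotted against $z_i = \Phi^{-1}(p_i)$ should fall on a line with intercept $\mu$ and slope $\sigma$. The following sketch assumes the plotting positions of equation (11.7); the sample values and function names are hypothetical, and the least-squares line is only one convenient way to summarize the plotted points.

```python
from statistics import NormalDist, mean


def linearized_normal_points(sample):
    """Return (z, ordered): z_i = Phi^{-1}((i - 0.5) / n) and the ranked data."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return z, ordered


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical sample, used only to demonstrate the calculation.
sample = [9.2, 10.1, 10.8, 11.5, 9.9, 10.4, 8.7, 11.0]
z, ordered = linearized_normal_points(sample)
sigma_hat, mu_hat = least_squares_line(z, ordered)
print(f"slope (sigma estimate): {sigma_hat:.3f}")
print(f"intercept (mu estimate): {mu_hat:.3f}")
```

The pairs in `z` and `ordered` are the coordinates that would be marked on ordinary graph paper; the fitted slope and intercept are rough graphical estimates, not a formal test of normality.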