It is practically impossible to determine the true value of the population parameters
μ and σ via a finite sample of size n. Thus, sample statistics are used. Suppose that
x_1, x_2, ..., x_n are the observations in a sample. Then the variability of the process
sample data is measured by the sample variance,

    s_n^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n},

where x̄ is the sample mean given by \bar{x} = \left( \sum_{i=1}^{n} x_i \right) / n. Note that the sample variance is
simply the sum of the squared deviations of each observation from the sample mean,
divided by the sample size. However, the sample variance so defined is not an unbiased
estimator of the population variance σ². In order to obtain an unbiased estimator for
σ², it is necessary instead to define a ‘bias-corrected sample variance’,

    s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}.
An intuitive way to see why s_n² gives a biased estimator of the population variance is
that the true value of the population mean, μ, is almost never known, and so the sum
of the squared deviations about the sample mean x̄ must be used instead. However,
the observations x_i tend to be closer to their sample mean than to the population mean.
Therefore, to compensate for this, n − 1 is used as the divisor rather than n. If n were used
as the divisor, we would obtain a measure of variability that is, on average, consistently
smaller than the true population variance σ². Another way to think about this is to
consider the sample variance s² as being based on n − 1 degrees of freedom, since one
degree of freedom is lost when the sample mean x̄ is used in place of μ.
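As a concrete illustration of the two divisors (the five measurements below are hypothetical, not taken from the text), both quantities can be computed directly:

    # Hypothetical sample of five measurements (illustrative only).
    x = [10.2, 9.8, 10.5, 10.1, 9.9]
    n = len(x)

    x_bar = sum(x) / n                          # sample mean
    ss = sum((xi - x_bar) ** 2 for xi in x)     # sum of squared deviations

    s2_n = ss / n        # divisor n: biased, tends to understate sigma^2
    s2 = ss / (n - 1)    # divisor n - 1: bias-corrected sample variance

    print(x_bar, s2_n, s2)

For any sample with nonzero spread, s² exceeds s_n², since the same sum of squares is divided by the smaller number n − 1.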
If the individual observations are from the normal distribution, the sample-to-sample
randomness of s² is explained through the following random variable:

    \chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}.                  (6.1)

The random variable χ² follows what is known as the chi-square distribution with
n − 1 degrees of freedom, which is also its mean. Even though the derivation of this
statistic is based on the normality of the x variable, the results will hold approximately
as long as the departure from normality is not too severe.
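A minimal simulation sketch of equation (6.1), assuming NumPy and SciPy are available (the population parameters, sample size, and replicate count are arbitrary choices for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 10.0, 2.0, 8, 100_000

    # Draw many samples of size n from a normal population and form
    # (n - 1) * s^2 / sigma^2 for each sample.
    samples = rng.normal(mu, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)          # bias-corrected sample variance
    chi2_stat = (n - 1) * s2 / sigma**2

    # The mean of a chi-square variable equals its degrees of freedom, n - 1.
    print(chi2_stat.mean(), n - 1)

    # Compare the simulated values with the chi-square(n - 1) distribution.
    print(stats.kstest(chi2_stat, "chi2", args=(n - 1,)))

The simulated mean should be close to n − 1 = 7, and the Kolmogorov-Smirnov statistic should be small.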
6.2.1 The unbiased estimator
The sample mean x̄ and the bias-corrected sample variance s² are unbiased estimators
of the population mean and variance μ and σ², respectively. That is,

    E(\bar{x}) = \mu \quad \text{and} \quad E(s^2) = \sigma^2.
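A sketch of the standard argument behind E(s²) = σ² (this derivation is not spelled out in the text) uses the identity Σ(x_i − x̄)² = Σx_i² − n x̄²:

    E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
      = \sum_{i=1}^{n} E(x_i^2) - n\, E(\bar{x}^2)
      = n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)
      = (n - 1)\,\sigma^2.

Dividing by n − 1 therefore gives E(s²) = σ², while dividing by n gives E(s_n²) = (n − 1)σ²/n, which falls short of σ² by the factor (n − 1)/n.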
If there is no variability in the sample, then each sample observation x_i = x̄, and the
sample variance s² = 0. Generally, the larger the sample variance s², the greater the
variability in the sample data.
While the sample variance provides an unbiased estimator of the population variance,
the positive square root of the sample variance, known as the sample standard
deviation and denoted by s, is a biased estimator of the population standard deviation σ.
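A minimal Monte Carlo sketch of this bias, assuming NumPy (the normal population, sample size, and replicate count are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n, reps = 0.0, 3.0, 5, 200_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)     # bias-corrected sample variance
    s = np.sqrt(s2)                      # sample standard deviation

    print(s2.mean())   # close to sigma^2 = 9: s^2 is unbiased
    print(s.mean())    # noticeably below sigma = 3: s is biased low

The averaging and the square root do not commute: even though E(s²) = σ², Jensen's inequality implies E(s) < σ whenever s² has nonzero sampling variability.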