Page 153 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
August 31, 2006
138 Process Capability Analysis for Non-Normal Data with MINITAB
Least squares is a mathematical optimization technique that attempts to find a
function that closely approximates the given data. It does so by minimizing the sum
of the squares of the errors (also known as residuals) between points generated by the
function and the corresponding points in the data.
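As a minimal sketch of the idea, the following Python fragment fits a straight line to a handful of points by least squares, using the closed-form solution for the slope and intercept. The data values and the function names `least_squares_line` and `sum_squared_residuals` are illustrative assumptions, not taken from the text or from MINITAB.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
sse = sum_squared_residuals(xs, ys, slope, intercept)
print(slope, intercept, sse)
```

Any other line through the same data gives a larger sum of squared residuals, which is what makes the fitted line the least squares solution.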
Maximum likelihood estimation is a statistical method used to make inferences
about the parameters of the underlying probability distribution of a given data set.
Maximum likelihood estimates of the parameters are calculated by maximizing the
likelihood function with respect to the parameters. The likelihood function describes, for
each set of distribution parameters, how probable the observed sample data would be
if the true distribution had those parameters.
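To make this concrete, the sketch below uses the exponential distribution, chosen here only because its maximum likelihood estimate has a simple closed form (the rate equals the reciprocal of the sample mean). The data and the function names are assumptions made for illustration.

```python
import math


def exponential_log_likelihood(rate, data):
    """Log-likelihood of an exponential distribution with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


def exponential_mle(data):
    """Closed-form maximum likelihood estimate of the exponential rate."""
    return len(data) / sum(data)


# Hypothetical data, for illustration only.
data = [0.8, 1.5, 0.3, 2.2, 1.2]

rate_hat = exponential_mle(data)
best = exponential_log_likelihood(rate_hat, data)

# The log-likelihood is lower for rates on either side of the estimate.
lower = exponential_log_likelihood(0.9 * rate_hat, data)
higher = exponential_log_likelihood(1.1 * rate_hat, data)
print(rate_hat, best, lower, higher)
```

Working with the logarithm of the likelihood is a common convenience: it has its maximum at the same parameter values and turns products into sums.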
The Newton–Raphson algorithm can be used to calculate maximum likelihood
estimates of the parameters that define the distribution. It is an iterative method that
locates the maximum of a function by repeatedly refining an estimate of the point at
which the derivative of the function is zero.
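The iteration can be sketched for the Weibull distribution as follows. This is not MINITAB's implementation; it is a minimal illustration that applies Newton–Raphson to the standard profile score equation for the Weibull shape parameter and then recovers the scale from the shape. The starting value, tolerance, data, function names, and the simple safeguard against a non-positive shape are all assumptions.

```python
import math


def weibull_shape_score(k, data):
    """Profile score g(k), whose root is the ML shape, and its derivative g'(k)."""
    n = len(data)
    logs = [math.log(x) for x in data]
    powers = [x ** k for x in data]
    s0 = sum(powers)
    s1 = sum(p * l for p, l in zip(powers, logs))
    s2 = sum(p * l * l for p, l in zip(powers, logs))
    g = s1 / s0 - 1.0 / k - sum(logs) / n
    dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (k * k)
    return g, dg


def weibull_mle_newton(data, k_start=1.0, tol=1e-10, max_iter=100):
    """Return (shape, scale) estimates using Newton-Raphson on the shape."""
    k = k_start
    for _ in range(max_iter):
        g, dg = weibull_shape_score(k, data)
        k_new = k - g / dg          # Newton-Raphson update
        if k_new <= 0:              # safeguard (assumption): keep the shape positive
            k_new = k / 2.0
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    scale = (sum(x ** k for x in data) / len(data)) ** (1.0 / k)
    return k, scale


# Hypothetical positive data, for illustration only.
data = [12.5, 8.1, 15.3, 10.2, 9.7, 13.8, 11.4, 7.6]

shape_hat, scale_hat = weibull_mle_newton(data)
print(shape_hat, scale_hat)
```

Each pass replaces the current shape estimate with the zero of the tangent line to the score function, so the steps shrink rapidly once the estimate is close to the solution.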
10.2.2.3 Selecting the best-fit distribution
Selection of the best-fit distribution can be done qualitatively (by seeing how
well the data points fit the straight line in the probability plot), quantitatively (using
goodness-of-fit statistics), or by a combination of the two. Most statistical programs
provide the plots and the statistics together.
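The quantitative route can be illustrated with a short sketch that fits two candidate distributions to the same sample and compares a goodness-of-fit statistic for each. The statistic computed here is the standard unadjusted Anderson–Darling statistic; it is not the adjusted version reported by MINITAB, and no p-values are computed. The sample, the two candidate distributions, and all names are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, pstdev


def anderson_darling(data, cdf):
    """Unadjusted Anderson-Darling statistic A^2 for a given fitted CDF."""
    ordered = sorted(data)
    n = len(ordered)
    total = 0.0
    for i, x in enumerate(ordered, start=1):
        lower = cdf(x)                  # F at the i-th smallest value
        upper = cdf(ordered[n - i])     # F at the i-th largest value
        total += (2 * i - 1) * (math.log(lower) + math.log(1.0 - upper))
    return -n - total / n


# Hypothetical sample, for illustration only.
sample = [9.8, 10.1, 10.4, 9.6, 10.0, 10.3, 9.9, 10.2, 9.7, 10.5]

# Candidate 1: normal distribution with maximum likelihood estimates.
normal_fit = NormalDist(mean(sample), pstdev(sample))

# Candidate 2: exponential distribution with maximum likelihood rate.
rate = 1.0 / mean(sample)


def exponential_cdf(x):
    return 1.0 - math.exp(-rate * x)


results = {
    "normal": anderson_darling(sample, normal_fit.cdf),
    "exponential": anderson_darling(sample, exponential_cdf),
}
best_fit = min(results, key=results.get)
print(results, best_fit)
```

A smaller value of the statistic indicates that the fitted distribution follows the data more closely, so the candidate with the smallest value is the preferred one under this criterion.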
Probability plot
The probability plot is a graphical technique for assessing whether or not a data set
follows a given distribution such as the normal or Weibull [2]. The data are plotted
against a theoretical distribution in such a way that the points should approximately
form a straight line. Departures from this straight line indicate departures from the
specified distribution.
The probability plot provided by MINITAB includes the following:
• plotted points, which are the estimated percentiles for corresponding probabilities
  of an ordered data set;
• fitted line, which is the expected percentile from the distribution based on maximum
  likelihood parameter estimates;
• confidence intervals, which are the confidence intervals for the percentiles.
Because the plotted points do not depend on any distribution, they are the same
(before being transformed) for any probability plot made. The fitted line, however,
differs depending on the parametric distribution chosen. So you can use a probability
plot to assess whether a particular distribution fits your data. In general, the closer
the points fall to the fitted line, the better the fit.
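The construction of the plotted points can be sketched as below for a normal probability plot. The plotting positions use the median rank approximation (i - 0.3)/(n + 0.4); this choice, the data, and the function name are assumptions for illustration and may differ from the settings used in a given MINITAB session.

```python
from statistics import NormalDist


def normal_probability_plot_points(data):
    """Return (value, probability, z-score) for each ordered observation."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.3) / (n + 0.4)       # median rank plotting position (assumption)
        z = std_normal.inv_cdf(p)       # transform so normal data fall on a line
        points.append((x, p, z))
    return points


# Hypothetical data, for illustration only.
data = [10.2, 9.6, 10.9, 9.9, 10.4]

points = normal_probability_plot_points(data)
for value, prob, z in points:
    print(value, round(prob, 4), round(z, 4))
```

The probabilities depend only on the rank of each observation, not on the distribution being assessed; only the transformation applied to them, and hence the fitted line, changes with the chosen distribution.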
Anderson–Darling statistic
The Anderson–Darling statistic was mentioned in Section 10.2.1.2. Note that for a
given distribution, the Anderson–Darling statistic may be multiplied by a constant
(which usually depends on the sample size, n). This is the ‘adjusted Anderson–Darling’
statistic that MINITAB uses. The p-values are based on the table given by