Page 153 - Six Sigma Advanced Tools for Black Belts and Master Black Belts

          August 31, 2006
        138       Process Capability Analysis for Non-Normal Data with MINITAB
          Least squares is a mathematical optimization technique that finds the function
        that most closely approximates the given data. It does so by minimizing the sum
        of the squared errors (also known as residuals) between the points generated by the
        function and the corresponding points in the data.
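
As an illustration of the least squares idea, the following minimal sketch fits a straight line y = a + b*x to a handful of points by minimizing the sum of squared residuals. The data values and the function names are made up for this example; they are not taken from the text or from MINITAB.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimizes the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared differences between observed y and fitted a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
sse = sum_squared_residuals(xs, ys, a, b)
print(f"intercept={a:.4f}, slope={b:.4f}, SSE={sse:.4f}")
```

Any other choice of intercept and slope gives a larger sum of squared residuals for the same data, which is what "closest approximation" means here.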
          Maximum likelihood estimation is a statistical method used to make inferences
        about the parameters of the underlying probability distribution of a given data set.
        Maximum likelihood estimates of the parameters are calculated by maximizing the
        likelihood function with respect to the parameters. The likelihood function describes,
        for each set of distribution parameters, how probable the observed sample data are
        if the true distribution has those parameters.
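
To make the idea concrete, here is a small sketch for the exponential distribution, chosen only because its likelihood is easy to write down. The data are hypothetical. For this distribution the rate that maximizes the log-likelihood is known in closed form (the reciprocal of the sample mean), and a coarse grid search over candidate rates lands next to the same value.

```python
import math


def exp_log_likelihood(rate, data):
    """Log-likelihood of an exponential distribution with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


# Hypothetical sample, for illustration only.
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.7]

# Closed-form maximum likelihood estimate for the exponential rate.
rate_hat = len(data) / sum(data)

# Brute-force check: evaluate the log-likelihood on a grid of candidate rates.
grid = [0.01 * i for i in range(1, 301)]
best_on_grid = max(grid, key=lambda r: exp_log_likelihood(r, data))

print(f"closed-form estimate: {rate_hat:.4f}")
print(f"best rate on grid:    {best_on_grid:.2f}")
```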
          The Newton-Raphson algorithm can be used to calculate maximum likelihood
        estimates of the parameters that define the distribution. It is an iterative method
        that locates the maximum of a function by repeatedly refining an estimate of the
        point where the function's derivative is zero.
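
The sketch below shows one way Newton-Raphson can be applied to the Weibull distribution. It assumes the standard two-parameter Weibull, for which the maximum likelihood shape k is the root of g(k) = sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x), and the scale then follows from the shape. The data, starting value, tolerance, and function names are assumptions made for this example; this is not a description of MINITAB's internal routine.

```python
import math


def weibull_shape_score(k, data):
    """Profile score g(k); its root is the maximum likelihood shape estimate."""
    logs = [math.log(x) for x in data]
    powers = [x ** k for x in data]
    s0 = sum(powers)
    s1 = sum(p * l for p, l in zip(powers, logs))
    return s1 / s0 - 1.0 / k - sum(logs) / len(data)


def weibull_shape_score_derivative(k, data):
    """Derivative of g(k) with respect to k."""
    logs = [math.log(x) for x in data]
    powers = [x ** k for x in data]
    s0 = sum(powers)
    s1 = sum(p * l for p, l in zip(powers, logs))
    s2 = sum(p * l * l for p, l in zip(powers, logs))
    return (s2 * s0 - s1 ** 2) / s0 ** 2 + 1.0 / k ** 2


def weibull_mle_newton(data, k0=1.0, tol=1e-10, max_iter=100):
    """Return (shape, scale) estimates using Newton-Raphson on the shape."""
    k = k0
    for _ in range(max_iter):
        k_new = k - weibull_shape_score(k, data) / weibull_shape_score_derivative(k, data)
        if k_new <= 0:          # safeguard: keep the shape positive
            k_new = k / 2.0
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    scale = (sum(x ** k for x in data) / len(data)) ** (1.0 / k)
    return k, scale


# Hypothetical positive-valued sample, for illustration only.
data = [1.2, 0.8, 1.5, 2.3, 0.9, 1.7, 1.1, 2.0]
shape, scale = weibull_mle_newton(data)
print(f"shape={shape:.4f}, scale={scale:.4f}")
```

Each pass replaces the current shape estimate with the point where the tangent line to g crosses zero, so the loop stops once successive estimates agree to within the tolerance.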


        10.2.2.3 Selecting the best-fit distribution
        Selection of the best-fit distribution can be done qualitatively (by seeing how
        well the data points fit the straight line in the probability plot), quantitatively (using
        goodness-of-fit statistics), or by a combination of the two. Most statistical programs
        provide the plots and the statistics together.
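
As a rough sketch of the quantitative route, the code below computes the unadjusted Anderson-Darling statistic for two candidate distributions fitted to the same hypothetical sample: an exponential distribution (rate set from the sample mean) and a normal distribution (sample mean and standard deviation). A smaller value indicates a closer fit. The data, the choice of candidates, and the function names are assumptions for illustration; MINITAB reports an adjusted version of the statistic, which is not reproduced here.

```python
import math
import statistics


def anderson_darling(data, cdf):
    """Unadjusted Anderson-Darling statistic for data against a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        # xs[n - i] is the (n + 1 - i)-th smallest observation.
        total += (2 * i - 1) * (math.log(cdf(x)) + math.log(1.0 - cdf(xs[n - i])))
    return -n - total / n


# Hypothetical sample, for illustration only.
data = [1.2, 0.8, 1.5, 2.3, 0.9, 1.7, 1.1, 2.0]
mean = statistics.mean(data)
sd = statistics.stdev(data)


def exponential_cdf(x):
    return 1.0 - math.exp(-x / mean)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


ad_exponential = anderson_darling(data, exponential_cdf)
ad_normal = anderson_darling(data, normal_cdf)
print(f"exponential: {ad_exponential:.3f}")
print(f"normal:      {ad_normal:.3f}")
```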

        Probability plot
        The probability plot is a graphical technique for assessing whether or not a data set
        follows a given distribution such as the normal or Weibull. The data are plotted
        against a theoretical distribution in such a way that the points should approximately
        form a straight line. Departures from this straight line indicate departures from the
        specified distribution.
          The probability plot provided by MINITAB includes the following:


          - plotted points, which are the estimated percentiles for corresponding probabilities
            of an ordered data set;
          - fitted line, which is the expected percentile from the distribution based on maximum
            likelihood parameter estimates;
          - confidence intervals, which are the confidence intervals for the percentiles.
          Because the plotted points do not depend on any distribution, they are the same
        (before being transformed) for any probability plot made. The fitted line, however,
        differs depending on the parametric distribution chosen. So you can use a probability
        plot to assess whether a particular distribution fits your data. In general, the closer
        the points fall to the fitted line, the better the fit.
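
The construction of the plotted points can be sketched as follows for a Weibull probability plot. The sketch assumes Benard's median-rank approximation, (i - 0.3) / (n + 0.4), for the plotting probabilities, and uses the usual Weibull axis transformation, under which Weibull data fall roughly on a straight line. The correlation of the transformed points is used here only as a simple numerical stand-in for "how close to a straight line"; the data and function names are hypothetical.

```python
import math


def median_rank_positions(n):
    """Plotting probabilities for ordered observations 1..n (Benard's approximation)."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_plot_coordinates(data):
    """Transformed (x, y) points; Weibull data should lie near a straight line."""
    xs = sorted(data)
    ps = median_rank_positions(len(xs))
    return [(math.log(x), math.log(-math.log(1.0 - p))) for x, p in zip(xs, ps)]


def pearson_r(points):
    """Pearson correlation of the plotted points, as a measure of straightness."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample, for illustration only.
data = [1.2, 0.8, 1.5, 2.3, 0.9, 1.7, 1.1, 2.0]
points = weibull_plot_coordinates(data)
r = pearson_r(points)
print(f"correlation of plotted points: {r:.4f}")
```

The plotting probabilities depend only on the sample size and the rank of each observation, which is why the untransformed points are the same whichever distribution is being assessed.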


        Anderson-Darling statistic
        The Anderson-Darling statistic was mentioned in Section 10.2.1.2. Note that for a
        given distribution, the Anderson-Darling statistic may be multiplied by a constant
        (which usually depends on the sample size, n). This is the ‘adjusted Anderson-Darling’
        statistic that MINITAB uses. The p-values are based on the table given by