Page 199 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
August 31, 2006
Introduction to the Analysis of Categorical Data
Table 12.8 Logistic regression table for simple model with one explanatory variable.

                                                                                  95% Confidence Limits
Predictor    Coefficient               SE Coef*   z-statistic   p-value   Odds ratio   Lower    Upper
Constant     $\hat{\alpha} = -19.43$   6.09       -3.19         0.001
x_Width      $\hat{\beta} = 0.77$      0.23        3.25         0.001     2.15         1.36     3.42

* Standard error of coefficient estimates.
where $x_{\text{Width}}$ is the carapace width of a female horseshoe crab and $\pi_{\text{Pres}}(\cdot)$ is the probability of finding a satellite crab nearby.
MINITAB offers the facility to analyze such simple logistic regression models, and
the output from this analysis is reproduced in Table 12.8. The maximum likelihood
estimates (MLEs) of the coefficients are given in the output as $\hat{\alpha} = -19.43$ and $\hat{\beta} = 0.77$.
From Table 12.8, these coefficient estimates are found to be significant by looking at
the z statistics and p-values. The z statistics of the estimates are evaluated from:
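$$z_\alpha = \frac{\hat{\alpha}}{\mathrm{ASE}(\hat{\alpha})} \quad \text{and} \quad z_\beta = \frac{\hat{\beta}}{\mathrm{ASE}(\hat{\beta})}$$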
where $\mathrm{ASE}(\hat{\alpha})$ is the asymptotic standard error of $\hat{\alpha}$, the MLE of the parameter $\alpha$, and
similarly for $\hat{\beta}$.
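As a quick check of the z statistics defined above, the following minimal Python sketch divides each coefficient by its asymptotic standard error and converts the result to a two-sided p-value under a standard normal reference distribution. The inputs are the rounded values printed in Table 12.8; the function names `wald_z` and `two_sided_normal_p` are illustrative and are not part of MINITAB.

```python
import math

def wald_z(estimate, ase):
    """Return the z statistic: the estimate divided by its asymptotic standard error."""
    return estimate / ase

def two_sided_normal_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Rounded values as printed in Table 12.8 (assumption: no extra digits available).
alpha_hat, ase_alpha = -19.43, 6.09
beta_hat, ase_beta = 0.77, 0.23

z_alpha = wald_z(alpha_hat, ase_alpha)
z_beta = wald_z(beta_hat, ase_beta)
p_alpha = two_sided_normal_p(z_alpha)
p_beta = two_sided_normal_p(z_beta)

print(f"z_alpha = {z_alpha:.2f}, p = {p_alpha:.3f}")
print(f"z_beta  = {z_beta:.2f}, p = {p_beta:.3f}")
```

With these rounded inputs, z for the constant agrees with the tabulated −3.19, while z for the width coefficient comes out near 3.35 rather than the tabulated 3.25, presumably because the table reports the coefficient and its standard error to only two decimals. Both p-values round to 0.001, as in the table.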
The hypothesis test conducted for these estimates is based on an asymptotic normal
distribution at the 5% level of significance. The formal hypothesis test for the
significance of any parameter $\gamma$ in the simple logistic regression model is stated as
follows:
$$H_0: \gamma = 0 \quad \text{vs.} \quad H_1: \gamma \neq 0$$
Such a hypothesis test can also be conducted via the $z^2$ statistic (or Wald statistic),
which follows a large-sample $\chi^2$ distribution with 1 degree of freedom. Alternatively,
a likelihood ratio test can be conducted. The likelihood ratio test utilizes the ratio of
two maximized likelihoods: that of a reduced model without any explanatory
variables ($l_0$) and that of the full model ($l_1$) postulated by the alternative
hypothesis. For a simple logistic regression model with only two parameters, $\alpha$ and
$\beta$, $l_0$ denotes the maximized likelihood when $\beta = 0$ and $l_1$ denotes the maximized
likelihood with both $\alpha$ and $\beta$ present in the model. The likelihood ratio statistic
can be expressed as
$$G^2 = -2 \ln\left(\frac{l_0}{l_1}\right) = -2\,(\ln l_0 - \ln l_1). \qquad (12.16)$$
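The two large-sample tests described above can be sketched in a few lines of Python. This is a minimal illustration, not MINITAB's implementation. It assumes the maximized log likelihoods quoted for the horseshoe crab study in this section (−24.618 for the full model and −33.203 for the reduced model), the tabulated z statistic of 3.25 for the width coefficient, and a chi-square reference distribution with 1 degree of freedom, since the model has one explanatory variable. The helper names are hypothetical.

```python
import math

def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """G^2 = -2 ln(l0 / l1) = -2 (ln l0 - ln l1), computed from maximized log likelihoods."""
    return -2.0 * (loglik_reduced - loglik_full)

def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

# Maximized log likelihoods quoted in the text for the horseshoe crab study.
loglik_full = -24.618      # ln l1: both alpha and beta in the model
loglik_reduced = -33.203   # ln l0: beta fixed at 0

g2 = likelihood_ratio_statistic(loglik_reduced, loglik_full)
p_lr = chi2_upper_tail_1df(g2)

# Wald statistic: the square of the z statistic for the width coefficient (Table 12.8).
z_beta = 3.25
wald = z_beta ** 2
p_wald = chi2_upper_tail_1df(wald)

print(f"G^2  = {g2:.3f}, p = {p_lr:.2e}")
print(f"Wald = {wald:.3f}, p = {p_wald:.2e}")
```

Working with the difference of log likelihoods rather than the ratio of likelihoods avoids numerical underflow when the likelihoods themselves are very small. Both statistics are referred to the same 1-degree-of-freedom chi-square distribution here.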
Under the null hypothesis, the likelihood ratio statistic for the logistic regression
model with one explanatory variable follows a large-sample $\chi^2$ distribution with
$p - 1$ degrees of freedom, where $p$ is the number of parameters estimated. For the
horseshoe crab study, the maximized log likelihoods for the full and reduced models
are −24.618 and −33.203, respectively. This gives a $G^2$ statistic of 17.170. This is much