Page 193 - Six Sigma Advanced Tools for Black Belts and Master Black Belts
August 31, 2006
Introduction to the Analysis of Categorical Data
Table 12.5 Statistics on sectoral employment of RSEs by level of qualifications.

Level of qualifications   Employment in    Employment in    Percentage employed in    Total
of RSEs                   private sector   public sector    private sector as RSE
PhD                                  781           3 282    21.5%                     4 063
  X²                               1 171           1 851
  R adj                              −62              62
Master’s                           2 831           2 073    25.9%                     4 904
  X²                                  10              16
  R adj                               −6               6
Bachelor’s                         7 984           1 984    52.6%                     9 968
  X²                                 579             914
  R adj                               56             −56
Percentage of RSEs in              61.2%           38.8%
each sector
Total                             11 596           7 339                             18 935

Both the sample percentages and the adjusted residuals suggest a negative
correlation between qualification level and the probability of employment in the
private sector as an RSE.
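
The X² and R adj rows of Table 12.5 can be reproduced from the observed counts alone. The following is a minimal sketch, assuming the usual definitions: expected count = row total × column total / n, cell contribution X² = (observed − expected)² / expected, and adjusted residual = (observed − expected) / sqrt(expected × (1 − row proportion) × (1 − column proportion)). The names `observed` and `cell_statistics` are illustrative and do not come from the text.

```python
from math import sqrt

# Observed counts from Table 12.5 as (private sector, public sector).
observed = {
    "PhD": (781, 3282),
    "Master's": (2831, 2073),
    "Bachelor's": (7984, 1984),
}


def cell_statistics(table):
    """Return {level: [(x2, adjusted_residual), ...]} for each cell.

    Hypothetical helper: assumes the standard chi-square cell contribution
    and the standard adjusted (standardized Pearson) residual.
    """
    n = sum(sum(row) for row in table.values())
    col_totals = [sum(row[j] for row in table.values()) for j in range(2)]
    results = {}
    for level, row in table.items():
        row_total = sum(row)
        cells = []
        for j, count in enumerate(row):
            expected = row_total * col_totals[j] / n
            diff = count - expected
            x2 = diff ** 2 / expected
            adj = diff / sqrt(
                expected * (1 - row_total / n) * (1 - col_totals[j] / n)
            )
            cells.append((x2, adj))
        results[level] = cells
    return results


stats = cell_statistics(observed)
for level, cells in stats.items():
    for sector, (x2, adj) in zip(("private", "public"), cells):
        print(f"{level:11s} {sector:8s} X2 = {x2:6.0f}   R adj = {adj:4.0f}")
```

Rounded to whole numbers, the printed values should agree with the table: a large negative residual for PhD holders in the private sector and a large positive one for Bachelor’s holders.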
As the M² statistic is more sensitive to departures from the null hypothesis when
ordinal information in the categorical data is accounted for, it is used here to
statistically assess the presence of relationships between the response and
explanatory variables.
The ordinal M² statistic requires scores for each level of the variables. In this case,
arbitrary equally spaced scores are assumed for each level of the variables. For the
explanatory variable, the level of qualification, a score of v₁ = 1 is assumed for
RSEs with Bachelor’s qualifications, v₂ = 2 for RSEs with Master’s qualifications and
v₃ = 3 for RSEs with PhD qualifications. Similarly, arbitrary equally spaced scores are
assumed for the two types of sectoral employment. Since the response variable has only
two levels, RSEs in private sector employment are assumed to be represented by
a score of u₁ = 1 and those in public sector employment are assumed to have a score of
u₂ = 0. Using this scoring system, the sample Pearson product-moment correlation,
r, is found to be −0.485 and the corresponding M² statistic is found to be 4446. Since
this is very much higher than the critical value of the χ² statistic having 1 degree of
freedom at any reasonable significance level, the null hypothesis of independence is
rejected with very strong evidence.
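
The calculation above can be checked with a short script. This is a minimal sketch, assuming the standard linear trend form M² = (n − 1)r², where r is the Pearson correlation between the row and column scores weighted by the cell counts; the formula is not shown on this page, and the names `counts` and `linear_trend_statistic` are illustrative.

```python
from math import sqrt

# Cell counts from Table 12.5, keyed by (qualification score v, sector score u).
# Scores follow the text: Bachelor's = 1, Master's = 2, PhD = 3;
# private sector = 1, public sector = 0.
counts = {
    (1, 1): 7984, (1, 0): 1984,
    (2, 1): 2831, (2, 0): 2073,
    (3, 1): 781,  (3, 0): 3282,
}


def linear_trend_statistic(cells):
    """Return (r, M2) for a table of scored cells.

    Hypothetical helper: assumes M2 = (n - 1) * r**2.
    """
    n = sum(cells.values())
    mean_v = sum(v * c for (v, _), c in cells.items()) / n
    mean_u = sum(u * c for (_, u), c in cells.items()) / n
    s_vu = sum((v - mean_v) * (u - mean_u) * c for (v, u), c in cells.items())
    s_vv = sum((v - mean_v) ** 2 * c for (v, _), c in cells.items())
    s_uu = sum((u - mean_u) ** 2 * c for (_, u), c in cells.items())
    r = s_vu / sqrt(s_vv * s_uu)
    return r, (n - 1) * r ** 2


r, m2 = linear_trend_statistic(counts)
print(f"r = {r:.3f}, M2 = {m2:.0f}")

# Chi-square critical value with 1 degree of freedom at the 0.001 level.
CRITICAL_0_001 = 10.83
print("reject independence:", m2 > CRITICAL_0_001)
```

Because r is invariant to linear rescaling of the scores, any other equally spaced scoring (for example 0, 1, 2 for the qualification levels) would give the same M² value.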
12.3.1 Sample proportions, relative risks, and odds ratio
Further statistical inference can be made on the degree of association between binary
variables in two-way contingency tables by comparing differences in the proportions
of total counts falling in each cell. Apart from this proportion of counts, two other
useful measures of association for two-way contingency tables are the relative risk