Page 41 - Clinical Small Animal Internal Medicine

2  Statistical Interpretation for Practitioners

  First, it would be an error to state that there was “no difference” between the groups. Clearly, there was a difference (of 40 U/L). It would, however, be correct to state that because the null hypothesis of no group difference was not rejected, there was no significant difference between the groups, provided the assumptions of the statistical model hold (because P = 0.10 > α = 0.05).

  Second, the P‐value does not provide a quantitative measure of the probabilities that the null or alternative hypotheses are correct. Conventional (superiority) hypothesis testing is predicated on the veracity of the null hypothesis, and so does not provide any assessment of its truth. Instead, the P‐value addresses an entirely different question: how likely (probabilistically speaking) is it that one would observe differences at least as large as the one found in the study (40 U/L) when the null hypothesis is true? In other words, instead of the P‐value equaling the probability that the null hypothesis is true given the data observed in the study, it provides the probability of observing the difference in the data observed (or more extreme) given the null hypothesis is true. It follows that “large” P‐values indicate substantial concordance with the null hypothesis, while “small” P‐values indicate poor concordance with the null hypothesis (presumably motivating its rejection in favor of its alternative).

  Third, a P‐value is only numerically correct under a particular statistical model. With the Student’s t‐test example, the underlying model assumes that the ALT values in each group are approximately normally distributed. If the assumption of normality is violated, however, the P‐value will no longer be correct; the greater the departure of the data distribution from normality, the more incorrect the P‐value will be.

  Fourth, another assumption is that the study data are independent, meaning that knowing one individual’s ALT value does not help predict another individual’s ALT value. In this example, such an assumption is reasonable when each individual contributes only one ALT value. However, when replicates from a single individual are included, it is plausible to assume that knowing an individual’s value at one time can better predict the same individual’s value at another time than the value from a different individual. Such a violation of the data independence assumption leads to the estimation of an incorrect P‐value; typically, the use of correlated (nonindependent or dependent) data underestimates P‐values, and hence more type I statistical errors (improperly rejecting the null hypothesis) arise.

  Fifth, it is important to recognize that statistical analysis is more than the analysis of single numbers (i.e., point estimates, as in averages) in groups; instead, it is more correctly described as the analysis of variability. For this reason, the presence or absence of statistical significance is more than just reflective of the magnitude of differences or associations and the chosen level of significance; it is also a function of the variance of the point estimates. Because such variances are inversely proportional to sample sizes, two studies with identical differences or associations can have completely different P‐values: the smaller study’s differences could be nonsignificant, while the larger study’s differences may be significant.

  Sixth, it directly follows that any group differences or associations can eventually be made statistically significant if the study size becomes sufficiently large. To illustrate this, two random samples of 25 individuals each were created, one assuming blood hemoglobin was normally distributed with a mean of 15 g/dL and standard deviation of 2 g/dL, and the other with a mean and standard deviation of 15.1 and 2 g/dL, respectively. No one would seriously argue that a hemoglobin difference of 0.1 g/dL is clinically important, and indeed it is not significant at α = 0.05 (P = 0.79). However, if the two groups were constructed to have 2500 individuals each, with the same means and standard deviations, this same clinically unimportant difference (0.1 g/dL) becomes statistically significant (P = 0.023). This underscores an important distinguishing principle of statistical analysis: statistical significance is distinctly different from and does not imply medical importance. In a large enough study, even trivial and unimportant differences can become statistically significant; in a small study, differences that appear to be worthy of medical pursuit may be statistically nonsignificant. In recognition of this principle, alternative methods of hypothesis testing have been developed that, instead of examining superiority of one group over another, evaluate whether groups are noninferior or equivalent based on a determination of what constitutes an acceptably important difference or association [7].

  Finally, each decision to reject or not reject a null hypothesis following hypothesis testing is prone to error. What is often unappreciated is that the more tests that are performed, the greater the probability that at least one decision will be incorrect. In a clinical setting, this is perhaps most manifest when performing clinical laboratory testing panels to screen for hematologic or chemical abnormalities in blood. Reference ranges for blood parameters are typically constructed to encompass approximately 95% of normal animals, implying that 5% of normal animals will have values falling outside the ranges.

  When a reference range is appropriate for a patient’s age, sex, and any other factors that can influence a particular blood parameter, it is reassuring that a normal animal’s value will fall within the reference range 95% of the time. However, a universally accepted practice is to run laboratory panels simultaneously evaluating many parameters. Suppose, for example, that a blood chemistry