Page 56 - Clinical Small Animal Internal Medicine

24  Section 1  Evaluation and Management of the Patient

was doomed to die from the illness regardless of treatment, and the other would have inevitably recovered even if treatment had been withheld; thus the correct (but unknown) interpretation would be that treatment could not have any effect on survival. Clearly, the two groups, composed of one individual each, are not comparable in the sense that their outcomes (death versus survival) would differ even if both received the placebo. But even if treatment had been randomized, the study would invariably remain confounded, leading to incorrect conclusions. If the doomed individual received the treatment, then without knowing the individual was doomed the inference would erroneously be that treatment resulted in the patient’s death. Conversely, if the recovered individual received the placebo, the incorrect conclusion would be that treatment prompted a recovery.

This example underscores an important characteristic and drawback of randomization: its effectiveness in establishing group comparability increases with the study size. In small studies, the possibility of group imbalances in characteristics or risk factors that could influence health outcomes cannot be discounted, but fortunately it declines as study size increases. Sample size calculations therefore take on importance when undertaking a randomized study, and would be expected in any study proposal seeking approval from an institutional review board.

Sample size estimation for randomized studies is not trivial, however, and it is a fallacy to believe that there is a universal sample size that is adequate for all clinical trials. For example, a sample size calculation to determine whether two group means differ significantly requires an investigator to specify a priori the type I (α) and type II (β) error rates (the latter is one minus the “study power”), whether the test will be one‐ or two‐tailed, the standard deviations of the outcome measurements in both groups, and the minimum difference in means the investigator believes is medically important enough to warrant a finding of statistical significance (the null hypothesis is that the means are equal, or equivalently that the difference in means is zero). If instead the goal is to determine whether the proportions of individuals in two groups that experience an outcome differ significantly, then the two expected proportions that are medically important enough to find significantly different must be specified instead (in addition to α and β). Issues related to sampling, sample size, and power can become complex [3], and separate calculations for these would ideally be undertaken for every independent hypothesis test in a study, with the necessary sample size being the largest of all the ones calculated. Once a study is completed, however, there is little value or justification in performing post hoc sample size or power calculations because the sample size has already been determined, and for any statistical test performed there was either sufficient power (i.e., p ≤ α) or not (i.e., p > α) for a particular significance test.

The above examples present clinical trial outcomes as means and proportions. However, it is not uncommon in clinical trials to instead use interval‐scaled ordinal variables, such as pain or severity indices, when quantitative measurements are unavailable. These variables generally require nonparametric statistical analyses for group comparisons, as well as for sample size and power calculations, rather than ones that assume that the group outcomes follow a normal (Gaussian) distribution. Nonparametric methods are invaluable for clinical research, particularly when sample sizes are small and underlying distributional assumptions are tenuous (as with interval‐scaled data), and are discussed in more detail in Chapter 2.

Cross‐over Trials

In studies of medical interventions with transient effects, there can be a distinct advantage in using each patient as her or his own control. Earlier, when groups composed of randomized individuals were compared, there was always an assumption that they were similar with respect to the distribution of unknown or unmeasurable variables (confounders) that could influence the outcome. This assumption becomes tenuous when group‐specific sample sizes are small or there is so much variability in these factors that the groups would still not be comparable. In this case, sequentially administering different treatments to the same individual allows them to “cross over” between exposure states, permitting within‐individual comparisons and controlling for confounding by intrinsic patient characteristics.

This approach requires two important assumptions. One is “temporal stability”: that time itself is not a determinant of the study outcome. It implies that over the study period, the incidence of the outcome remains constant in the absence of any treatments. The second is “causal transience”: that the order in which treatments are administered (e.g., an experimental drug versus placebo) is irrelevant because there are no carry‐over effects of either treatment (i.e., the effects of both are transient), which implies an eventual return to an original state. To make this assumption more tenable, a “wash‐out” period is typically included between treatments, sufficiently long to ensure that any effects have dissipated. Measurements are characteristically taken at multiple times under the different treatments, allowing the assessment of time effects, treatment effects, and the interaction between these two main effects. It is also possible to extend such studies to evaluate the effects of subgroups of different individuals, such as those defined by age and sex categories.
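
The within‐individual comparison that a cross‐over design permits can be illustrated with a minimal sketch. The text does not prescribe a particular analysis; the exact sign test used here is an assumed choice, picked because it is nonparametric and suits ordinal scores in small samples. The function name `sign_test` and the pain scores are hypothetical, and the sketch takes temporal stability and the absence of carry‐over effects for granted (it does not model period effects).

```python
from math import comb


def sign_test(control_scores, treatment_scores):
    """Exact two-sided sign test on paired (within-individual) scores.

    Each position holds one patient's score under the control condition
    and under the treatment condition. Returns the number of patients
    whose score rose, the number whose score fell, and the p-value.
    """
    if len(control_scores) != len(treatment_scores):
        raise ValueError("each patient needs one score per condition")

    # Within-individual differences: treatment minus control.
    diffs = [t - c for c, t in zip(control_scores, treatment_scores)]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg  # ties (zero differences) carry no information

    if n == 0:
        return n_pos, n_neg, 1.0

    # Under the null hypothesis each non-tied difference is equally
    # likely to be positive or negative: Binomial(n, 0.5).
    k = min(n_pos, n_neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n_neg, min(1.0, 2 * tail)


# Hypothetical pain scores for eight patients, each measured under
# placebo and under the experimental drug (illustrative values only).
placebo = [6, 7, 5, 8, 6, 7, 5, 6]
drug = [4, 5, 5, 6, 3, 6, 4, 7]
rose, fell, p_value = sign_test(placebo, drug)
```

Because every patient contributes a difference against her or his own control score, intrinsic patient characteristics cancel out of each comparison, which is the advantage described above.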
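
Returning to the inputs listed for sample size estimation (α, β, one‐ or two‐tailed testing, the group standard deviations, and the minimum important difference), the following is a minimal sketch of how they combine. It uses the standard normal‐approximation formulas for two equally sized groups, which is an assumption rather than a method given in the text; the function names and example values are hypothetical, and exact or t‐based methods would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def _z_sum(alpha, power, two_tailed):
    """Sum of the critical values for the type I and type II error rates."""
    z = NormalDist()  # standard normal distribution
    tail = alpha / 2 if two_tailed else alpha
    # power = 1 - beta, so inv_cdf(power) is the z-value for beta.
    return z.inv_cdf(1 - tail) + z.inv_cdf(power)


def n_per_group_means(delta, sd1, sd2, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate size of each group needed to detect a difference in means.

    delta is the minimum difference in means judged medically important;
    sd1 and sd2 are the anticipated standard deviations in the two groups.
    """
    if delta == 0:
        raise ValueError("the minimum important difference must be non-zero")
    k = _z_sum(alpha, power, two_tailed)
    return ceil(k ** 2 * (sd1 ** 2 + sd2 ** 2) / delta ** 2)


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80, two_tailed=True):
    """Approximate size of each group needed to detect p1 versus p2."""
    if p1 == p2:
        raise ValueError("the two expected proportions must differ")
    k = _z_sum(alpha, power, two_tailed)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(k ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical study with two independent hypothesis tests: the required
# size is the largest of the separately calculated values.
n_means = n_per_group_means(delta=0.5, sd1=1.0, sd2=1.0)
n_props = n_per_group_proportions(p1=0.5, p2=0.3)
n_required = max(n_means, n_props)
```

Changing any single input (for example, halving the minimum important difference or switching to a one‐tailed test) changes the answer, which illustrates why no universal sample size can serve all clinical trials.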