Page 56 - Clinical Small Animal Internal Medicine
24 Section 1 Evaluation and Management of the Patient
was doomed to die from the illness regardless of treatment, and the other would have inevitably recovered even if treatment had been withheld; thus the correct (but unknown) interpretation would be that treatment could not have any effect on survival. Clearly, the two groups, composed of one individual each, are not comparable in the sense that their outcomes (death versus survival) would differ even if both received the placebo. Even if treatment had been randomized, the study would remain invariably confounded, leading to incorrect conclusions. If the doomed individual received the treatment, then without knowing the individual was doomed the inference would erroneously be that treatment resulted in the patient’s death. Conversely, if the individual destined to recover received the treatment, the incorrect conclusion would be that treatment prompted the recovery.

This example underscores an important characteristic and drawback of randomization: its effectiveness in establishing group comparability increases with study size. In small studies, the possibility of group imbalances of characteristics or risk factors that could influence health outcomes cannot be discounted, but fortunately declines as study size increases. Sample size calculations therefore take on importance when undertaking a randomized study, and would be expected in any study proposal seeking approval from an institutional review board.

Sample size estimation for randomized studies is not trivial, however, and it is a fallacy to believe that there is a universal sample size that is adequate for all clinical trials. For example, a sample size calculation to determine whether two group means significantly differ requires an investigator to know a priori the type I (α) and type II (β) error percentages (the latter is one minus the “study power”), whether the test will be one‐ or two‐tailed, the standard deviations of the outcome measurements in both groups, and the minimum difference in means the investigator believes is medically important enough to warrant a finding of statistical significance (the null hypothesis is that the means are equal, or equivalently the difference in means is zero). If instead the goal is to determine whether the proportions of individuals in two groups that experience an outcome significantly differ, then the two expected proportions that are medically important enough to find significantly different must alternatively be specified (in addition to α and β). Issues related to sampling, sample size, and power can become complex [3], and separate calculations for these would ideally be undertaken for every independent hypothesis test in a study, with the necessary sample size being the largest of all the ones calculated. Once a study is completed, however, there is little value or justification in performing post hoc sample size or power calculations because the sample size has already been determined, and for any statistical test performed there was either sufficient power (i.e., p ≤ α) or not (i.e., p > α) for a particular significance test.

The above examples present clinical trial outcomes as means and proportions. However, it is not uncommon in clinical trials to instead use interval‐scaled ordinal variables, such as pain or severity indices, when quantitative measurements are unavailable. These variables generally require nonparametric statistical analyses for group comparisons, as well as for sample size and power calculations, rather than ones that assume that the group outcomes follow a normal (Gaussian) distribution. Nonparametric methods are invaluable for clinical research, particularly when sample sizes are small and underlying distributional assumptions are tenuous (as with interval‐scaled data), and are discussed in more detail in Chapter 2.

Cross‐over Trials
In studies of medical interventions with transient effects, there can be a distinct advantage of using each patient as her or his own control. Earlier, when groups composed of randomized individuals were compared, there was always an assumption that they were similar with respect to the distribution of unknown or unmeasurable variables (confounders) that could influence the outcome. This assumption becomes tenuous when group‐specific sample sizes are small or there is so much variability in these factors that the groups would still not be comparable. In this case, sequentially administering different treatments to the same individual allows them to “cross over” between exposure states, permitting within‐individual comparisons and controlling for confounding by intrinsic patient characteristics.

This approach requires two important assumptions. One is “temporal stability”: that time itself is not a determinant of the study outcome. It implies that over the study period, the incidence of the outcome remains constant in the absence of any treatments. The second is “causal transience”: that the order of treatment administration (e.g., an experimental drug versus placebo) is irrelevant because there are no carry‐over effects of either treatment (i.e., the effects of both are transient), which implies an eventual return to an original state. To make this assumption more tenable, a “wash‐out” period is typically included between treatment options, sufficiently long to ensure that any effects have dissipated.

Measurements are characteristically taken at multiple times under the different treatments, allowing the assessment of time effects, treatment effects, and the interaction between these two main effects. It is also possible to extend such studies to include evaluating the effects of subgroups of different individuals, such as those defined by age and sex categories.
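
As an illustration of the within‐individual comparison that a cross‐over design permits, the following minimal Python sketch computes each patient's treatment‐minus‐placebo difference and applies an exact sign‐flip (permutation) test, one nonparametric option suited to small samples and ordinal scores. It is a sketch under stated assumptions rather than a prescribed analysis: the function names and the pain scores are hypothetical, and the code takes temporal stability and causal transience for granted, so it does not model period (time) effects or carry‐over.

```python
from itertools import product
from statistics import mean


def within_patient_differences(treatment, placebo):
    """Return treatment-minus-placebo differences, one per patient.

    Both lists must be ordered by patient, so that position i in each
    list refers to the same individual (each patient is their own control).
    """
    if len(treatment) != len(placebo):
        raise ValueError("each patient needs one score under each condition")
    return [t - p for t, p in zip(treatment, placebo)]


def sign_flip_p_value(differences):
    """Exact two-sided sign-flip test on within-patient differences.

    Under the null hypothesis of no treatment effect, the sign of each
    difference is arbitrary, so every combination of signs is enumerated
    and the p-value is the share of combinations whose absolute sum is
    at least as large as the observed one. Feasible only for small n,
    because 2**n combinations are evaluated.
    """
    if not differences:
        raise ValueError("at least one difference is required")
    observed = abs(sum(differences))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, differences)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical pain scores (lower is better) for six patients,
# each measured once under the drug and once under placebo.
drug_scores = [3, 2, 4, 3, 2, 3]
placebo_scores = [5, 4, 4, 6, 3, 5]

diffs = within_patient_differences(drug_scores, placebo_scores)
print("mean within-patient difference:", mean(diffs))
print("two-sided sign-flip p-value:", sign_flip_p_value(diffs))
```

Because every comparison is made within a patient, stable patient characteristics cancel out of each difference. A fuller analysis would also record the order in which each patient received the treatments, so that time effects and their interaction with treatment could be assessed.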