
A28    SCIENCE
             Wednesday 13 November 2019
             Sorry, wrong number: Statistical benchmark comes under fire



In this July 1, 1960 file photo, a chemist works in a laboratory in Cambridge, Mass. (Associated Press)

By MALCOLM RITTER

NEW YORK (AP) — Earlier this fall Dr. Scott Solomon presented the results of a huge heart drug study to an audience of fellow cardiologists in Paris.

The results Solomon was describing looked promising: Patients who took the medication had a lower rate of hospitalization and death than patients on a different drug.

Then he showed his audience another number.

"There were some gasps, or 'Ooohs,'" Solomon, of Harvard's Brigham and Women's Hospital, recalled recently. "A lot of people were disappointed."

One investment analyst reacted by reducing his forecast for peak sales of the drug — by $1 billion.

What happened?

The number that caused the gasps was 0.059. The audience was looking for something under 0.05.

What it meant was that Solomon's promising results had run afoul of a statistical concept you may never have heard of: statistical significance. It's an all-or-nothing thing. Your statistical results are either significant, meaning they are reliable, or not significant, indicating an unacceptably high chance that they were just a fluke.

The concept has been used for decades. It holds a lot of sway over how scientific results are appraised, which studies get published, and what medicines make it to drugstores.

But this year has brought two high-profile calls from critics, including from inside the arcane world of statistics, to get rid of it — in part out of concern that it prematurely dismisses results like Solomon's.

Significance is reflected in a calculation that produces something called a p-value. Usually, if this produces a p-value of less than .05, the study findings are considered significant. If not, the study has failed the test.

Solomon's study just missed. So the apparent edge his drug was showing over the other medication was deemed insignificant. By this criterion there was no "real" difference.

Solomon believes the drug in fact produced a real benefit and that a larger or longer-lasting study could have reached statistical significance.

"I'm not crying over spilled milk," he said. "We do set the rules. The question is, is that the right way to go about it?"

He's not alone in asking that question.

"It is a safe bet that people have suffered or died because scientists (and editors, regulators, journalists and others) have used significance tests to interpret results," epidemiologist Kenneth Rothman of RTI Health Solutions in Research Triangle Park, N.C., and Boston University wrote in 2016.

The danger is both that a potentially beneficial medical finding can be ignored because a study doesn't reach statistical significance, and a harmful or fruitless medical practice could be accepted simply because it does, he said in an email.

The p-value cutoff for significance is "a measure that has gained gatekeeper status ... not only for publication but for people to take your results seriously," says Northwestern University statistician Blake McShane.

It's no wonder that a statistician, at a recent talk to journalists about the issue just before Halloween, displayed a slide of a jack-o'-lantern carved with this sight, obviously terrifying to anyone in science or medicine: "P = .06."

McShane and others argue that the importance of the p-value threshold is undeserved. McShane co-authored a call to abolish the notion of statistical significance, which was published in the prestigious journal Nature this year. The proposal attracted more than 800 co-signers.

Even the American Statistical Association, which had never issued any formal statement on specific statistical practices, came down hard in 2016 on using any kind of p-value cutoff in this way. And this year it went further, declaring in a special issue with 43 papers on the subject, "It is time to stop using the term 'statistically significant' entirely."

What's the problem? McShane and others list several:

— A p-value does not directly measure the likelihood that the outcome of an experiment is just a fluke. What it really represents is widely misunderstood, even by scientists and some statisticians, said Nicole Lazar, a statistics professor at the University of Georgia.

— Using a label of statistical significance "gives more certainty than is actually warranted," Lazar said. "We should recognize the fact that there is uncertainty in our findings."

— The traditional cutoff of 0.05 is arbitrary.

— Statistical significance does not necessarily mean "significant" — or that a finding is important practically or scientifically, Lazar says. It might not even be true: Solomon cites a large heart drug study that found a significant treatment effect for patients born in August but not July, obviously just a random fluctuation.

— The term "statistical significance" sets up a goal line for researchers, a clear measure of success or failure. That means researchers can try a little bit too hard to reach it. They may deliberately game the system to get an acceptable p-value, or just unconsciously choose analytic methods that help, McShane and Lazar said.

— That can distort the effects not only of individual experiments, but also the cumulative results of studies on a given topic, so that overall a drug can look "a lot better than it actually is," McShane said.

What should be done instead? Abolish the bright line of statistical significance, and just report the p-value along with other analyses to give a more comprehensive outline of what the test result may mean, McShane and others say.

It may not be as clear-cut as a simple declaration of significance or insignificance, but "we'll have a better idea of what's going on," Lazar said. "I think it will be easier to weed out the bad work."

Not everybody buys the idea of doing away with statistical significance. Prominent Stanford researcher Dr. John Ioannidis says that abolition "could promote bias. Irrefutable nonsense would rule." Although he agrees that a p-value standard of less than .05 is weak and easily abused, he believes scientists should use a more stringent p-value or other statistical measure instead, specified before the experiment is performed.

McShane said that although calls for abolishing statistical significance have been raised for years, there seems to be more momentum lately.

"Maybe," he said, "it's time to put the nail in the coffin on this one for good."
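
Two of the numerical points in the story can be made concrete with a short sketch. It is only an illustration, not anything taken from the studies mentioned: the function names are hypothetical, and the second function assumes that the subgroup tests are independent and that there is no real effect in any subgroup. The choice of 12 subgroups (one per birth month) is likewise an assumption made for the example; the story mentions only August and July.

```python
def is_significant(p_value, alpha=0.05):
    """Bright-line rule described in the story: significant only if p < alpha."""
    return p_value < alpha


def chance_of_any_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests comes out
    below alpha purely by chance, assuming no real effect in any of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


# The p-value from Solomon's presentation narrowly fails the conventional cutoff.
print(is_significant(0.059))

# Hypothetical case: one test per birth month, none with a real effect.
print(round(chance_of_any_false_positive(12), 3))
```

Under those assumptions, twelve subgroup tests have roughly a 46 percent chance of turning up at least one "significant" result, which is why a finding for patients born in one particular month is read as a random fluctuation.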