Dr. Gelman places much of the blame for this problem on statistical education (generously including his own courses and textbook), which reduces statistics to a mathematical and computational science, rather than combining computation with a thought process that emphasizes valid and reliable measurement. With less time for thinking today, it is probably not surprising that a big white line separating ‘yes’ from ‘no’ has gained prominence in reporting study results.

Blame also belongs to the trend to sell statistics “as a sort of alchemy that transmutes randomness into certainty, an ‘uncertainty laundering’ that begins with data and concludes with success as measured by statistical significance,” as Dr. Gelman puts it. The culprits here are everyone: educators, journals, granting agencies, researchers, and journalists. “Just try publishing a result with p = 0.20,” he noted.

“If researchers have been trained with the expectation that they will get statistical significance if they work hard and play by the rules, if granting agencies demand power analyses in which researchers must claim 80% certainty that they will attain statistical significance, and if that threshold is required for publication, it is no surprise that researchers will routinely satisfy this criterion, and publish, and publish, and publish, even in the absence of any real effects, or in the context of effects that are so variable as to be undetectable in the studies that are being conducted,” wrote Dr. Gelman.

For his part, Dr. Pocock suggested that the problem started early on, not so much with Dr. Fisher, but with his contemporaries and rivals, Polish mathematician Jerzy Neyman and UK statistician Egon Pearson, who introduced the concept of an “accept-reject philosophy of p values.”

The journals and regulators also share blame, said Dr. Pocock: the journals because they perpetuate this “collective certification,” and the U.S. Food and Drug Administration (FDA), though not because it strictly applies a p < 0.05 rule to make its decisions. “The FDA is smarter than that in general,” Dr. Pocock said. Rather, he thinks the issue is that no one has disabused researchers of the “myth that to get a drug approved you must prove p < 0.05.” This is starting to change, however, according to Dr. Pocock, as journals become more sensitive to the issue.

Dr. Wasserstein noted that hypothesis testing and inference are tricky concepts. His hope is simply that the most recent brouhaha will inspire people to reject the use of a single index as a “substitute for scientific reasoning” and “see how we can make the science better as we go into the post p < 0.05 era.”

How Do We p-Fix This?

The trend toward simplistically using and abusing p values is alluring to the many individuals in the field of medicine who lack deep understanding of statistics. Many physicians, researchers, journal editors, and science writers possess a limited understanding of statistical sciences, and likely aren’t all that interested in learning more. After all, who really understands confidence, credibility, or prediction intervals (alternative approaches to analysis suggested by the ASA statement)? How are you on Bayesian methods? (See sidebar interview with statistician Jan Tijssen, PhD, for his comments on how to improve data analysis.)

“It is a mass re-education that is required to get p values seen as strength of evidence with no magic cutoff,” said Dr. Pocock.

To be fair, even the ASA acknowledges that the practice of statistics is not a straightforward endeavor with clearly defined rules of engagement. The ASA team noted in their statement that finding points of agreement among the two dozen experts gathered “turned out to be relatively easy to do, but it was just as easy to find points of intense disagreement.” The final statement, which was hard-won and long-deliberated, is not riveting reading (have you ever read a riveting consensus statement?), but it came with the caveat that it “does not necessarily reflect the viewpoint of all [the listed parties], and in fact some have views that are in opposition to all or part of the statement.”

Perhaps architect Daniel Libeskind summed it up best when he said, “Life is not just a series of calculations and a sum total of statistics, it’s about experience, it’s about participation, it is something more complex and more interesting than what is obvious.” ■
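Dr. Gelman's remark that researchers can publish "even in the absence of any real effects" can be illustrated with a small simulation. The Python sketch below is not taken from the article or from any of the cited papers. It assumes a simple two-arm study with normally distributed outcomes, a known standard deviation of 1, and a z-test; the function names, sample sizes, and seed are hypothetical choices made only for illustration.

```python
import math
import random


def two_sided_p_from_z(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_studies(n_studies=2000, n_per_arm=50, seed=1):
    """Simulate studies in which both arms share the same true mean (no effect)."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_studies):
        treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        diff = sum(treated) / n_per_arm - sum(control) / n_per_arm
        # Standard error of the difference, assuming known unit variance per arm.
        se = math.sqrt(2.0 / n_per_arm)
        p_values.append(two_sided_p_from_z(diff / se))
    return p_values


p_values = simulate_null_studies()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"Share of null studies with p < 0.05: {share_significant:.3f}")
```

Under these assumptions, roughly one simulated study in twenty should cross the p < 0.05 line even though no effect exists in any of them. If only those studies were written up, every published result would look "significant."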
36 CardioSource WorldNews
References:
1. Wasserstein RL, Lazar NA. Am Stat. 2016 [Epub before print].
2. Pocock SJ, McMurray JV, Collier TJ. J Am Coll Cardiol. 2015;66:2536-49. http://content.onlinejacc.org/article.aspx?articleID=2473760
3. Pocock SJ, McMurray JV, Collier TJ. J Am Coll Cardiol. 2015;66:2648-62. http://content.onlinejacc.org/article.aspx?articleID=2474634
4. Pocock SJ, Clayton TC, Stone GW. J Am Coll Cardiol. 2015;66:2757-66. http://content.onlinejacc.org/article.aspx?articleID=2476071
5. Pocock SJ, Clayton TC, Stone GW. J Am Coll Cardiol. 2015;66:2886-98. http://content.onlinejacc.org/article.aspx?articleID=2476636
6. Gelman A, Loken E. American Scientist. 2014;102:460.
7. Kyriacou DN. The Enduring Evolution of the P Value. JAMA. 2016;315:1113-5.
8. Stone GW, Pocock SJ. J Am Coll Cardiol. 2010;55:428-31. http://content.onlinejacc.org/article.aspx?articleID=1140401
9. Ioannidis JP. PLoS Med. 2005;2:e124.
10. Chavalarias D, Wallach JD, Li AH, Ioannidis JP. JAMA. 2016;315:1141-8.
11. Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. PLoS Biol. 2015;13(3):e1002106.
12. Simonsohn U, Nelson LD, Simmons JP. J Exp Psychol Gen. 2014;143:534-47.
May 2016