CardioSource WorldNews

Dr. Gelman places much of the blame for this problem on statistical education (generously including his own courses and textbook), which reduces statistics to a mathematical and computational science, rather than combining computation with a thought process that emphasizes valid and reliable measurement. With less time for thinking today, it is probably not surprising that a big white line separating ‘yes’ from ‘no’ has gained prominence in reporting study results.

Blame also belongs to the trend to sell statistics “as a sort of alchemy that transmutes randomness into certainty, an ‘uncertainty laundering’ that begins with data and concludes with success as measured by statistical significance,” as Dr. Gelman puts it. The culprits here are everyone: educators, journals, granting agencies, researchers, and journalists. “Just try publishing a result with p = 0.20,” he noted.

“If researchers have been trained with the expectation that they will get statistical significance if they work hard and play by the rules, if granting agencies demand power analyses in which researchers must claim 80% certainty that they will attain statistical significance, and if that threshold is required for publication, it is no surprise that researchers will routinely satisfy this criterion, and publish, and publish, and publish, even in the absence of any real effects, or in the context of effects that are so variable as to be undetectable in the studies that are being conducted,” wrote Dr. Gelman.

For his part, Dr. Pocock suggested that the problem started early on, not so much with Dr. Fisher, but with his contemporaries and rivals, Polish mathematician Jerzy Neyman and UK statistician Egon Pearson, who introduced the concept of an “accept-reject philosophy of p values.”

The journals and regulators also share blame, said Dr. Pocock. The journals share it because they perpetuate this “collective certification,” and the U.S. Food and Drug Administration (FDA) shares it too, though not because it strictly applies a p < 0.05 rule to make its decisions. “The FDA is smarter than that in general,” Dr. Pocock said. Rather, he thinks the issue is that no one has disabused researchers of the “myth that to get a drug approved you must prove p < 0.05.” This is starting to change, however, according to Dr. Pocock, as journals become more sensitive to the issue.

Dr. Wasserstein noted that hypothesis testing and inference is a tricky concept. His hope is simply that the most recent brouhaha will inspire people to reject the use of a single index as a “substitute for scientific reasoning” and “see how we can make the science better as we go into the post p < 0.05 era.”

How Do We p-Fix This?

The trend toward simplistically using and abusing p values is alluring to the many individuals in the field of medicine who lack deep understanding of statistics. Many physicians, researchers, journal editors, and science writers possess a limited understanding of statistical sciences, and likely aren’t all that interested in learning more. After all, who really understands confidence, credibility, or prediction intervals (alternative approaches to analysis suggested by the ASA statement)? How are you on Bayesian methods? (See sidebar interview with statistician Jan Tijssen, PhD, for his comments on how to improve data analysis.)

“It is a mass re-education that is required to get p values seen as strength of evidence with no magic cutoff,” said Dr. Pocock.

To be fair, even the ASA acknowledges that the practice of statistics is not a straightforward endeavor, with clearly defined rules of engagement. The ASA team noted in their statement that finding points of agreement among the two dozen experts gathered “turned out to be relatively easy to do, but it was just as easy to find points of intense disagreement.” The final statement, which was hard-won and long-deliberated, is not riveting reading (have you ever read a riveting consensus statement?) but it came with the caveat that it “does not necessarily reflect the viewpoint of all [the listed parties], and in fact some have views that are in opposition to all or part of the statement.”

Perhaps architect Daniel Libeskind summed it up best when he said, “Life is not just a series of calculations and a sum total of statistics, it’s about experience, it’s about participation, it is something more complex and more interesting than what is obvious.”

References:
1. Wasserstein RL, Lazar NA. Am Stat. 2016 [Epub before print].
2. Pocock SJ, McMurray JV, Collier TJ. J Am Coll Cardiol. 2015;66:2536-49. http://content.onlinejacc.org/article.aspx?articleID=2473760
3. Pocock SJ, McMurray JV, Collier TJ. J Am Coll Cardiol. 2015;66:2648-62. http://content.onlinejacc.org/article.aspx?articleID=2474634
4. Pocock SJ, Clayton TC, Stone GW. J Am Coll Cardiol. 2015;66:2757-66. http://content.onlinejacc.org/article.aspx?articleID=2476071
5. Pocock SJ, Clayton TC, Stone GW. J Am Coll Cardiol. 2015;66:2886-98. http://content.onlinejacc.org/article.aspx?articleID=2476636
6. Gelman A, Loken E. American Scientist. 2014;102:460.
7. Kyriacou DN. The Enduring Evolution of the P Value. JAMA. 2016;315:1113-5.
8. Stone GW, Pocock SJ. J Am Coll Cardiol. 2010;55:428-31. http://content.onlinejacc.org/article.aspx?articleID=1140401
9. Ioannidis JP. PLoS Med. 2005;2:e124.
10. Chavalarias D, Wallach JD, Li AH, Ioannidis JP. JAMA. 2016;315:1141-8.
11. Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. PLoS Biol. 2015;13(3):e1002106.
12. Simonsohn U, Nelson LD, Simmons JP. J Exp Psychol Gen. 2014;143:534-47.

May 2016
