Negative comments about statistical significance and p-values
Article summary: This article is a list of critical statements about p-values and statistical significance. This is not meant to imply that p-values have no value in science or in analyzing results. However, most analysts outside the halls of academia believe that statistical significance is the “gold standard” for experimental results. This article is meant to guide those outside of statistics away from that view by sharing a history of criticism. The proper usage of p-values in statistics is still under debate, but there is broad agreement that statistical significance is not a “gold standard.” Instead, the gold standard for research includes thoughtful usage of p-values alongside other statistical metrics, such as effect sizes and interval estimates of precision like confidence intervals.
Contact information
We’d love to hear from you! Please direct all inquiries to info@theresear.ch
Reference: “Moving to a World Beyond ‘p < 0.05’” [link], The American Statistician. Ronald Wasserstein, Executive Director of The American Statistical Association [link]; Allen Schirm, Vice President and Director of Human Services Research at Mathematica Policy Research (retired) [link]; & Nicole Lazar, Professor of Statistics at the University of Georgia and President Elect of Caucus for Women in Statistics [link]. This article was part of The American Statistician’s March 2019 special edition, “Statistical Inference in the 21st Century: A World Beyond p < 0.05” [link].
Reference: “Significance tests as sorcery: Science is empirical - significance tests are not” [link], Theory & Psychology, 2012. Charles Lambdin, Intel Corporation, former researcher at Wichita State University.
Reference: “Scientists rise up against statistical significance” [link], Nature. Valentin Amrhein, Professor of Zoology at the University of Basel [link]; Sander Greenland, Professor Emeritus at the UCLA Fielding School of Public Health [link]; & Blake McShane, Associate Professor of Marketing at Northwestern’s Kellogg School of Management [link]. The full list of 854 scientists from 52 countries signing on to “Retire statistical significance” can be found here: [link].
Reference: “The American Statistical Association’s Statement on p-values: Context, Process, and Purpose” [link], The American Statistician. Ronald Wasserstein, Executive Director of The American Statistical Association [link] & Nicole Lazar, Professor of Statistics at the University of Georgia and President Elect of Caucus for Women in Statistics [link].
Reference: “Moving Towards the Post p < 0.05 Era via the Analysis of Credibility” [link], The American Statistician. Robert Matthews, Professor of Mathematics at Aston University [link]. This article was part of The American Statistician’s March 2019 special edition, “Statistical Inference in the 21st Century: A World Beyond p < 0.05” [link].
Reference: “Rejection odds and rejection ratios: A proposal for statistical practice in testing hypotheses” [link], Journal of Mathematical Psychology, 2016. Bayarri et al.
Reference: “Why Most Published Research Findings Are False” [link], PLoS Medicine. John P.A. Ioannidis of Stanford University, the C.F. Rehnborg Chair in Disease Prevention; Professor of Medicine, of Health Research and Policy, of Biomedical Data Science, and of Statistics; co-Director, Meta-Research Innovation Center at Stanford; Director of the PhD program in Epidemiology and Clinical Research [link].
Reference: “The New Statistics for Better Science: Ask How Much, How Uncertain, and What Else Is Known” [link], The American Statistician. Robert Calin-Jageman, Associate Professor of Psychology and Discipline Director of Neuroscience [link] & Geoff Cumming, Emeritus Professor of Psychology at La Trobe University [link]. This article was part of The American Statistician’s March 2019 special edition, “Statistical Inference in the 21st Century: A World Beyond p < 0.05” [link].
Reference: “Abandon Statistical Significance” [link], The American Statistician. Jennifer Tackett, Professor of Psychology and Director of Clinical Psychology at Weinberg College [link]; Christian Robert, Professor of Statistics at University of Warwick [link]; Andrew Gelman, Professor of Statistics and Director of the Applied Statistics Center at Columbia University [link]; David Gal, Professor of Marketing at the University of Illinois [link]; Blake McShane, Associate Professor of Marketing at Northwestern’s Kellogg School of Management [link]. This article was part of The American Statistician’s March 2019 special edition, “Statistical Inference in the 21st Century: A World Beyond p < 0.05” [link].
Reference: “Theoretical Risks and Tabular Asterisks: Sir Karl, Sir Ronald, and the Slow Progress of Soft Psychology” [link], Journal of Consulting and Clinical Psychology, 1978. Paul Meehl (deceased), Professor of Psychology at the University of Minnesota and past president of the American Psychological Association [link].
Reference: “The fickle P value generates irreproducible results” [link], Nature Methods [link], 2015. Lewis G Halsey, Professor of Life Sciences at the University of Roehampton London and head of the Roehampton University Behaviour and Energetics Lab (RUBEL) [link]; Douglas Curran-Everett, Division of Bioinformatics at the National Jewish Health Hospital and the Department of Biostatistics and Informatics at the University of Colorado Denver’s School of Public Health [link, link]; Sarah L Vowler, Cancer Research UK Cambridge Institute at the University of Cambridge [link]; Gordon B Drummond, Honorary Clinical Senior Lecturer of Anaesthesia at The University of Edinburgh [link].
Reference: “Redefine statistical significance” [link], Nature Human Behaviour, 2017. Daniel Benjamin plus 71 coauthors signed on to a proposal to lower the statistical significance threshold to 0.005. Their proposal is not without controversy [link], but there has been little disagreement about the problem they are trying to solve.
Reference: “Statistical Significance and the Dichotomization of Evidence” [link], Journal of the American Statistical Association, 2017. David Gal, Professor of Marketing at the University of Illinois [link]; Blake McShane, Associate Professor of Marketing at Northwestern’s Kellogg School of Management [link].
Reference: “Null Hypothesis Testing: Problems, Prevalence, and an Alternative,” Journal of Wildlife Management, 2000. David Anderson (employment unknown), former scientist at the Cooperative Fish and Wildlife Research Unit [link]; Kenneth Burnham (retired), former Senior Scientist with the United States Geological Survey [link]; William Thompson, Adjunct Professor in Natural Resources at the University of Rhode Island, National Park Service Research Coordinator for the North Atlantic Coast Cooperative Ecosystem Studies Unit [link].
Reference: “Statistical Significance Tests are Unnecessary Even When Properly Done and Properly Interpreted: Reply to Commentaries” [link], International Journal of Forecasting, 2007. J. Scott Armstrong [link], Professor of Marketing at the University of Pennsylvania’s Wharton Business School, cofounder of the Journal of Forecasting [link], International Journal of Forecasting [link], International Institute of Forecasters [link], International Symposium on Forecasting [link], and PollyVote.com [link].
Reference: “Statistical Significance Tests, Effect Size Reporting and the Vain Pursuit of Pseudo-Objectivity”, Research article, 1999. Bruce Thompson (retired), former Distinguished Professor of Educational Psychology and Library Sciences at Texas A&M University and former Adjunct Professor of Community Medicine at the Baylor College of Medicine [link].
Reference: “Problems With Null Hypothesis Significance Testing (NHST): What Do the Textbooks Say?” [link], The Journal of Experimental Education, 2002. Jeffrey Gliner, Professor Emeritus, Occupational Therapy at Colorado State University [link]; Nancy Leech, Professor of Education and Human Development at the University of Colorado, Denver [link]; George Morgan, Professor Emeritus, Education and Human Development at Colorado State University [link].
Reference: “The Earth Is Round (p < .05)” [link], American Psychologist, 1994. Jacob Cohen (deceased) [link], former Professor of Psychology at New York University and head of the Quantitative Psychology Group; 1997 winner of the Distinguished Lifetime Achievement Award by the American Psychological Association; fellow of the American Association for the Advancement of Science, the American Psychological Association, and the American Statistical Association; inventor of three statistical measures: Cohen’s d, Cohen’s h, and Cohen’s kappa.
Reference: “A Sensible Formulation of the Significance Test” [link], Psychological Methods, 2000. John Tukey (deceased) [link], Professor and Founding Chairman of the Princeton University Statistics department; AT&T Bell Labs researcher; recipient of the National Medal of Science; recipient of the IEEE Medal of Honor; inventor of the box plot, the Fast Fourier Transform, and the Tukey range test; credited with coining the term “bit”. Lyle Jones (deceased) [link], Professor of Psychology at the University of Chicago, the University of Texas, and the University of North Carolina at Chapel Hill; Director of UNC’s Psychometric Laboratory; Managing editor of Psychometrika; President of the Psychometric Society.
Reference: “Statistical Significance in Psychological Research” [link], Psychological Bulletin, 1968. David Lykken [link], Professor Emeritus of Psychology and Psychiatry at the University of Minnesota, Fellow of the American Association for the Advancement of Science, Fellow of the American Psychological Association, and Charter Fellow of the American Psychological Society.
Reference: “Eight Common But False Objections to the Discontinuation of Significance Testing in the Analysis of Research Data” [link], white paper, 1997. Frank Schmidt, former Professor of Psychology at the University of Iowa, won multiple scientific awards and sat on the editorial boards of eight different research journals [link]; John Hunter (deceased), former Professor of Psychology at Michigan State University, Distinguished Scientific Award for Contributions to Applied Psychology (joint with Frank L. Schmidt), and the Distinguished Scientific Contributions Award from the Society for Industrial and Organizational Psychology (SIOP) (also joint with Schmidt).
Reference: “Some General Aspects of the Theory of Statistics” [link], International Statistical Review, 1986. David Cox, former Chair of Statistics at Imperial College London and member of the Department of Statistics at Oxford University; winner of numerous awards including the prestigious International Prize in Statistics; pioneer of numerous statistical tools including binary logistic regression and proportional hazards models. Cox was a constant critic of significance tests, for example in Cox 1977 [link] where he stated that, “Overemphasis on tests of significance at the expense especially of interval estimation has long been condemned” and “The continued very extensive use of significance tests is alarming.”
Reference: “Significance Tests Die Hard: The Amazing Persistence of a Probabilistic Misconception” [link], Theory & Psychology, 1995. Ruma Falk, Professor Emeritus of Psychology at The Hebrew University of Jerusalem [link]; Charles Greenbaum, Professor Emeritus of Psychology at The Hebrew University of Jerusalem [link].
Reference: “The illogic of statistical inference for cumulative science” [link], Applied Stochastic Models and Data Analysis, 1985. Louis Guttman, Professor of Social and Psychological Assessment at the Hebrew University of Jerusalem, winner of the Israel Prize in the social sciences (1978), winner of the Educational Testing Service Measurement Award from Princeton University (1984) [link].
Reference: “The philosophy of quantitative methods” [link], The Oxford Handbook of Quantitative Methods, 2012. Brian Haig, Professor of Psychology at the University of Canterbury.
Reference: “The Historical Growth of Statistical Significance Testing in Psychology–and Its Future Prospects” [link], Educational and Psychological Measurement, 2000. Raymond Hubbard, Professor Emeritus of Marketing at Drake University [link]; Patricia Ryan, Associate Professor of Finance and Real Estate at Colorado State University [link].
Reference: The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives [link], University of Michigan Press, 2008. Stephen Ziliak, faculty member of the Angiogenesis Foundation, conjoint Professor of Business and Law at the University of Newcastle in Australia, and Professor of Economics at Roosevelt University [link]; Deirdre McCloskey, Distinguished Professor of Economics, History, English, and Communication at the University of Illinois at Chicago [link].
Reference: “Statistical Procedures and the Justification of Knowledge in Psychological Science” [link], American Psychologist, 1989. Ralph Rosnow (retired), Professor of Psychology at Temple University [link]; Robert Rosenthal, Professor of Psychology at the University of California, Riverside [link].
Reference: “Should Significance Tests be Banned? Introduction to a Special Section Exploring the Pros and Cons” [link], Psychological Science, 1997. Patrick Shrout.
Reference: “The null hypothesis significance test in health sciences research (1995-2006): statistical analysis and interpretation” [link], BMC Medical Research Methodology, 2010. Luis Carlos Silva-Ayçaguer, Professor at Centro Nacional de Investigación de Ciencias Médicas [link]; Patricio Suárez-Gil, Hospital de Cabueñes, Servicio de Salud del Principado de Asturias (SESPA) [link]; Ana Fernández-Somoano, CIBER Epidemiología y Salud Pública (CIBERESP), Spain and Departamento de Medicina, Unidad de Epidemiología Molecular del Instituto Universitario de Oncología.
Reference: “The ongoing tyranny of statistical significance testing in biomedical research” [link], European Journal of Epidemiology, 2010. Andreas Stang, Professor of Epidemiology at the Medical Faculty, University of Duisburg-Essen [link]; Charles Poole, Professor of Epidemiology at the University of North Carolina [link]; Oliver Kuss, former Acting Director of the Institute of Medical Statistics, Düsseldorf University Hospital and Medical Faculty of the Heinrich Heine University Düsseldorf [link].
Reference: “An Alternative to Null-Hypothesis Significance Tests” [link], Psychological Science, 2006. Peter Killeen, Professor of Psychology at Arizona State University, Fellow of the American Psychological Association, the American Psychological Society, and the Association for Behavior Analysis; former President of the Society for Quantitative Analysis of Behavior [link].