COMMON STATISTICAL MISINTERPRETATIONS
Many aspects of null hypothesis significance testing (NHST) are widely misunderstood. This article discusses 12 common fallacies, including rarely discussed subtleties of multiple comparisons, meta-analysis, and experiment design.
P-VALUE REPLICABILITY
Many analysts believe that a p-value and a point estimate are sufficient for optimal decision-making after an A/B test is run. This article details the surprising replication properties of p-values. It also touches on other little-known properties of null hypothesis significance testing (NHST), for example that statistically significant lifts tend to overstate the true effect size.
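The claim that statistically significant lifts tend to overstate the true effect size can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the method used in the article: the function name `simulate_significant_lifts`, the true lift of 0.2, the standard error of 0.15, and the normal sampling model are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_significant_lifts(true_lift=0.2, se=0.15, n_experiments=20000,
                               z_crit=1.96, seed=0):
    """Simulate repeated A/B tests of the same true lift.

    Assumes each observed lift is drawn from a normal distribution centered
    on the true lift with a known standard error (an illustrative model).
    Returns the mean of all estimates, the mean of the estimates that are
    positive and statistically significant, and the share of experiments
    that reached significance.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_lift, se) for _ in range(n_experiments)]

    # Keep only the positive results whose z-statistic exceeds the critical value.
    significant = [e for e in estimates if e / se > z_crit]

    mean_all = statistics.fmean(estimates)
    mean_significant = statistics.fmean(significant)
    share_significant = len(significant) / n_experiments
    return mean_all, mean_significant, share_significant


if __name__ == "__main__":
    mean_all, mean_sig, share = simulate_significant_lifts()
    print(f"mean of all estimates:         {mean_all:.3f}")
    print(f"mean of significant estimates: {mean_sig:.3f}")
    print(f"share reaching significance:   {share:.3f}")
```

Under these assumptions, the average over all experiments stays close to the true lift, while the average over only the significant experiments must exceed the significance threshold (1.96 × 0.15 = 0.294) and therefore overstates the true lift of 0.2. The share reaching significance also gives a rough sense of how often an identical replication would itself be significant.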
P-VALUE FUNCTIONS
P-value functions offer several benefits over the dichotomous significant/nonsignificant paradigm by shifting the focus to parameter estimation. They convey both an estimated effect size and the precision of that estimate.
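As a rough illustration of the idea, a p-value function can be sketched by computing a two-sided p-value for every hypothesized effect size rather than only for zero. The sketch below assumes a normally distributed estimate with a known standard error; the function name `p_value_function` and the example numbers are hypothetical and not taken from the article.

```python
from statistics import NormalDist


def p_value_function(estimate, se, hypothesized_values):
    """Return (hypothesized value, two-sided p-value) pairs.

    Assumes the estimate is normally distributed around the true effect
    with known standard error `se` (an illustrative simplification).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    curve = []
    for h in hypothesized_values:
        z = abs(estimate - h) / se
        p = 2 * (1 - std_normal.cdf(z))
        curve.append((h, p))
    return curve


if __name__ == "__main__":
    # Hypothetical observed lift of 0.2 with a standard error of 0.15.
    grid = [i / 10 for i in range(-3, 8)]  # hypothesized lifts from -0.3 to 0.7
    for h, p in p_value_function(0.2, 0.15, grid):
        print(f"hypothesized lift {h:+.1f}: p = {p:.3f}")
```

The curve peaks at the point estimate, where the p-value is 1, and its width reflects precision: the hypothesized values with a p-value of at least 0.05 form the usual 95% confidence interval under this model.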
NHST & CI MISINTERPRETATIONS AMONG PSYCHOLOGY RESEARCHERS AND STUDENTS
This article presents a comprehensive review of surveys that have attempted to assess the null hypothesis significance testing (NHST) and confidence interval (CI) knowledge of psychology researchers and students. The review shows high rates of misunderstanding of both NHST and CIs across researcher and student populations alike.
ARTICLES CITING NHST MISUSE
More than 60 published articles are presented in which researchers have identified that peers in their field misuse or misinterpret null hypothesis significance testing (NHST).
NEGATIVE COMMENTS ABOUT STATISTICAL SIGNIFICANCE
Although it is little known outside the academic statistics community, statisticians have been debating the proper use of statistical significance for decades because p-values are so often misused and misunderstood. Because this criticism is new to many analysts, this article collects a series of quotations critical of statistical significance.
REFERENCES
A set of references on statistical significance, including hundreds of journal articles critical of null hypothesis significance testing.