Article Summary

The list of articles below was last compiled by Bill Thompson in 2001 while he was a postdoctoral researcher at Colorado State University. We reprint it here with Bill’s permission, including his original commentary. We are, of course, indebted to Bill for the use of this list, which undoubtedly took him many hours to compile.

Bill’s list contains 36 articles supporting, as he puts it, “to varying degrees,” the use of null hypothesis significance testing (NHST). It is a shorter companion to Bill’s list of 402 articles questioning the use of NHST. This list is part of The Research’s larger project to document the many critiques of NHST and to apply those critiques in specific business contexts (e.g., A/B web testing).

Note that Bill’s citation style is slightly more formal than the one we have used elsewhere on this site. Also note that we are in the process of adding direct links to all 36 articles, although this will take some time.

Here’s Bill…

36 Articles/Book Chapters Supporting (To Varying Degrees) Use of Statistical Hypothesis Tests

1. Abelson, R. P. 1997a. On the surprising longevity of flogged horses: why there is a case for the significance test. Psychological Science 8:12-15.

2. Abelson, R. P. 1997b. A retrospective on the significance test ban of 1999 (if there were no significance tests, they would be invented). Pages 117-141 in L. L. Harlow, S. A. Mulaik, and J. H. Steiger, eds. What if there were no significance tests? Lawrence Erlbaum Associates, Mahwah, N.J.

3. Batanero, C. 2000. Controversies around the role of statistical tests in experimental research. Mathematical Thinking and Learning 2(1-2):75-97.

4. Biskin, B. H. 1998. Comment on significance testing. Measurement and Evaluation in Counseling and Development 31:58-62.

5. Chow, S. L. 1988. Significance test or effect size? Psychological Bulletin 103:105-110.

6. Chow, S. L. 1996. Statistical significance: rationale, validity and utility. Sage Publ., London, U.K. 205pp.

7. Chow, S. L. 1998. Precis of ‘Statistical Significance: Rationale, Validity, and Utility’. Behavioral and Brain Sciences 21(2):169-238. (with responses)

8. Cortina, J. M., and W. P. Dunlap. 1997. On the logic and purpose of significance testing. Psychological Methods 2:161-172.

9. Eastwood, G. R. 1967. A note on null hypothesis testing. Alberta Journal of Educational Research 13:265-273.

10. Fleiss, J. L. 1986a. Significance tests have a role in epidemiologic research: reactions to A. M. Walker. (Different Views) American Journal of Public Health 76:559-560.

11. Fleiss, J. L. 1986b. Confidence intervals vs. significance tests: quantitative interpretation. (Letter) American Journal of Public Health 76:587.

12. Fleiss, J. L. 1986c. Dr. Fleiss responds. (Letter) American Journal of Public Health 76:1033-1044.

13. Frick, R. W. 1995. Accepting the null hypothesis. Memory and Cognition 23:132-138.

14. Frick, R. W. 1996. The appropriate use of null hypothesis testing. Psychological Methods 1:379-390.

15. Greenwald, A. G., R. Gonzalez, R. J. Harris, and D. Guthrie. 1996. Effect sizes and p values: What should be reported and what should be replicated? Psychophysiology 33:175-183.

16. Hagen, R. L. 1997. In praise of the null hypothesis statistical test. American Psychologist 52:15-24.

17. Hagen, R. L. 1998. A further look at wrong reasons to abandon statistical testing. American Psychologist 53:801-803.

18. Harris, E. K. 1993. On p values and confidence intervals (why can’t we p with more confidence?). Clinical Chemistry 39:927-928.

19. Harris, R. J. 1997. Significance tests have their place. Psychological Science 8:8-11.

20. Howard, G. S., S. E. Maxwell, and K. J. Fleming. 2000. The proof of the pudding: an illustration of the relative strengths of null hypothesis, meta-analysis, and Bayesian analysis. Psychological Methods 5:315-332.

21. Huelsenbeck, J. P., and B. Rannala. 1997. Phylogenetic methods come of age: testing hypotheses in an evolutionary context. Science 276:227-232.

22. King, M. L. 1996. Hypothesis testing in the presence of nuisance parameters. Journal of Statistical Planning and Inference 50:103-120.

23. Knapp, T. R. 1998. Comments on the statistical significance testing articles. Research in the Schools 5(2):39-41.

24. Langholtz, B., J. S. Witte, and T. C. Duncan. 1995. Re: “Statistical significance testing in the American Journal of Epidemiology, 1970-1990.” American Journal of Epidemiology 142:101.

25. Leventhal, L. 1999. Answering two criticisms of hypothesis testing. Psychological Reports 85:3-18.

26. Levin, J. R. 1993. Statistical significance testing from three perspectives. Journal of Experimental Education 61:378-382.

27. Levin, J. R. 1998. What if there were no more bickering about statistical significance tests? Research in the Schools 5(2):43-53.

28. Levin, J. R., and D. H. Robinson. 1999. Further reflections on hypothesis testing and editorial policy for primary research journals. Educational Psychology Review 11:143-155.

29. Levin, J. R., and D. H. Robinson. 2000. Rejoinder: statistical hypothesis testing, effect size estimation, and the conclusion coherence of primary research studies. Educational Researcher 29:34-36.

30. Marshall, D. D. 1999. Observations on the usefulness of null hypothesis testing. Journal of Theory Construction and Testing 3:25-31.

31. Mulaik, S. A., N. S. Raju, and R. A. Harshman. 1997. There is a time and a place for significance testing. Pages 65-115 in L. L. Harlow, S. A. Mulaik, and J. H. Steiger, eds. What if there were no significance tests? Lawrence Erlbaum Associates, Mahwah, N.J.

32. Robinson, D. H., and J. R. Levin. 1997. Reflections on statistical and substantive significance. Educational Researcher 26:21-29.

33. Schultz, B. B. 1989. In support of the use of significance tests and of unplanned pairwise comparisons. Environmental Entomology 18:901-907.

34. Stewart, D. W. 2000. Testing statistical significance testing: some observations of an agnostic. Educational and Psychological Measurement 60:685-690.

35. Wainer, H. 1999. One cheer for null hypothesis significance testing. Psychological Methods 4:212-213.

36. Whitmore, G. A., and E. Xekalaki. 1990. P-values as measures of predictive validity. Biometrical Journal 32:977-983.