Reverse-Bayes analysis of two common misinterpretations of significance tests

Held, Leonhard (2013). Reverse-Bayes analysis of two common misinterpretations of significance tests. Clinical Trials, 10(2):236-242.

Abstract

BACKGROUND: Misunderstanding of significance tests and P values is widespread in clinical research and elsewhere.
PURPOSE: To assess the implications of two common mistakes in the interpretation of statistical significance tests. The first one is the misinterpretation of the type I error rate as the expected proportion of false-positive results among all those called significant, also known as the false-positive report probability (FPRP). The second is the misinterpretation of a P value as (posterior) probability of the null hypothesis.
METHODS: A reverse-Bayes approach is used to calculate a lower bound on the proportion of truly effective treatments that would ensure that the FPRP is equal to or below the type I error rate. A second reverse-Bayes approach, using minimum Bayes factors (BFs), yields upper bounds on the prior probability of the null hypothesis that would justify interpreting the P value as the posterior probability of the null hypothesis.
RESULTS: In a typical clinical trials setting, more than 50% of the treatments need to be truly effective to justify equality of the type I error rate and the FPRP. To interpret the P value as posterior probability, the difference between the corresponding prior probability and the P value cannot exceed 12.4 percentage points.
LIMITATIONS: The first analysis requires that the (one-sided) type I error rate is smaller than the type II error rate. The second result is valid under different scenarios describing how to transform P values to minimum BFs.
CONCLUSIONS: The two misinterpretations imply strong and often unrealistic assumptions on the prior proportion or probability of truly effective treatments.
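
The two reverse-Bayes calculations described in METHODS can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, the one-sided type I error rate of 0.025 and the powers of 80% and 90% are assumed values for a "typical" trial, and the minimum Bayes factor -e * p * log(p) is one well-known calibration (valid for p < 1/e), chosen here for illustration and not necessarily among the scenarios the paper considers.

```python
import math


def fprp(alpha: float, power: float, prior_effective: float) -> float:
    """False-positive report probability: P(no effect | significant result)."""
    false_pos = alpha * (1.0 - prior_effective)  # ineffective and significant
    true_pos = power * prior_effective           # effective and significant
    return false_pos / (false_pos + true_pos)


def min_prior_effective(alpha: float, power: float) -> float:
    """Smallest proportion of truly effective treatments with FPRP <= alpha.

    Solving fprp(alpha, power, pi) <= alpha for pi gives
    pi >= (1 - alpha) / ((1 - alpha) + power).
    """
    return (1.0 - alpha) / ((1.0 - alpha) + power)


def max_prior_null(p: float) -> float:
    """Upper bound on the prior P(H0) for which the posterior P(H0) can equal p.

    Assumption: minimum Bayes factor -e * p * log(p), valid only for p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("this calibration requires 0 < p < 1/e")
    min_bf = -math.e * p * math.log(p)
    # posterior odds = BF * prior odds and BF >= min_bf,
    # hence prior odds <= posterior odds / min_bf.
    prior_odds = (p / (1.0 - p)) / min_bf
    return prior_odds / (1.0 + prior_odds)


if __name__ == "__main__":
    alpha = 0.025  # assumed one-sided type I error rate
    for power in (0.8, 0.9):  # assumed power, i.e. type II error 0.2 or 0.1
        bound = min_prior_effective(alpha, power)
        print(f"power={power}: proportion effective must be >= {bound:.3f}")
    for p in (0.05, 0.01):
        print(f"p={p}: prior P(H0) must be <= {max_prior_null(p):.3f}")
```

Under these assumed inputs the first bound is about 0.55 for 80% power and 0.52 for 90% power. Whenever the type I error rate is smaller than the type II error rate, 1 - alpha exceeds the power, so the bound lies above 0.5, in line with the statement in RESULTS.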
