What Threshold Should be Applied to Tests of Factor Models?
Researchers generally acknowledge that statistical tests must be adjusted when hundreds of factors and trading strategies have been examined. But how should these adjustments be made? Existing methods are often misunderstood or misapplied. We show that proper inference requires accounting for dependence across tests, correctly specifying the null distribution, and mitigating sample-selection bias. We develop a simple framework that avoids assumptions about the total number of tests run and yields a lower bound on valid significance thresholds, implying that researchers should employ a t-statistic cutoff of at least 3.0. In addition, we advocate using the local False Discovery Rate, which provides the probability that the null hypothesis is true for a given test-statistic realization, information that a conventional p-value cannot supply.
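To illustrate the distinction between a p-value and the local False Discovery Rate, the following is a minimal sketch under a textbook two-group mixture model. It is not the paper's estimation procedure. The null share `pi0`, the standard normal null density, and the normal alternative with standard deviation `alt_sd` are all illustrative assumptions, as are the function names. Under these assumptions the local FDR at a realized statistic t is pi0 * f0(t) / f(t), where f is the mixture density of nulls and non-nulls.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(t, pi0=0.9, alt_sd=3.0):
    """Posterior probability that the null is true given the statistic t.

    Assumptions (hypothetical, for illustration only):
      - a share pi0 of tested factors are true nulls with t ~ N(0, 1)
      - the remaining factors have t ~ N(0, alt_sd**2)
    """
    f0 = normal_pdf(t)                    # null density at t
    f1 = normal_pdf(t, 0.0, alt_sd)       # assumed alternative density at t
    mixture = pi0 * f0 + (1.0 - pi0) * f1
    return pi0 * f0 / mixture


def two_sided_p_value(t):
    """Two-sided p-value of t under a standard normal null."""
    return math.erfc(abs(t) / math.sqrt(2.0))


if __name__ == "__main__":
    for t in (2.0, 3.0, 4.0):
        print(f"t={t:.1f}  p-value={two_sided_p_value(t):.4f}  "
              f"local fdr={local_fdr(t):.4f}")
```

With these assumed inputs, the local FDR at t = 2 is far larger than the corresponding p-value, because the p-value measures tail probability under the null alone while the local FDR also weighs how common true nulls are and how the non-null statistics are distributed. Different choices of `pi0` and `alt_sd` would change the numbers.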
Campbell R. Harvey, Alessio Sancetta, and Yuqian Zhao, "What Threshold Should be Applied to Tests of Factor Models?," NBER Working Paper 34898 (2026), https://doi.org/10.3386/w34898.