How Much Should We Trust Differences-in-Differences Estimates?
Most Difference-in-Difference (DD) papers rely on many years of data and focus on serially correlated outcomes. Yet almost all these papers ignore the bias in the estimated standard errors that serial correlation introduces. This is especially troubling because the independent variable of interest in DD estimation (e.g., the passage of a law) is itself highly serially correlated, which exacerbates the bias in standard errors. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on female wages from the Current Population Survey. For each law, we use OLS to compute the DD estimate of its 'effect' as well as the standard error for this estimate. The standard errors are severely biased: with about 20 years of data, DD estimation finds an 'effect' significant at the 5% level for up to 45% of the placebo laws. Two very simple techniques can solve this problem for large sample sizes. The first technique consists of collapsing the data and ignoring the time-series variation altogether; the second technique is to estimate standard errors while allowing for an arbitrary covariance structure between time periods. We also suggest a third technique, based on randomization inference testing methods, which works well irrespective of sample size. This technique uses the empirical distribution of estimated effects for placebo laws to form the test distribution.
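The placebo-law exercise described above can be illustrated with a small simulation. The sketch below is not the authors' code and does not use CPS data: it assumes a synthetic balanced panel of 50 states and 20 years whose outcome follows an AR(1) process with autocorrelation 0.8, and it assigns a placebo law to a random half of the states starting in a random year. The function names (`two_way_demean`, `simulate_rejection_rates`) and all parameter values are hypothetical choices made for illustration. It compares the rejection rate under conventional OLS standard errors with the rate under standard errors clustered by state, one common way of allowing an arbitrary covariance structure between time periods.

```python
import numpy as np


def two_way_demean(x):
    """Remove state and year means from a balanced (states x years) panel."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


def simulate_rejection_rates(n_states=50, n_years=20, rho=0.8, n_sims=300, seed=0):
    """Share of placebo laws found 'significant' at the 5% level.

    Returns (rate with conventional OLS SEs, rate with state-clustered SEs).
    All settings are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    reject_ols = 0
    reject_cluster = 0
    for _ in range(n_sims):
        # Serially correlated outcome with no true treatment effect (AR(1) by state).
        shocks = rng.standard_normal((n_states, n_years))
        y = np.empty_like(shocks)
        y[:, 0] = shocks[:, 0] / np.sqrt(1.0 - rho**2)
        for t in range(1, n_years):
            y[:, t] = rho * y[:, t - 1] + shocks[:, t]

        # Placebo law: a random half of the states, switched on from a random year.
        d = np.zeros((n_states, n_years))
        treated = rng.permutation(n_states)[: n_states // 2]
        start = rng.integers(5, 15)
        d[treated, start:] = 1.0

        # DD estimate with state and year fixed effects via two-way demeaning.
        y_t = two_way_demean(y)
        d_t = two_way_demean(d)
        sxx = (d_t**2).sum()
        beta = (d_t * y_t).sum() / sxx
        resid = y_t - beta * d_t

        # Conventional OLS standard error (treats all state-year errors as i.i.d.).
        dof = n_states * n_years - n_states - n_years
        se_ols = np.sqrt((resid**2).sum() / dof / sxx)

        # Cluster-robust standard error: sum the scores within each state first,
        # which leaves the within-state covariance across years unrestricted.
        scores = (d_t * resid).sum(axis=1)
        correction = n_states / (n_states - 1)
        se_cluster = np.sqrt(correction * (scores**2).sum()) / sxx

        reject_ols += abs(beta / se_ols) > 1.96
        reject_cluster += abs(beta / se_cluster) > 1.96
    return reject_ols / n_sims, reject_cluster / n_sims


if __name__ == "__main__":
    rate_ols, rate_cluster = simulate_rejection_rates()
    print(f"conventional SEs: {rate_ols:.2f}, clustered SEs: {rate_cluster:.2f}")
```

Under these assumptions the conventional rate should come out well above the nominal 5%, while the clustered rate should sit much closer to it. The exact figures depend on the assumed autocorrelation, panel dimensions, and random seed, and are not comparable to the paper's CPS-based results.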
Published: Marianne Bertrand & Esther Duflo & Sendhil Mullainathan, 2004. "How Much Should We Trust Differences-in-Differences Estimates?," The Quarterly Journal of Economics, MIT Press, vol. 119(1), pages 249-275, February.