When to Control for Covariates? Panel-Asymptotic Results for Estimates of Treatment Effects
Joshua D. Angrist, Jinyong Hahn
NBER Technical Working Paper No. 241
The problem of how to control for covariates is endemic in evaluation research. Covariate-matching provides an appealing control strategy, but with continuous or high-dimensional covariate vectors, exact matching may be impossible or involve small cells. Matching observations that have the same propensity score produces unbiased estimates of causal effects whenever covariate-matching does, and also has an attractive dimension-reducing property. On the other hand, conventional asymptotic arguments show that covariate-matching is (asymptotically) more efficient than propensity-score-matching. This is because the usual asymptotic sequence has cell sizes growing to infinity, with no benefit from reducing the number of cells. Here, we approximate the large-sample behavior of different matching estimators using a panel-style asymptotic sequence with fixed cell sizes and the number of cells increasing to infinity. Exact calculations in simple examples and Monte Carlo evidence suggest this generates a substantially improved approximation to actual finite-sample distributions. Under this sequence, propensity-score-matching is most likely to dominate exact matching when cell sizes are small, the explanatory power of the covariates conditional on the propensity score is low, and/or the probability of treatment is close to zero or one. Finally, we introduce a random-effects type combination estimator that provides finite-sample efficiency gains over both covariate-matching and propensity-score-matching.
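The comparison between the two matching strategies can be illustrated with a small simulation. The sketch below is not the authors' Monte Carlo design; it is a minimal illustration under assumptions introduced here: many covariate cells of a fixed small size, a propensity score that takes only two values (0.3 and 0.7), a constant treatment effect `tau`, Gaussian noise, and a cell-level effect scaled by a hypothetical parameter `beta` that governs how much the covariate explains the outcome once the propensity score is held fixed. All function and parameter names are illustrative.

```python
import random
from statistics import mean, pvariance


def simulate(num_cells, cell_size, tau=1.0, beta=0.0, rng=None):
    """Draw one sample: rows of (cell, pscore, treated, outcome)."""
    if rng is None:
        rng = random.Random(0)
    rows = []
    for cell in range(num_cells):
        # Assumption: the propensity score takes two values, so many
        # covariate cells share the same score.
        pscore = 0.3 if cell % 2 == 0 else 0.7
        # Cell-level outcome shift; beta = 0 means the covariate has no
        # explanatory power once the propensity score is known.
        cell_effect = beta * rng.gauss(0, 1)
        for _ in range(cell_size):
            d = 1 if rng.random() < pscore else 0
            y = tau * d + cell_effect + rng.gauss(0, 1)
            rows.append((cell, pscore, d, y))
    return rows


def matching_estimate(rows, key):
    """Size-weighted average of treated-minus-control mean differences
    within groups defined by `key`; groups lacking either arm are dropped."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    total, weight = 0.0, 0
    for members in groups.values():
        treated = [y for (_, _, d, y) in members if d == 1]
        control = [y for (_, _, d, y) in members if d == 0]
        if treated and control:
            total += len(members) * (mean(treated) - mean(control))
            weight += len(members)
    return total / weight if weight else float("nan")


def monte_carlo(reps, num_cells, cell_size, beta=0.0, seed=0):
    """Return (mean, variance) of each estimator across replications."""
    rng = random.Random(seed)
    by_cell, by_pscore = [], []
    for _ in range(reps):
        rows = simulate(num_cells, cell_size, beta=beta, rng=rng)
        by_cell.append(matching_estimate(rows, key=lambda r: r[0]))
        by_pscore.append(matching_estimate(rows, key=lambda r: r[1]))
    return {
        "covariate": (mean(by_cell), pvariance(by_cell)),
        "pscore": (mean(by_pscore), pvariance(by_pscore)),
    }


if __name__ == "__main__":
    # Many small cells, covariate uninformative given the propensity score.
    for name, (avg, var) in monte_carlo(500, 200, 2).items():
        print(f"{name:>9}: mean={avg:.3f} variance={var:.4f}")
```

With many cells of size two and `beta = 0`, exact covariate matching discards every cell that lacks either a treated or a control observation, so in this toy design matching on the propensity score should show the smaller sampling variance. Raising `beta` or `cell_size` is one way to explore when that ordering weakens.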