Endogenous Stratification in Randomized Experiments
Researchers and policy makers are often interested in estimating how treatments or policy interventions affect the outcomes of those most in need of help. This concern has motivated the increasingly common practice of disaggregating experimental data by groups constructed on the basis of an index of baseline characteristics that predicts the values that individual outcomes would take on in the absence of the treatment. This article shows that substantial biases may arise in practice if the index is estimated, as is often the case, by regressing the outcome variable on baseline characteristics for the full sample of experimental controls. We analyze the behavior of leave-one-out and repeated split sample estimators and show they behave well in realistic scenarios, correcting the large bias problem of the full sample estimator. We use data from the National JTPA Study and the Tennessee STAR experiment to demonstrate the performance of alternative estimators and the magnitude of their biases.
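As an illustration of the procedure described above, the following is a minimal Python sketch, not the authors' implementation (their Stata package is estrat). It assumes an OLS index fitted on the experimental controls, groups defined by quantiles of the predicted outcome, and a leave-one-out variant in which each control's prediction comes from coefficients estimated without that control. The function names, the tercile default, and the quantile cutoffs are assumptions made for this example.

```python
import numpy as np


def fit_ols(X, y):
    """OLS coefficients; X is expected to already contain an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def loo_predictions(X, y):
    """Leave-one-out fitted values for every row of X.

    Uses the standard identity y_i - yhat_(-i) = e_i / (1 - h_i),
    where e_i is the full-sample residual and h_i the leverage.
    """
    beta = fit_ols(X, y)
    resid = y - X @ beta
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    return y - resid / (1.0 - leverage)


def stratified_effects(y, w, X, leave_one_out=True, n_groups=3):
    """Treated-minus-control mean differences within predicted-outcome groups.

    y: outcomes, w: treatment indicator (1 = treated, 0 = control),
    X: baseline covariates including an intercept column.
    Assumption: group cutoffs are quantiles of the pooled index.
    """
    controls = w == 0
    Xc, yc = X[controls], y[controls]

    # Index: predicted outcome in the absence of treatment, fitted on controls.
    index = X @ fit_ols(Xc, yc)
    if leave_one_out:
        # Controls get predictions that exclude their own outcome;
        # treated units keep the full control-sample fit.
        index[controls] = loo_predictions(Xc, yc)

    cuts = np.quantile(index, np.linspace(0, 1, n_groups + 1)[1:-1])
    group = np.searchsorted(cuts, index, side="right")

    return np.array([
        y[(group == g) & (w == 1)].mean() - y[(group == g) & (w == 0)].mean()
        for g in range(n_groups)
    ])
```

Calling `stratified_effects(y, w, X, leave_one_out=False)` gives the full-sample version of this sketch, so the two variants can be compared on simulated data in which the true treatment effect is zero.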
We thank Beth Akers, Josh Angrist, Matias Cattaneo, Gary Chamberlain, David Deming, Sara Goldrick-Rab, Josh Goodman, Jerry Hausman, Guido Imbens, Max Kasy, Larry Katz, Amanda Pallais, Paul Peterson, Russ Whitehurst, and seminar participants at Harvard/MIT for helpful comments and discussions, and Jeremy Ferwerda for developing estrat (available at SSC), a Stata package that calculates the leave-one-out and repeated split sample endogenous stratification estimators considered in this study. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
Alberto Abadie & Matthew M. Chingos & Martin R. West, 2018. "Endogenous Stratification in Randomized Experiments," The Review of Economics and Statistics, vol. 100(4), pages 567-580.