Endogenous Stratification in Randomized Experiments
Researchers and policy makers are often interested in estimating how treatments or policy interventions affect the outcomes of those most in need of help. This concern has motivated the increasingly common practice of disaggregating experimental data by groups constructed on the basis of an index of baseline characteristics that predicts the values that individual outcomes would take on in the absence of the treatment. This article shows that substantial biases may arise in practice if the index is estimated, as is often the case, by regressing the outcome variable on baseline characteristics for the full sample of experimental controls. We analyze the behavior of leave-one-out and repeated split sample estimators and show they behave well in realistic scenarios, correcting the large bias problem of the full sample estimator. We use data from the National JTPA Study and the Tennessee STAR experiment to demonstrate the performance of alternative estimators and the magnitude of their biases.
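The leave-one-out idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the index is fit by ordinary least squares on the controls, that strata are quantile groups (terciles by default) of the predicted outcome pooled over all units, and that within-stratum effects are simple differences in means. The function names `loo_index` and `effects_by_stratum` are hypothetical. Each control's prediction comes from a fit that excludes that control's own outcome, which is the step that removes the mechanical link between a control's outcome and its assigned stratum.

```python
import numpy as np

def loo_index(X, y, treated):
    """Predicted no-treatment outcome for every unit (illustrative sketch).

    Treated units are scored with the OLS fit on all controls; each control
    is scored with the fit that leaves that control out.
    """
    treated = np.asarray(treated, dtype=bool)
    y = np.asarray(y, dtype=float)
    Xd = np.column_stack([np.ones(len(y)), X])      # add an intercept
    Xc, yc = Xd[~treated], y[~treated]
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    beta = XtX_inv @ Xc.T @ yc                      # full-control OLS fit
    index = Xd @ beta
    # Leverage h_i = x_i' (Xc'Xc)^{-1} x_i for each control.
    h = np.einsum("ij,jk,ik->i", Xc, XtX_inv, Xc)
    resid = yc - Xc @ beta
    # Standard OLS identity: leave-one-out prediction = y_i - e_i / (1 - h_i).
    index[~treated] = yc - resid / (1.0 - h)
    return index

def effects_by_stratum(X, y, treated, n_groups=3):
    """Difference in mean outcomes (treated minus control) within each stratum.

    Assumption: strata are quantile groups of the pooled predicted outcome.
    """
    treated = np.asarray(treated, dtype=bool)
    y = np.asarray(y, dtype=float)
    index = loo_index(X, y, treated)
    cuts = np.quantile(index, np.linspace(0, 1, n_groups + 1)[1:-1])
    group = np.searchsorted(cuts, index)            # labels 0 .. n_groups-1
    return np.array([
        y[(group == g) & treated].mean() - y[(group == g) & ~treated].mean()
        for g in range(n_groups)
    ])
```

Calling `effects_by_stratum(X, y, treated)` with a covariate matrix, an outcome vector, and a boolean treatment indicator returns one estimate per predicted-outcome group, ordered from lowest to highest predicted outcome. The full-sample estimator criticized in the abstract would correspond to skipping the leave-one-out correction and scoring controls with `Xd @ beta` directly.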
This paper was revised on April 29, 2014.
Document Object Identifier (DOI): 10.3386/w19742