data download

Data from
"Causal Effects in Non-Experimental Studies: Reevaluating the Evaluation of Training Programs," Journal of the American Statistical Association, Vol. 94, No. 448 (December 1999), pp. 1053-1062.

and

"Propensity Score Matching Methods for Non-Experimental Causal Studies," Review of Economics and Statistics, Vol. 84, (February 2002), pp. 151-161.

The data are drawn from a paper by Robert Lalonde, "Evaluating the Econometric Evaluations of Training Programs," American Economic Review, Vol. 76, pp. 604-620. We are grateful to him for allowing us to use these data, for assistance in reading his original data tapes, and for permission to publish them here.
 

NSW Data Files (Lalonde Sample)
These files contain the treated and control units from the male sub-sample from the National Supported Work Demonstration as used by Lalonde in his paper.

These are text files. The order of the variables from left to right is: treatment indicator (1 if treated, 0 if not treated), age, education, Black (1 if black, 0 otherwise), Hispanic (1 if Hispanic, 0 otherwise), married (1 if married, 0 otherwise), nodegree (1 if no degree, 0 otherwise), RE75 (earnings in 1975), and RE78 (earnings in 1978). The last variable is the outcome; other variables are pre-treatment.

  • nsw.dta NSW treated and control observations in Stata format
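As a minimal sketch of how the nine-column layout described above could be read from a whitespace-delimited text file, the following Python parser maps each row to named fields. The lowercase column labels, the function name parse_rows, and the two sample rows are assumptions made for illustration; the sample rows are made up and are not values from the data.

```python
# Column order for the Lalonde-sample text files, as listed above.
# The lowercase labels are chosen for this sketch, not taken from the files.
LALONDE_COLUMNS = [
    "treat", "age", "education", "black", "hispanic",
    "married", "nodegree", "re75", "re78",
]


def parse_rows(text, columns):
    """Parse whitespace-delimited numeric rows into a list of dicts."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(columns):
            raise ValueError(
                f"line {line_no}: expected {len(columns)} fields, "
                f"got {len(fields)}"
            )
        rows.append(dict(zip(columns, map(float, fields))))
    return rows


# Made-up rows for illustration only; NOT values from the NSW data.
SAMPLE_TEXT = """\
1 25 10 1 0 0 1 0.0 5000.0
0 30 12 0 1 1 0 1500.0 0.0
"""

rows = parse_rows(SAMPLE_TEXT, LALONDE_COLUMNS)
print(rows[0]["treat"], rows[0]["re78"])
```

Raising an error on a wrong field count is a deliberate choice here: it catches the case where a ten-column file (one that includes RE74) is read with the nine-column layout.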


NSW Data Files (Dehejia-Wahba Sample)
Based on pre-intervention variables, we extract a further subset of Lalonde's NSW experimental data, a subset containing information on RE74 (earnings in 1974):

The variables from left to right are: treatment indicator (1 if treated, 0 if not treated), age, education, Black (1 if black, 0 otherwise), Hispanic (1 if Hispanic, 0 otherwise), married (1 if married, 0 otherwise), nodegree (1 if no degree, 0 otherwise), RE74 (earnings in 1974), RE75 (earnings in 1975), and RE78 (earnings in 1978).

  • nsw_dw.dta NSW treated and control observations (Dehejia-Wahba Sample) in Stata format
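Because this sample is experimental and includes RE74, one simple use is to compare group means: the treated-minus-control difference in RE78 as a raw outcome comparison, and the same difference in RE74 as a rough pre-treatment balance check. The sketch below assumes the rows have already been loaded as dictionaries; the key names, the function name group_mean_difference, and the four rows are hypothetical, and the numbers are made up rather than taken from the data.

```python
from statistics import mean


def group_mean_difference(rows, column, treatment="treat"):
    """Return mean(column | treated) - mean(column | control)."""
    treated = [r[column] for r in rows if r[treatment] == 1]
    control = [r[column] for r in rows if r[treatment] == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control row")
    return mean(treated) - mean(control)


# Made-up rows for illustration only; NOT values from the NSW data.
rows = [
    {"treat": 1, "re74": 0.0, "re75": 0.0, "re78": 6000.0},
    {"treat": 1, "re74": 1000.0, "re75": 500.0, "re78": 4000.0},
    {"treat": 0, "re74": 0.0, "re75": 0.0, "re78": 3000.0},
    {"treat": 0, "re74": 2000.0, "re75": 1000.0, "re78": 5000.0},
]

outcome_diff = group_mean_difference(rows, "re78")  # outcome comparison
balance_diff = group_mean_difference(rows, "re74")  # pre-treatment check
print(outcome_diff, balance_diff)
```

This is only a descriptive comparison of means; it is not the estimation procedure used in the papers cited above.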

 

PSID and CPS Data Files
These six files contain the non-experimental comparison groups constructed by Lalonde from the Panel Study of Income Dynamics (PSID) and the Current Population Survey (CPS), and the further subsets he created from the two basic comparison groups. CPS2 and CPS3 are very similar to, but not exactly the same as, Lalonde's subsets; for CPS, we were unable to re-create his subsets exactly.

The variables from left to right are: treatment indicator (1 if treated, 0 if not treated), age, education, Black (1 if black, 0 otherwise), Hispanic (1 if Hispanic, 0 otherwise), married (1 if married, 0 otherwise), nodegree (1 if no degree, 0 otherwise), RE74 (earnings in 1974), RE75 (earnings in 1975), and RE78 (earnings in 1978).
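Since the comparison files share the ten-column layout of the NSW files that include RE74, a non-experimental sample can be assembled by stacking the NSW treated units on top of one comparison group. The sketch below shows one way this might be done, assuming both files have already been loaded as lists of dictionaries; the function name, the key names, and all rows are hypothetical, with made-up numbers.

```python
def build_nonexperimental_sample(nsw_rows, comparison_rows, treatment="treat"):
    """Stack NSW treated units with non-experimental comparison units.

    NSW experimental controls are dropped, so the comparison group
    takes their place as the untreated units.
    """
    treated = [r for r in nsw_rows if r[treatment] == 1]
    comparison = [r for r in comparison_rows if r[treatment] == 0]
    return treated + comparison


# Made-up rows for illustration only; NOT values from the data files.
nsw_rows = [
    {"treat": 1, "age": 25, "re74": 0.0, "re75": 0.0, "re78": 6000.0},
    {"treat": 1, "age": 31, "re74": 500.0, "re75": 0.0, "re78": 4000.0},
    {"treat": 0, "age": 28, "re74": 0.0, "re75": 0.0, "re78": 3000.0},
]
comparison_rows = [
    {"treat": 0, "age": 40, "re74": 20000.0, "re75": 21000.0, "re78": 22000.0},
    {"treat": 0, "age": 35, "re74": 15000.0, "re75": 15500.0, "re78": 16000.0},
]

sample = build_nonexperimental_sample(nsw_rows, comparison_rows)
print(len(sample), sum(r["treat"] for r in sample))
```

Filtering the comparison rows on a treatment indicator of 0 is a defensive check in this sketch; it guards against accidentally passing in a file that also contains treated units.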