Linear Regressions with Combined Data

Xavier D'Haultfoeuille; Christophe Gaillac; Arnaud Maurel

doi:10.3386/w34507

Working Paper 34507

Issue Date November 2025

We study linear regressions in a context where the outcome of interest and some of the covariates are observed in two different datasets that cannot be matched. Traditional approaches obtain point identification by relying, often implicitly, on exclusion restrictions. We show that without such restrictions, coefficients of interest can still be partially identified, with the sharp bounds taking a simple form. We obtain tighter bounds when variables observed in both datasets, but not included in the regression of interest, are available, even if these variables are not subject to specific restrictions. We develop computationally simple and asymptotically normal estimators of the bounds. Finally, we apply our methodology to estimate racial disparities in patent approval rates and to evaluate the effect of patience and risk-taking on educational performance.

First Version: December 6, 2024. We thank Pat Bayer, Christian Bontemps, Stephen Hansen, Marc Henry, Toru Kitagawa, Matt Masten, David Pacini, Daniel Wilhelm, and seminar and conference participants at Encounters in Econometric Theory 2024, the Munich Econometrics Workshop 2024, 2024 ESEM (Rotterdam), LMU, Penn State University, the Aarhus Workshop in Econometrics V, the 35th (EC)² conference in Amsterdam, Toulouse School of Economics, University of Geneva (Statistics), University of Glasgow, University of Gothenburg, the Workshop on Optimal Transport in Econometrics (Collegio Carlo Alberto, 2025), and the 2025 IAAE Annual Conference. We thank Lavinia Kinne and Ludger Woessmann, who kindly shared the PISA sample used in our paper. We also thank Yizhi Su and Haonan Ye for capable research assistance. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.

Xavier D'Haultfoeuille, Christophe Gaillac, and Arnaud Maurel, "Linear Regressions with Combined Data," NBER Working Paper 34507 (2025), https://doi.org/10.3386/w34507.

- Companion R Package (RegCombinBLP) is available at

Linear Regressions with Combined Data
