Administrative Data Linking and Statistical Power Problems in Randomized Experiments

Sarah Tahamont, Zubin Jelveh, Aaron Chalfin, Shi Yan, Benjamin Hansen

NBER Working Paper No. 25657
Issued in March 2019
NBER Program(s): Technical Working Papers

The increasing availability of administrative data has led to a particularly exciting innovation in public policy research, that of the “low-cost” randomized trial in which administrative data are used to measure outcomes in lieu of costly primary data collection. Linking data from an experimental intervention to administrative records that track outcomes of interest typically requires matching datasets without a common unique identifier. To minimize mistaken linkages, researchers will often use “exact matching” (retaining an individual only if all their demographic variables match exactly in two or more datasets) so that speculative matches do not lead to errors in an analytic dataset. We argue that when this seemingly conservative approach is used to detect the presence of a binary outcome, it leads to attenuated estimates of treatment effects and, critically, to underpowered experiments. For marginally powered studies, which are common in empirical social science, exact matching is particularly problematic. In this paper, we derive an analytic result for the consequences of linking errors on statistical power and show how the problem varies across different combinations of relevant inputs, including the matching error rate, the outcome density and the sample size. We conclude on an optimistic note by showing that machine learning-based probabilistic matching algorithms allow researchers to recover a considerable share of the statistical power that is lost to errors in data linking.
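
The attenuation argument can be illustrated with a minimal sketch. This is not the paper's derivation, only a simplified setting chosen for illustration: it assumes that exact matching misses a fraction `miss_rate` of true outcome records, that misses are independent of treatment status, and that no false links are created. Under those assumptions the observed outcome rate in each arm is the true rate scaled by `1 - miss_rate`, so the difference between arms shrinks by the same factor. Power is approximated with a standard two-sided two-proportion z-test. All function names, parameters, and numeric inputs below are hypothetical.

```python
import math
from statistics import NormalDist


def observed_rates(p_control, p_treat, miss_rate):
    """Outcome rates seen after linking, assuming a fraction `miss_rate`
    of true outcomes goes unmatched in both arms alike (an assumption)."""
    keep = 1.0 - miss_rate
    return p_control * keep, p_treat * keep


def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation, unpooled standard error)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    diff = abs(p_treat - p_control)
    se = math.sqrt(
        p_control * (1.0 - p_control) / n_per_arm
        + p_treat * (1.0 - p_treat) / n_per_arm
    )
    z = diff / se
    # Probability of rejecting in either tail.
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs: true outcome rates and arm size.
    p_c, p_t, n = 0.30, 0.24, 1000
    for miss in (0.0, 0.1, 0.2, 0.3):
        obs_c, obs_t = observed_rates(p_c, p_t, miss)
        pw = power_two_proportions(obs_c, obs_t, n)
        print(f"miss_rate={miss:.1f}  observed effect={obs_t - obs_c:+.3f}  power={pw:.3f}")
```

With these illustrative inputs, both the observed effect and the approximate power fall as the miss rate rises, which is the qualitative pattern the abstract describes. Linking errors that differ by treatment arm, or false positive links, would need a richer model than this sketch.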

Document Object Identifier (DOI): 10.3386/w25657