NATIONAL BUREAU OF ECONOMIC RESEARCH

Linking Individuals Across Historical Sources: A Fully Automated Approach

Ran Abramitzky, Roy Mill, Santiago Pérez

NBER Working Paper No. 24324
Issued in February 2018
NBER Program(s): Aging, Development of the American Economy, Labor Studies

Linking individuals across historical datasets relies on information such as name and age that is both non-unique and prone to enumeration and transcription errors. These errors make it impossible to find the correct match with certainty. We suggest a fully automated method for linking historical datasets that enables researchers to create samples that minimize type I errors (false positives) and type II errors (false negatives). The first step of the method uses the Expectation-Maximization (EM) algorithm, a standard tool in statistics, to compute the probability that any two observations correspond to the same individual. The second step uses these estimated probabilities to determine which records to use in the analysis. We provide code to implement this method.
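
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and their specification may differ. It assumes each candidate pair of records is summarized by a tuple of 0/1 agreement indicators (for example, whether the names agree and whether the ages agree), that the pairs come from a two-component mixture of matches and non-matches, and that the indicators are independent within each component. The function names `em_match_probabilities` and `select_links`, the starting values, and the thresholds `min_best` and `max_second` are all hypothetical choices made for illustration.

```python
def _clip(x, eps=1e-6):
    # Keep probabilities strictly inside (0, 1) to avoid degenerate estimates.
    return min(max(x, eps), 1.0 - eps)


def em_match_probabilities(patterns, n_iter=200, tol=1e-8):
    """Step 1 (sketch): EM for a two-component mixture over agreement patterns.

    patterns: list of equal-length tuples of 0/1 agreement indicators,
              one tuple per candidate pair of records.
    Returns (posteriors, p, m, u), where posteriors[i] is the estimated
    probability that pair i is a true match, p is the estimated share of
    matches, and m[j] / u[j] are P(agree on field j | match / non-match).
    """
    n, k = len(patterns), len(patterns[0])
    p = 0.1              # assumed starting share of true matches
    m = [0.9] * k        # assumed starting P(agree | match)
    u = [0.1] * k        # assumed starting P(agree | non-match)
    post = [0.0] * n
    for _ in range(n_iter):
        # E-step: posterior match probability for each pair.
        for i, g in enumerate(patterns):
            like_m, like_u = p, 1.0 - p
            for j, agree in enumerate(g):
                like_m *= m[j] if agree else 1.0 - m[j]
                like_u *= u[j] if agree else 1.0 - u[j]
            post[i] = like_m / (like_m + like_u)
        # M-step: re-estimate parameters from posterior-weighted counts.
        w_match = max(sum(post), 1e-12)
        w_non = max(n - sum(post), 1e-12)
        new_p = _clip(w_match / n)
        m = [_clip(sum(w * g[j] for w, g in zip(post, patterns)) / w_match)
             for j in range(k)]
        u = [_clip(sum((1.0 - w) * g[j] for w, g in zip(post, patterns)) / w_non)
             for j in range(k)]
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    return post, p, m, u


def select_links(candidates, min_best=0.9, max_second=0.5):
    """Step 2 (sketch): keep a link only when it is probable and unambiguous.

    candidates: dict mapping a record id to a list of
                (candidate_id, match_probability) tuples.
    A record is linked to its best candidate if that probability is at
    least min_best and the runner-up's probability is at most max_second.
    """
    links = {}
    for record_id, scored in candidates.items():
        if not scored:
            continue
        ranked = sorted(scored, key=lambda t: t[1], reverse=True)
        best_id, best_p = ranked[0]
        second_p = ranked[1][1] if len(ranked) > 1 else 0.0
        if best_p >= min_best and second_p <= max_second:
            links[record_id] = best_id
    return links
```

In this sketch, raising `min_best` or lowering `max_second` trades more false negatives for fewer false positives, which is one way to think about the balance between type I and type II errors that the abstract describes.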

Document Object Identifier (DOI): 10.3386/w24324

National Bureau of Economic Research, 1050 Massachusetts Ave., Cambridge, MA 02138; 617-868-3900; email: info@nber.org
