Ancestry.com and IPUMS Complete Count Restricted File

Ancestry.com has sponsored the digitization of the available complete count census files and allowed IPUMS to offer all but the respondent names on its website to the general user population.

Requesting Access

The Restricted Full Count IPUMS Data is available to affiliated researchers at the NBER by special arrangement through IPUMS. Affiliates may sponsor dissertation projects for students under their supervision. NBER affiliates wishing to use the IPUMS-RESTRICTED (Ancestry.com) census files for a project can apply by completing the following.

Application
NDA Agreement for each team member, if not on file yet
CITI training, if not on file yet
Send the documents above to our Restricted Data Team. Please include the following information for each team member in the email:
- Last Name:
- First Name:
- Affiliation:
- Email:
- Cell phone number for two-factor authentication:
- MyNBER Username: If you don't have one, one will be created for you with the format firstname_lastname.

Once approved and assigned a project number by IPUMS, the application will be forwarded to the NBER IRB for review. The IRB may follow up with additional questions.

Once all approvals are in place, an NBER team member will contact you with instructions on how to access the data.

Approvals are granted on a per-project basis. New research ideas must be submitted separately.

Adding Team Members

To add an investigator or RA to an existing project, send a signed agreement form marked with the project title and IPUMS number. The approval process is the same, and you will be notified when the new researcher can access the data.

NDA Agreement for each team member, if not on file yet
CITI training, if not on file yet
Send the documents above to our Restricted Data Team. Please include the following information for each team member in the email:
- Last Name:
- First Name:
- Affiliation:
- Email:
- Cell phone number for two-factor authentication:
- MyNBER Username: If you don't have one, one will be created for you with the format firstname_lastname.

Confidentiality Considerations

Each project will be assigned a shared work folder and Unix group to facilitate sharing between members. Please keep any code and files within these folders to ensure they are protected. It is important to respect the agreement to ensure continued access to this important resource, for you and your colleagues.

Census files (or extracts) may be processed on our servers but should not be downloaded from them.

Linking between the restricted and public versions is fine, but the data must be maintained/analyzed within the NBER computing environment.

Citation

Publications and research reports based on the IPUMS USA database must cite it appropriately. The citation should include the following:

Steven Ruggles, Matt A. Nelson, Matthew Sobek, Catherine A. Fitch, Ronald Goeken, J. David Hacker, Evan Roberts, and J. Robert Warren. IPUMS Ancestry Full Count Restricted Data: Version 4.0R [dataset]. Minneapolis, MN: IPUMS, 2024. https://doi.org/10.18128/D014.V4.0R

Documentation

The IPUMS website covers all the publicly available variables. The additional variables in the restricted-use file include:
- namefrst: 16 character first name (and possibly middle initial)
- namelast: 16 character last name
- histid: 36 character person id for matching across IPUMS versions (but not census decades)
- street: street address

File Structure

The original files are hierarchical, but we have created the .dta and other processed files as rectangular person datasets. That is, the household record is appended to each person record. We also apply the scaling factors in the IPUMS-supplied code, which should make the data conform to the documentation. Value labels that are merely the ASCII expression of the numeric value are dropped.
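To make the rectangular layout concrete, here is a minimal sketch of appending a household record to each person record that follows it. It is an illustration only, not the code used to build the NBER files: the real hierarchical files are not Python dictionaries, and the function name and sample values below are made up for the example.

```python
# Illustrative only: record layout, field values, and function name are assumptions.
def rectangularize(records):
    """Append the most recent household record to each person record after it."""
    rows = []
    household = None
    for rec in records:
        fields = {k: v for k, v in rec.items() if k != "rectype"}
        if rec["rectype"] == "H":
            # Start of a new household: remember its fields.
            household = fields
        elif rec["rectype"] == "P":
            if household is None:
                raise ValueError("person record appeared before any household record")
            # One output row per person, carrying the household fields.
            rows.append({**household, **fields})
    return rows


hierarchical = [
    {"rectype": "H", "serial": 1, "statefip": 25},
    {"rectype": "P", "pernum": 1, "age": 40},
    {"rectype": "P", "pernum": 2, "age": 38},
    {"rectype": "H", "serial": 2, "statefip": 36},
    {"rectype": "P", "pernum": 1, "age": 61},
]
rectangular = rectangularize(hierarchical)
for row in rectangular:
    print(row)
```

Each output row describes one person, so household-level values are repeated for every member of the household.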

New Folder/File Structure

In early 2026 the folder structure of the restricted data was reorganized to reflect the data's census year and version number, following IPUMS releases: /homes/data/census-ipums/YYYY/v#.# where YYYY is the year of the census (for example, 1950) and v#.# is the version number (for example, v2.2). There is also a latest folder that points to the most recent version and will be updated whenever a new version is added.
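As a rough illustration of the layout described above, the following sketch builds a directory path from a census year and version. The helper name census_dir is hypothetical, and the sketch assumes the latest folder sits inside each year's directory, which should be confirmed on the server before relying on it.

```python
from pathlib import PurePosixPath

# Root path as given in the text above.
BASE = PurePosixPath("/homes/data/census-ipums")


def census_dir(year, version="latest"):
    """Return the directory for a census year and version ('latest' or e.g. '2.2').

    Assumption: 'latest' is a folder under the year directory.
    """
    if not (isinstance(year, int) and 1000 <= year <= 9999):
        raise ValueError(f"year must be a four-digit integer, got {year!r}")
    folder = "latest" if version == "latest" else f"v{version}"
    return BASE / str(year) / folder


print(census_dir(1950, "2.2"))
print(census_dir(1950))
```

Pinning an explicit version in analysis code keeps results reproducible, since the contents of latest change whenever a new version is added.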

Additionally we now provide two versions of the processed output files, each containing different sets of variables: the primary file, and what we are calling the othervars file.

The primary file, named using just the census year (e.g., 1950.csv, 1950.dta, etc.), contains:
- all variables given descriptive names by IPUMS (e.g., statefip, age, sex, race, etc.)
- variables needed to merge this file with the othervars file
The othervars file (e.g., 1950_othervars.csv, 1950_othervars.dta) contains:
- variables not given a descriptive name by IPUMS (e.g., us1950b_0010, us1950b_1022, etc.)
- variables needed to merge this file with the primary file.

Per IPUMS, it is best practice to use the histid variable to perform merges because it does not change over time.
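A merge of the primary and othervars files on histid might look like the sketch below. It uses tiny in-memory stand-ins with made-up values and shortened placeholder ids (real histid values are 36 characters), and the function name merge_on_histid is hypothetical. It holds one file in memory, so it suits small extracts; for the full files a tool such as Stata or a data-frame library is likely the better choice.

```python
import csv
import io

# In-memory stand-ins for 1950.csv and 1950_othervars.csv (made-up values).
primary_csv = io.StringIO(
    "histid,statefip,age\n"
    "AAAA,25,40\n"
    "BBBB,36,61\n"
)
othervars_csv = io.StringIO(
    "histid,us1950b_0010\n"
    "BBBB,7\n"
    "AAAA,3\n"
)


def merge_on_histid(primary_file, othervars_file):
    """One-to-one join of two CSV streams on histid, keeping primary row order."""
    other = {}
    for row in csv.DictReader(othervars_file):
        if row["histid"] in other:
            raise ValueError(f"duplicate histid in othervars: {row['histid']}")
        other[row["histid"]] = row
    merged = []
    for row in csv.DictReader(primary_file):
        match = other.get(row["histid"])
        if match is None:
            raise KeyError(f"histid missing from othervars: {row['histid']}")
        # Combine the columns of both files into a single row.
        merged.append({**row, **match})
    return merged


merged = merge_on_histid(primary_csv, othervars_csv)
for row in merged:
    print(row)
```

Raising on duplicate or unmatched ids, rather than silently dropping rows, makes it obvious when the two files do not come from the same year and version.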

You can refer to the data dictionary files in the docs directory for the year and version you are using to determine the contents of these generically named variables.

Previous versions (pre-2025) of the census files can be found in the data_archive folder, which preserves the vYYYY format used in the past.

For detailed information on the folder and file reorganization, please refer to this document.

In /homes/data/census-ipums/research-projects you can find data and code of premade matches made available by researchers that might be useful for your project. There are considerable savings in time and resources in using a pre-made match. Please make sure to cite the authors if you use their work.

Multigenerational Longitudinal Panel: Linked Census Data. The IPUMS Multigenerational Longitudinal Panel (MLP) project links individuals' records between censuses from 1850 to 1950.
- Steven Ruggles, Matt A. Nelson, Matthew Sobek, Catherine A. Fitch, Ronald Goeken, J. David Hacker, Evan Roberts, and J. Robert Warren. IPUMS Ancestry Full Count Data: Version 4.0 [dataset]. Minneapolis, MN: IPUMS, 2024. doi:10.18128/D014.V4.0R
Census Linking Project: Princeton created a set of linked datasets between every historical Census pair using a variety of automated methods. Publications using data from the matches should cite the Census Linking Project as:
- Ran Abramitzky, Leah Boustan, Katherine Eriksson, Santiago Pérez and Myera Rashid. Census Linking Project: Version 3.0 [dataset]. 2025. https://censuslinkingproject.org
Census Tree: The Census Tree is the largest-ever database of record links among the historical U.S. censuses, with over 700 million links for people living in the United States between 1850 and 1940. The Census Tree includes 314 million census-to-census links for women and 41 million links for Black Americans. Please refer to https://www.censustree.org/overview for how to cite this data.
- Joseph Price, Kasey Buckles, Jacob Van Leeuwen, and Isaac Riley. “Combining Family History and Machine Learning to Link Historical Records: The Census Tree Data Set.” Explorations in Economic History, 80, 101391. 2021.
- Kasey Buckles, Adrian Haws, Joseph Price, and Haley Wilbert. “Breakthroughs in Historical Record Linking Using Genealogy Data: The Census Tree Project.” 2024. Prior version available as NBER Working Paper #31671.
Location Project: The Census Place Project provides crosswalks that link historical decennial census microdata from IPUMS (indexed by histid) with standardized locations (longitude/latitude pairs). For more details about the crosswalks, see https://ezrakarger.com/census_place_project.pdf.
If you have any questions or concerns, email nenckap@miamioh.edu or karger@uchicago.edu.
RepresentativeCensusLinks: Contains crosswalks to link census records for men and women in the United States from 1850 to 1950, constructed by combining historical census records with Social Security Number (SSN) application data. The dataset enables tracking individuals across multiple censuses despite name changes (e.g., due to marriage) and is particularly valuable for including women in the study of intergenerational mobility. If you use this data, please cite:
- Althoff, Lukas, Brookes Gray, Harriet, & Reichardt, Hugo (2024). America’s Rise in Human Capital Mobility. [Working Paper](https://lukasalthoff.github.io/pdf/igm_mothers.pdf).

Please reach out to the Data Team if you wish to make your code or data available to other approved researchers, or for replication, in the research-projects folder.

Exporting Data

It is sometimes possible to export data from the NBER for external processing. This is not about releasing data for public access, and in general, exporting string variables is not approved except in rare cases.