NCHS' Vital Statistics Natality Birth Data -- 1968-2014
Natality Data from the National Vital Statistics System of the National Center for Health Statistics provide demographic and health data for births occurring during the calendar year. The microdata are based on information abstracted from birth certificates filed in the vital statistics offices of each state and the District of Columbia.
Demographic data include variables such as date of birth, age and educational attainment of parents, marital status, live-birth order, race, sex, and geographic area. Health data include items such as birth weight, gestation, prenatal care, attendant at birth, and Apgar score. Geographic data include state, county, city (available for cities of 250,000+ up to 1980 and 100,000+ from 1980 on), SMSA (from 1980 on), and metropolitan and nonmetropolitan counties.
Population files (such as natpop91.dat.Z) contain the population counts for U.S. women aged 15-44, those traditionally thought to be "at risk" for giving birth. The files have 2448 lines. Each line gives the count for one combination of the mother's categories: 51 state x 6 age x 4 race x 2 Hispanic origin. These files are available from 1991 on. Population files are not available for the U.S. Territories.
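The line count follows directly from the category counts above. A minimal Python sketch of that arithmetic, assuming nothing about the actual category labels or the order in which the lines appear in the file (integer indices are used as stand-ins, and the names below are hypothetical):

```python
from itertools import product

# Category counts as stated in the text. The labels themselves are not given
# there, so integer indices stand in for them (an assumption of this sketch).
N_STATE, N_AGE, N_RACE, N_HISPANIC = 51, 6, 4, 2

# One tuple per (state, age, race, Hispanic origin) combination.
cells = list(product(range(N_STATE), range(N_AGE),
                     range(N_RACE), range(N_HISPANIC)))

print(len(cells))  # 2448
```

A quick way to sanity-check a downloaded population file is to confirm that it has exactly this many lines.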
SEER provides helpful U.S. population data from 1969 on.
To check your ability to uncompress these files, download the small files compress.Z or compress.zip. These files give an example of how to read .Z and .zip ASCII files into SAS for UNIX without decompressing them. To download files in Internet Explorer, right click on them and select "Save Target As...". If the PDF documents appear to be all blank pages, get the latest Acrobat Reader at www.adobe.com.
Variable layouts are largely the same within each of the periods 1972-1977, 1979-1981, 1982-1983, 1984-1985, and 1992-1994, though a few codes change across years.
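The example files above are written for SAS. For readers working outside SAS, the same idea (reading the ASCII data straight out of the archive rather than unpacking it to disk first) can be sketched in Python for the .zip case. This is an illustration only: the function name `iter_zip_lines` and the demo member name `sample.txt` are hypothetical, the demo archive is built in memory rather than downloaded, and the .Z (UNIX compress) format is not covered by this sketch.

```python
import io
import zipfile


def iter_zip_lines(zip_source, member=None, encoding="ascii"):
    """Yield text lines from one member of a .zip archive without extracting it to disk."""
    with zipfile.ZipFile(zip_source) as archive:
        # Default to the first member if none is named (an assumption of this sketch).
        name = member if member is not None else archive.namelist()[0]
        with archive.open(name) as raw:
            # Wrap the binary stream so lines are decoded one at a time.
            for line in io.TextIOWrapper(raw, encoding=encoding):
                yield line.rstrip("\r\n")


# Demo on a tiny in-memory archive standing in for a real download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("sample.txt", "first record\nsecond record\n")
buffer.seek(0)

demo_lines = list(iter_zip_lines(buffer))
print(demo_lines)  # ['first record', 'second record']
```

Because lines are yielded one at a time, the full uncompressed file never has to sit in memory or on disk, which matters for the larger yearly files.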
Thanks to Michael Greenstone and Kenneth Chay for the 1975-1985 data.
Source file size: The compressed 1968-1985 files are between 50 and 130 Mb, and the compressed 1991-1994 and 1998-2002 files are 120-155 Mb. The 2003 file is over 200 Mb. The compression ratio for these files is over 90%.
Because of the large size of the complete collection, we would prefer that you not download large fractions of it over the web. NBER internal users can obtain the data from a UNIX shell at /homes/data/natality or on an NBER PC via Network Neighborhood --> NBER --> home --> data --> natality
* The 2003 data file is nearly four times as large as the files from previous years. This is because the 2002 file is 352 characters wide, while the 2003 file is 1297 characters wide. The uncompressed 2002 file is about 1.3 Gb and the 2003 file is almost 5 Gb! If your compression software has a 2 Gb limit, it will not be able to handle the 2003 file. Try other software such as WinRAR.
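The size jump is almost entirely a matter of record width. A rough Python check of the figures quoted above, assuming that "Gb" here means 10^9 bytes, that each record is followed by a single newline byte, and that the quoted sizes are only approximate (the helper `approx_records` is hypothetical):

```python
WIDTH_2002 = 352    # characters per record, as stated in the text
WIDTH_2003 = 1297   # characters per record, as stated in the text

# How much wider a 2003 record is than a 2002 record.
width_ratio = WIDTH_2003 / WIDTH_2002
print(round(width_ratio, 2))  # 3.68


def approx_records(file_bytes, record_width, newline_bytes=1):
    """Rough record count for a fixed-width file of the given size."""
    return file_bytes / (record_width + newline_bytes)


GB = 10**9  # assumption: decimal gigabytes
records_2002 = approx_records(1.3 * GB, WIDTH_2002)
records_2003 = approx_records(5 * GB, WIDTH_2003)
print(round(records_2002 / 1e6, 1), round(records_2003 / 1e6, 1))
```

Under these assumptions both files come out at a similar number of records (a few million each), so the wider record layout, not a jump in the number of births, accounts for the larger 2003 file.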
To report errors, or if you have comments, suggestions, or an interest in SAS library files for the later data, e-mail email@example.com