NCHS' Vital Statistics Natality Birth Data
Natality Data from the National Vital Statistics System of the National Center for Health Statistics provide demographic and health data for births occurring during the calendar year. The microdata are based on information abstracted from birth certificates filed in the vital statistics offices of each State and the District of Columbia.
Demographic data include variables such as date of birth, age and educational attainment of parents, marital status, live-birth order, race, sex, and geographic area. Health data include items such as birth weight, gestation, prenatal care, attendant at birth, and Apgar score. Geographic data include state, county, city (cities of 250,000+ up to 1980, and of 100,000+ from 1980 on), SMSA (from 1980 on), and metropolitan and nonmetropolitan counties.
Population files (such as natpop91.dat.Z) contain the population counts for U.S. women 15-44, those traditionally thought to be "at risk" for giving birth. Each file has 2448 lines. Each line gives the count for one combination of the state (51) x age (6) x race (4) x Hispanic origin of mother (2) categories, and 51 x 6 x 4 x 2 = 2448. These files are available from 1991 on. Population files are not available for the U.S. Territories.
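The line count above can be checked by enumerating the category combinations. The following is a minimal sketch only: the category counts come from the text, but the integer indices stand in for the real codes, and the order in which combinations appear in the actual files is not stated here, so no ordering should be inferred from it.

```python
from itertools import product

# Category counts as stated in the text. The integer indices below are
# placeholders, not the actual state, age, race, or Hispanic-origin codes.
N_STATES = 51        # presumably the 50 states plus the District of Columbia
N_AGE_GROUPS = 6
N_RACE_GROUPS = 4
N_HISPANIC = 2


def population_cells():
    """Return every (state, age, race, hispanic) index combination."""
    return list(
        product(
            range(N_STATES),
            range(N_AGE_GROUPS),
            range(N_RACE_GROUPS),
            range(N_HISPANIC),
        )
    )


cells = population_cells()
print(len(cells))  # 2448, matching the number of lines per population file
```

A quick sanity check after downloading a population file is to confirm that its line count equals `len(cells)`.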
SEER provides helpful U.S. population data from 1969 on.
1981 is the first file with both FIPS and NCHS county codes. An NCHS to FIPS state, county, and MSA crosswalk is available.
Variable layouts are basically the same within each of the periods 1972-1977, 1979-1981, 1982-1983, 1984-1985, and 1992-1994, though a few codes change across years.
Thanks to Michael Greenstone and Kenneth Chay for the 1975-1985 data.
Source file size: The compressed 1968-1985 files are between 50 and 130 MB, and the compressed 1991-1994 and 1998-2002 files are 120-155 MB. The 2003 file is over 200 MB. The compression ratio for these files is over 90%.
Because of the large size of the complete collection, we would prefer that you not download large fractions of it over the web. NBER internal users can obtain the data from a UNIX shell at /homes/data/natality or on an NBER PC via Network Neighborhood --> NBER --> home --> data --> natality
* The 2003 data file is nearly four times larger than files from previous years. This is because the 2002 file is 352 characters wide, while the 2003 file is 1297 characters wide. The uncompressed 2002 file is about 1.3 GB and the 2003 file is almost 5 GB! If your compression software has a 2 GB limit, it will not be able to uncompress the 2003 file. Try other software such as WinRAR.
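The size jump follows from the record widths: 1297 / 352 is roughly 3.7. Once a file is uncompressed, it can be read one record at a time rather than loaded whole. The sketch below illustrates both points under stated assumptions: the helper names are hypothetical, each record is assumed to end with a single newline byte, and no field positions are shown because they are not given here.

```python
import io

WIDTH_2002 = 352     # record width stated in the text
WIDTH_2003 = 1297    # record width stated in the text


def width_ratio(new_width, old_width):
    """Ratio of record widths, a rough proxy for the file size ratio."""
    return new_width / old_width


def iter_records(stream, width, newline_bytes=1):
    """Yield fixed-width records one at a time from a binary stream.

    Assumes every record is `width` bytes followed by `newline_bytes`
    terminator bytes; that terminator length is an assumption.
    """
    step = width + newline_bytes
    while True:
        chunk = stream.read(step)
        if not chunk:
            break
        yield chunk[:width]


print(round(width_ratio(WIDTH_2003, WIDTH_2002), 2))  # 3.68

# Tiny in-memory stand-in for a data file with 3-byte records.
demo = io.BytesIO(b"abc\ndef\n")
print(list(iter_records(demo, width=3)))  # [b'abc', b'def']
```

With a real file, the stream would be opened with `open(path, "rb")`, so memory use stays at one record regardless of file size.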
To report errors, to send comments or suggestions, or if you are interested in SAS library files for the later data, e-mail firstname.lastname@example.org