NCHS's Vital Statistics Natality Birth Data -- 1968-2012

Natality Data from the National Vital Statistics System of the National Center for Health Statistics provide demographic and health data for births occurring during the calendar year. The microdata are based on information abstracted from birth certificates filed in vital statistics offices of each State and the District of Columbia.

Other available birth data are Birth Cohort Linked Birth/Infant Death Data, Period Linked Birth/Infant Death Data from the Perinatal Mortality Data, and Matched Multiple Birth Data.

By using these data you signify your agreement with NCHS's data use rules. Works referring to the datasets or codebooks should contain a citation to NCHS. Published material derived from these data should include a citation such as this at the bottom of the table: "Source: National Center for Health Statistics (span of years used)".

Prior to 1972, data are based on a 50-percent sample of birth certificates from all States. Beginning in 1972, data are based on a 100-percent sample of birth certificates from some States and on a 50-percent sample from the remaining States. The number of States from which 100 percent of the records are used increased from 6 in 1972 to all States and the District of Columbia in 1985. Birth data from the U.S. Territories Guam, Puerto Rico, and the U.S. Virgin Islands are available on a separate file beginning in 1994. In 1998, American Samoa and the Northern Marianas were added to the U.S. Territories files.
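Because records come from either a 50-percent or a 100-percent sample depending on year and State, raw record counts are not directly comparable. The following is a minimal sketch of the arithmetic only, assuming each record can be tagged with its sampling percentage; the function name and its input are hypothetical, and the documentation for each year remains the authority on how records should be weighted.

```python
def estimated_births(sample_percents):
    """Estimate total births from per-record sampling percentages.

    Each entry is the sampling percentage (50 or 100) of the State-year
    a record came from. A record from a 50-percent sample stands for
    two births; a record from a 100-percent sample stands for one.
    """
    total = 0
    for pct in sample_percents:
        if pct not in (50, 100):
            raise ValueError(f"unexpected sampling percentage: {pct}")
        total += 100 // pct
    return total


# Two records from a 50-percent State and one from a 100-percent State.
print(estimated_births([50, 50, 100]))  # 5
```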

Demographic data include variables such as date of birth, age and educational attainment of parents, marital status, live-birth order, race, sex, and geographic area. Health data include items such as birth weight, gestation, prenatal care, attendant at birth, and Apgar score. Geographic data include state, county, city (cities of 250,000+ up to 1980 and of 100,000+ from 1980 on), SMSA (1980 on), and metropolitan and nonmetropolitan counties.

Population files (such as natpop91.dat.Z) contain the population counts for U.S. women 15-44, those traditionally thought to be "at risk" for giving birth. The files have 2448 lines. Each line represents the count of one combination of 51 state x 6 age x 4 race x 2 Hispanic origin of mother categories. These files are available for 1991 on. Population files are not available for the U.S. Territories.
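The line count follows directly from the category counts: 51 x 6 x 4 x 2 = 2448. A short sketch that enumerates the combinations is shown here as a way to sanity-check a downloaded file. The index values are placeholders, and the order in which the combinations appear in the real files is an assumption, not something stated here.

```python
from itertools import product

N_STATE = 51     # state categories (presumably 50 States plus D.C.)
N_AGE = 6        # age groups of women 15-44
N_RACE = 4       # race categories
N_HISPANIC = 2   # Hispanic origin of mother categories


def population_cells():
    """Return every (state, age, race, hispanic) index combination."""
    return list(product(range(N_STATE), range(N_AGE),
                        range(N_RACE), range(N_HISPANIC)))


cells = population_cells()
print(len(cells))  # 2448, the expected number of lines per population file
```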

SEER provides helpful U.S. Population data for 1969 on.

Both ".Z" and ".zip" files can be uncompressed with WinZip. In addition, ".Z" files can be uncompressed using the UNIX uncompress command, and ".zip" files can be unzipped with PKUNZIP.
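For users who would rather script the step, here is a minimal Python sketch that streams lines from either kind of file without writing the uncompressed data to disk. It rests on some assumptions: each .zip archive holds a single data file, the data are plain ASCII, and a `gzip` executable is on the PATH (Python's standard library has no reader for .Z files, but `gzip -dc` can decompress them). The function names are illustrative.

```python
import io
import subprocess
import zipfile


def iter_zip_lines(path, encoding="ascii"):
    """Yield text lines from the first member of a .zip archive."""
    with zipfile.ZipFile(path) as archive:
        member = archive.namelist()[0]  # assumption: one data file per archive
        with archive.open(member) as raw:
            for line in io.TextIOWrapper(raw, encoding=encoding):
                yield line.rstrip("\r\n")


def iter_z_lines(path, encoding="ascii"):
    """Yield text lines from a UNIX-compress .Z file by piping it through gzip."""
    with subprocess.Popen(["gzip", "-dc", str(path)],
                          stdout=subprocess.PIPE) as proc:
        for line in io.TextIOWrapper(proc.stdout, encoding=encoding):
            yield line.rstrip("\r\n")
```

Streaming line by line keeps memory use flat, which matters for the larger files described further down.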

To check your ability to uncompress these files, download the small files compress.Z or These files give an example of how to read .Z and .zip ASCII files into SAS for UNIX without decompressing the files. To download files in Internet Explorer, right-click on them and select "Save Target As...". If the PDF documents appear to be all blank pages, get the latest Acrobat Reader at

Variable layouts are basically the same within each of the periods 1972-1977, 1979-1981, 1982-1983, 1984-1985, and 1992-1994, though a few codes change across years.
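Since the raw files are fixed-width ASCII and the layout depends on the period, a reader can look up column positions by year before slicing each record. The sketch below illustrates the idea only: the field names and column positions are invented for the example, and the real positions must be taken from the documentation or the .dct file for each year.

```python
# Hypothetical layouts: field name -> (start, end), 0-based, end-exclusive.
# These positions are placeholders, not the actual NCHS column positions.
LAYOUTS = {
    (1972, 1977): {"birth_month": (0, 2), "sex": (2, 3)},
    (1979, 1981): {"birth_month": (0, 2), "sex": (3, 4)},
}


def layout_for(year):
    """Return the field layout registered for the period containing `year`."""
    for (first, last), fields in LAYOUTS.items():
        if first <= year <= last:
            return fields
    raise KeyError(f"no layout registered for {year}")


def parse_record(line, year):
    """Slice one fixed-width record into a dict of stripped string fields."""
    return {name: line[start:end].strip()
            for name, (start, end) in layout_for(year).items()}


print(parse_record("071", 1975))   # {'birth_month': '07', 'sex': '1'}
print(parse_record("07 2", 1980))  # {'birth_month': '07', 'sex': '2'}
```

Keeping one layout per period, rather than per year, mirrors the grouping above; codes that change within a period would still need year-specific recoding.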

Thanks to Michael Greenstone and Kenneth Chay for the 1975-1985 data.

Raw file size: The compressed 1968-1985 files are between 50 and 130 MB, and the compressed 1991-1994 and 1998-2002 files are 120-155 MB. The 2003 file is over 200 MB. The compression ratio for these files is over 90%.

Because of the large size of the complete collection, we would prefer that you not download large fractions over the web. NBER internal users can obtain the data from a UNIX shell at /homes/data/natality or on an NBER PC via Network Neighborhood --> NBER --> home --> data --> natality

Updates and changes.

United States -- Data & Documentation 1968-2012
Birth Data    SAS Code    Stata Code    SPSS Code    Documentation
Pkzipped Stata .do .dct
1968 natl1968.dct natl1968.sps natl1968.pdf
1969 natl1969.dct natl1969.sps natl1969-1971.pdf
1970 natl1970.dct natl1970.sps
1971 natl1971.dct natl1971.sps
1972 natl1972.dct natl1972.sps natl1972-1977.pdf
1973 natl1973.dct natl1973.sps
1974 natl1974.dct natl1974.sps
1975 natl1975.dct natl1975.sps
1976 natl1976.dct natl1976.sps
1977 natl1977.dct natl1977.sps
1978 natl1978.dct natl1978.sps natl1978.pdf
1979 natl1979.dct natl1979.sps natl1979.pdf
1980 natl1980.dct natl1980.sps natl1980.pdf
1981 natl1981.dct natl1981.sps natl1981.pdf
1982 natl1982.dct natl1982.sps natl1982.pdf
1983 natl1983.dct natl1983.sps natl1983.pdf
1984 natl1984.dct natl1984.sps natl1984.pdf
1985 natl1985.dct natl1985.sps natl1985.pdf
1986 natl1986.dct natl1986.sps natl1986.pdf
1987 natl1987.dct natl1987.sps natl1987.pdf
1988 natl1988.dct natl1988.sps natl1988.pdf
1989 natl1989.dct natl1989.sps natl1989.pdf
1990 natl1990.dct natl1990.sps natl1990.pdf
1991 natl1991.dct natl1991.sps natl1991.pdf
1992 natl1992.dct natl1992.sps natl1992.pdf
1993 natl1993.dct natl1993.sps natl1993.pdf
1994 natl1994.dct natl1994.sps natl1994.pdf
1995 natl1995.dct natl1995.sps natl1995.pdf
1996 natl1996.dct natl1996.sps natl1996.pdf
1997 natl1997.dct natl1997.sps natl1997.pdf
1998 natl1998.dct natl1998.sps natl1998.pdf
1999 natl1999.dct natl1999.sps natl1999.pdf
2000 natl2000.dct natl2000.sps natl2000.pdf
2001 natl2001.dct natl2001.sps natl2001.pdf
2002 natl2002.dct natl2002.sps natl2002.pdf
2003 natl2003.dct natl2003.sps natl2003.pdf
2004 natl2004.dct natl2004.sps natl2004.pdf
Public-use data from 2005 on do not include geographic detail because of restrictions imposed by the states. This means that the 2005-on data do not include any geographic variables such as state, county, or MSA. has select tables, and   has information on requesting restricted versions of the data, which include geographic identifiers, etc.
2005 natl2005.dct natl2005.sps natl2005.pdf
2006 natl2006.dct natl2006.sps natl2006.pdf
2007 natl2007.dct natl2007.sps natl2007.pdf
2008 natl2008.dct natl2008.sps natl2008.pdf
2009 natl2009.dct natl2009.sps natl2009.pdf
2010 natl2010.dct natl2010.sps natl2010.pdf
2011 natl2011.dct natl2011.sps natl2011.pdf
2012 natl2012.dct natl2012.sps natl2012.pdf
* The 2003 datafile is nearly four times larger than files from previous years. This is because the 2002 file is 352 characters wide, while the 2003 file is 1297 characters wide. The uncompressed 2002 file is about 1.3 GB and the 2003 file is almost 5 GB. If your compression software has a 2 GB limit, it won't work; try other software such as WinRAR.
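The size jump can be checked with simple arithmetic on the record widths. The sketch below is illustrative only: the record count is a round number chosen for the example, not a figure from the files, and one newline byte per record is assumed.

```python
WIDTH_2002 = 352    # characters per record in the 2002 file
WIDTH_2003 = 1297   # characters per record in the 2003 file
N_RECORDS = 4_000_000  # illustrative round number, not an actual count


def uncompressed_bytes(n_records, width, newline_bytes=1):
    """Approximate size of a fixed-width file with one record per line."""
    return n_records * (width + newline_bytes)


def exceeds_limit(n_bytes, limit_gib=2):
    """True if a file is larger than a tool's size limit, given in GiB."""
    return n_bytes > limit_gib * 1024 ** 3


print(f"{WIDTH_2003 / WIDTH_2002:.2f}")  # 3.68, i.e. nearly four times wider
print(exceeds_limit(uncompressed_bytes(N_RECORDS, WIDTH_2002)))  # False
print(exceeds_limit(uncompressed_bytes(N_RECORDS, WIDTH_2003)))  # True
```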

U.S. Territories Data, 1994-2012
Births Data: ASCII (.zip), SAS (.zip), Stata (.zip)
Codes for Reading Raw ASCII Data: SAS, SPSS, Stata .do, Stata .dct
1994 t94.sps t94.dct
1995 t95.sps t95.dct
1996 t96.sps t96.dct
1997 t97.sps t97.dct
1998 t98.sps t98.dct
1999 t99.sps t99.dct
2000 t00.sps t00.dct
2001 t01.sps t01.dct
2002 t02.sps t02.dct
2003 t03.sps t03.dct
2004 t04.sps t04.dct
2005 t05.sps t05.dct
2006 t06.sps t06.dct
2007 t07.sps t07.dct
2008 t08.sps t08.dct
2009 t09.sps t09.dct
2010 t10.sps t10.dct
2011 t11.sps t11.dct
2012 t12.sps t12.dct
Report on Final Natality Statistics FR1994 FR1995 FR1996 FR1997 FR1998 FR1999 FR2000 FR2001 FR2002
Standard Birth Certificates
sbc68-77   sbc78-88   sbc89-02   sbc03
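The code file names in the two tables above follow a regular pattern: `natl` plus a four-digit year for the United States files, and `t` plus a two-digit year for the U.S. Territories files. A small helper, sketched here under that assumption, can build the names for a given year. The function name is hypothetical, and it covers only the .dct and .sps names listed above, not the data or documentation files.

```python
def code_files(year, territories=False):
    """Return the Stata dictionary and SPSS file names listed for one year."""
    first_year = 1994 if territories else 1968
    if not first_year <= year <= 2012:
        raise ValueError(f"no files listed for {year}")
    # Territory files use a two-digit year (t94 ... t12);
    # United States files use four digits (natl1968 ... natl2012).
    stem = f"t{year % 100:02d}" if territories else f"natl{year}"
    return [f"{stem}.dct", f"{stem}.sps"]


print(code_files(1968))                    # ['natl1968.dct', 'natl1968.sps']
print(code_files(2000, territories=True))  # ['t00.dct', 't00.sps']
```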

To report errors, or if you have comments, suggestions, or an interest in SAS library files for the later data, e-mail

Last Update: February 6, 2014 Created by Jean Roth September 15, 2000


National Bureau of Economic Research, 1050 Massachusetts Ave., Cambridge, MA 02138; 617-868-3900; email:
