National Center for Health Statistics

Edward J. Sondik, Ph.D., Director

Jack R. Anderson, Deputy Director

Jack R. Anderson, Acting Associate Director for International Statistics

Lester R. Curtin, Ph.D., Acting Associate Director for Research and Methodology

Jennifer H. Madans, Ph.D., Acting Associate Director for Analysis, Epidemiology, and Health Promotion

P. Douglas Williams, Acting Associate Director for Data Standards, Program Development, and Extramural Programs

Edward L. Hunter, Associate Director for Planning, Budget and Legislation

Jennifer H. Madans, Ph.D., Acting Associate Director for Vital and Health Statistics Systems

Douglas Zinn, Acting Associate Director for Management

Charles J. Rothwell, Associate Director for Data Processing and Services

Division of Data Services

Philip R. Beattie, M.S.P.H., Director

Margot Palmer, Deputy Director

Division of Health and Utilization Analysis

Diane M. Makuc, Dr.P.H., Director

Compressed Mortality File 1968-88 on CD-ROM

(CD-ROM Series 20, No. 2A ASCII Version)



Compressed Mortality File 1968-88





Introduction

NCHS Data Use Agreement

Files on CD-ROM

Description of Mortality File and File Layout

Description of Population File and File Layout

Guidelines for Citation of Data

Introduction



The Compressed Mortality File 1968-88 (CMF 1968-88) is a county-level mortality and population data file for the United States spanning the years 1968-88. The file permits the calculation of national, state, and county death rates for race-sex-age groups of interest. The mortality file contains only a select set of key analysis variables, namely, 1) state and county of residence, 2) year of death (rather than the full date of death), 3) race (recoded to white, black, other races), 4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying cause-of-death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode. The national, state, and county population estimates on the CMF are from the Bureau of the Census. The age, race, and sex detail of the population file matches that of the mortality file.

Details of the data use restrictions are given in the NCHS Data Use Agreement section.

Further detail about the mortality and population files on the CMF can be found in the Documentation.

NCHS Data Use Agreement



Section 308(d) of the Public Health Service Act provides that the data collected by the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), may be used only for the purpose of health statistical reporting and analysis.



Any effort to determine the identity of any reported case is prohibited by this law.



NCHS does all it can to assure that the identity of data subjects cannot be disclosed. All direct identifiers, as well as any characteristics that might lead to identification, are omitted from the dataset. Any intentional identification or disclosure of a person or establishment violates the assurances of confidentiality given to the providers of the information. Therefore, users will:



1. Use the data in these datasets for statistical reporting and analysis only.



2. Make no use of the identity of any person or establishment discovered inadvertently and advise the Director, NCHS, of any such discovery.



3. Not link these datasets with individually identifiable data from other NCHS or non-NCHS datasets.





Files on CD-ROM



README.WPD This file is in WordPerfect version 6.1 format. This file includes general descriptions of the mortality and population files on the Compressed Mortality File 1968-88 and the file layouts.



DOCUMENT.PDF This file is in PDF format. It contains the file documentation for the CMF for the period 1968-88. The file contains the NCHS Data Use Agreement, descriptions and record layouts for the mortality and population data files, detailed information about the mortality and population data, cause-of-death coding, computation of death rates, and a dictionary of the FIPS state and county codes and names.



SASCODE.TXT This file is in ASCII text format. The file provides sample PC SAS programs for creating a format library, a mortality file, and a population file from the data files on the CD-ROM.



Data Files



MORT6878 The mortality data file for 1968-78

MORT7988 The mortality data file for 1979-88



POP6878 The population data file for 1968-78



POP7988 The population data file for 1979-88



Description of the Mortality Files



The mortality data, for all years except 1972, are based on records for all deaths occurring in the United States. For 1972, the data are based on a 50 percent sample and weighted by a factor of 2. Deaths to foreign residents are excluded. Deaths to U.S. residents who died abroad are not included on this file. Appendix A in the Documentation provides a description of the vital statistics reporting system maintained by the NCHS.



The source records were condensed to 23 bytes by retaining only a select set of key analysis variables. The variables included on the condensed record are: 1) state and county of residence, 2) year of death (rather than the full date of death), 3) race (recoded to white, black, other races), 4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying cause-of-death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode.



Including only these few variables on the file and recoding some of them into a limited number of categories resulted in numerous records having identical values on all of the variables. The number of records on the file was reduced substantially by aggregating records with identical values on all of the variables into one record. A count indicating the number of identical records was added to the aggregate record. For example, two white male residents of Clay County, Alabama, with ages between 35 and 44 years, died from "bronchus and lung, unspecified" (ICD 162.9) in 1979. Their records were combined into one, with a 2 in the count field. Note that there are no records on the file with zero in the count field. If no deaths occurred for a particular combination of variable values, no record appears.
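
The aggregation step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure NCHS actually ran: the tuple layout, the variable names, and the recode value "000" are placeholders chosen for the example.

```python
from collections import Counter

# Hypothetical condensed records:
# (FIPS state+county, year, race-sex, age group, ICD code, recode).
# The recode value "000" is a placeholder, not a real recode.
condensed = [
    ("01027", "1979", "1", "11", "1629", "000"),
    ("01027", "1979", "1", "11", "1629", "000"),
    ("01027", "1979", "2", "13", "1629", "000"),
]

# Identical records collapse onto one key; the tally plays the role
# of the count field on the aggregate record.
counts = Counter(condensed)

# Only combinations that actually occurred are present, so no
# aggregate record can carry a count of zero.
aggregated = [key + (n,) for key, n in sorted(counts.items())]
```

Here the first two tuples are identical, so they become a single aggregate record with 2 in the final position, mirroring the Clay County example.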



Specific details



1. Underlying cause-of-death for the years 1968-78 is classified in accordance with the Eighth Revision International Classification of Diseases, Adapted for Use in the United States (ICDA-8) codes. Cause-of-death for the years 1979-88 is classified in accordance with the International Classification of Diseases, Ninth Revision (ICD-9) codes. For a further description of the ICD codes see Appendix B in the Documentation or Volume II of the annual mortality volumes produced by the NCHS, such as Vital Statistics of the United States, 1978, Volume II-Mortality, Part A, or Vital Statistics of the United States, 1988, Volume II-Mortality, Part A. For a list of comparable ICD codes for the 8th and 9th revisions and estimated comparability ratios, see Appendix B in the Documentation.



2. The fourth digit of the ICD code can assume the values 0-9 and blank. If the fourth digit is a "blank", it is a blank on this file. Care must be taken when reading the file to distinguish between blanks and zeros.



3. For injuries and poisonings, the external cause is coded (E800-E999) rather than the Nature of Injury (800-999). The letter "E" is not included in the code.



4. For 1988, if there were three or fewer deaths for a given Georgia county of residence (of deaths occurring in Georgia) with HIV infection (ICD codes *042-*044, 796.8) cited as a cause-of-death (underlying or non-underlying cause), these records were assigned a "missing" place of residence code (FIPS code = 13999).



5. The FIPS state and county codes contain leading zeros in both the 2-byte state code and the 3-byte county code.
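
Items 2, 3, and 5 above all point the same way: the ICD and FIPS fields are safest handled as text rather than converted to numbers. The following Python sketch shows one way to do that. The function names are hypothetical, and the code assumes the first three characters of the ICD field are always digits.

```python
def format_icd(raw: str) -> str:
    """Render the 4-character ICD field (locations 13-16) as a dotted code.

    The field is kept as text: a blank fourth character means "no fourth
    digit", which is not the same as a fourth digit of 0.
    """
    if len(raw) != 4:
        raise ValueError("ICD field must be exactly 4 characters")
    base, fourth = raw[:3], raw[3]
    return base if fourth == " " else f"{base}.{fourth}"


def is_external_cause(raw: str) -> bool:
    """True when the 3-digit category lies in 800-999.

    On this file those categories are external causes (E800-E999);
    the letter "E" itself is not stored.
    """
    return 800 <= int(raw[:3]) <= 999


def fips_code(state: str, county: str) -> str:
    """Join state and county codes as text so leading zeros survive."""
    return state.zfill(2) + county.zfill(3)
```

With these helpers, "162 " and "1620" stay distinct ("162" versus "162.0"), which would not be the case if the field were read as an integer or stripped of trailing blanks.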



File Specifications for the Mortality Files



File name   Years     Number of records   Record length   Format

MORT6878    1968-78           8,774,864              23   ASCII
MORT7988    1979-88          16,448,435              23   ASCII


The files are sorted by locations 6-9, 1-5, 10, 11-12, 13-16.


Field               Item and
Location    Size    Code Outline                                Format

                    FIPS Codes
                    (See Appendices E and F in the Documentation)
1-2         2       FIPS state code                             Numeric
3-5         3       FIPS county code                            Numeric

6-9         4       Year of death                               Numeric

10          1       Race-sex                                    Numeric
                      1  White male
                      2  White female
                      3  Black male
                      4  Black female
                      5  Other male
                      6  Other female

11-12       2       Age at death                                Numeric
                      01  under 1 day
                      02  1-6 days
                      03  7-27 days
                      04  28-364 days
                      05  1-4 years
                      06  5-9 years
                      07  10-14 years
                      08  15-19 years
                      09  20-24 years
                      10  25-34 years
                      11  35-44 years
                      12  45-54 years
                      13  55-64 years
                      14  65-74 years
                      15  75-84 years
                      16  85+ years
                      99  Unknown

13-16       4       ICD code for underlying cause-of-death      Numeric
                      1968-78: ICDA-8
                      1979-88: ICD-9

17-19       3       Cause-of-Death Recode                       Numeric
                    (See Appendix B in the Documentation)
                      1968-78: 69 Cause-of-Death Recode
                      1979-88: 72 Cause-of-Death Recode

20-23       4       Number of deaths                            Numeric
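
The layout above maps directly onto fixed-position slicing. The Python sketch below reads one 23-byte mortality record. It is an illustration under stated assumptions, not the SAS code supplied on the CD-ROM: the class and function names are hypothetical, the sample record is made up (its recode "000" is a placeholder), and the code assumes each record may be followed by a line terminator.

```python
from typing import NamedTuple


class MortalityRecord(NamedTuple):
    state_fips: str   # locations 1-2, kept as text for leading zeros
    county_fips: str  # locations 3-5, kept as text for leading zeros
    year: int         # locations 6-9
    race_sex: int     # location 10
    age_group: str    # locations 11-12, e.g. "01" .. "16", "99"
    icd: str          # locations 13-16, text so a blank 4th digit survives
    recode: str       # locations 17-19
    deaths: int       # locations 20-23


def parse_mortality_line(line: str) -> MortalityRecord:
    # Remove only the line terminator; spaces inside the record are data.
    record = line.rstrip("\r\n")
    if len(record) != 23:
        raise ValueError(f"expected 23 characters, got {len(record)}")
    return MortalityRecord(
        state_fips=record[0:2],
        county_fips=record[2:5],
        year=int(record[5:9]),
        race_sex=int(record[9]),
        age_group=record[10:12],
        icd=record[12:16],
        recode=record[16:19],
        deaths=int(record[19:23]),
    )


# Made-up sample record assembled field by field.
sample = "01" + "027" + "1979" + "1" + "11" + "1629" + "000" + "0002"
parsed = parse_mortality_line(sample + "\n")
```

Slice bounds are the documented locations shifted to zero-based indexing, so locations 13-16 become `record[12:16]`.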







Description of the Population File



There are national, state, and county population estimates on the population file of the CMF. The population estimates are based on U.S. Bureau of the Census estimates of U.S. national, state, and county resident populations. The 1968-69 national estimates and all of the estimates for 1971-79 and 1981-88 are intercensal estimates of July 1 resident populations. The 1970 and 1980 population estimates are April 1 modified (modified age-race-sex) census counts. The 1968 and 1969 state and county population estimates were calculated by NCHS using linear extrapolation. A brief description of the population estimates is provided here; a more detailed description is provided in Appendix D in the Documentation.
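
As noted later in this section, the 1968 and 1969 state and county estimates were extrapolated from the July 1, 1970 and July 1, 1971 estimates. A plain straight-line version of that calculation might look like the Python sketch below. The function name and the sample figures are hypothetical, and the exact NCHS procedure (for example, any rounding or handling of small counts) is described in Appendix D of the Documentation rather than here.

```python
def extrapolate_back(pop_1970: float, pop_1971: float, year: int) -> float:
    """Extend the line through the 1970 and 1971 estimates back to `year`."""
    annual_change = pop_1971 - pop_1970
    return pop_1970 + (year - 1970) * annual_change


# Hypothetical county-age-race-sex cell: 10,000 in 1970 and 10,200 in 1971.
estimate_1969 = extrapolate_back(10_000, 10_200, 1969)
estimate_1968 = extrapolate_back(10_000, 10_200, 1968)
```

In this made-up cell the population grows by 200 per year, so the line gives 9,800 for 1969 and 9,600 for 1968.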



Specific details



1. There is one record on the file for each geographic unit (total U.S., state, county) x year x race-sex group.



2. Modifications of the population estimates made by NCHS:

a. To permit the calculation of infant mortality rates, NCHS live-birth data were substituted for the estimates of the population under one year of age. The race code for these records is derived from "race of mother".



b. When the age group 1-4 years did not appear on the Census file, the age group 0-4 years was multiplied by 0.8 to obtain an estimate of the population 1-4 years.



c. For non-censal years prior to 1992, the NCHS Division of Vital Statistics uses national population estimates rounded to the nearest 1,000 to calculate published death rates. On the CMF, the national population estimates for 1968-69 and 1971-79 are rounded to the nearest 1,000 in accordance with this practice. However, this means that calculation of rates for aggregate age, race, and/or sex groups involves using population estimates that were rounded before aggregation rather than after aggregation. As a result, national death rates for aggregate groups calculated using the rounded estimates on the CMF may differ slightly from those published by NCHS. The national population estimates for 1981-88 on the CMF are not rounded so that the user can round them after aggregating across subgroups and avoid the rounding error problem.



3. National, state, and county population estimates can be identified by using the FIPS code or the record type variable in location 140. National population records have a FIPS code of "00000". State population records have a valid 2-digit FIPS state code and a county code of "000" (see Appendix E in the Documentation). The record type variable assumes the value "1" for national records, "2" for state records, and "3" for county records.

It is necessary to provide separate sets of estimates for each geographic level because the methodology used to produce the intercensal estimates (1971-79 and 1981-88) did not smooth them sufficiently. Thus, for the intercensal years, the sum of the population estimates of counties within a state may not equal the state population estimate, and the sum of all state population estimates or all county population estimates may not equal the national population estimates. For these years, the national population estimates should be used when calculating national death rates and the state population estimates should be used when calculating state death rates.

4. The FIPS state and county codes contain leading zeros in both the 2-byte state code and the 3-byte county code.



5. For 1988, there was an additional county in Georgia with a "missing" county code of "999". The six records for this county have population counts of zero.



6. Brief description of population estimates for individual years



1968-69 population estimates - National population estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population. State and county population estimates were calculated by NCHS using linear extrapolation from the corresponding July 1, 1970 and July 1, 1971 estimates.



1970 population estimates - National, state, and county population estimates are from a modified version of the April 1, 1970 census. The original census counts were modified by the U.S. Bureau of the Census to correct 1) errors discovered in the data and 2) race misclassification: persons of Hispanic origin who reported their race as "other" were recoded as "white".

1971-79 population estimates - National and county estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population. The Bureau of the Census did not produce state population estimates by age, race, and sex for the 70's. Therefore, the state population estimates for 1971-79 on this file are simply the sum of the population estimates for the counties in each state.

Three Virginia independent cities (Manassas, Manassas Park, and Poquoson) did not appear on the Census file prior to 1981. While these independent cities are not on the mortality file for 1968-78, they are on the file for 1979 onwards. Therefore, the 1979 populations for these three cities were estimated from the July 1, 1980 and July 1, 1981 estimates of these cities. The 1979 population estimates for the counties containing the cities were reduced by the estimated city populations.



1980 population estimates - National, state, and county population estimates are from a modified version of the April 1, 1980 census. The original census counts were modified by the U.S. Bureau of the Census: 1) persons who reported their race as "other" (the majority being of Hispanic origin) were reassigned to one of the official race groups, 2) an adjustment was made for the overcount of centenarians.

April 1, 1980 population estimates for three Virginia independent cities (Manassas, Manassas Park, and Poquoson) had to be extrapolated from July 1, 1980 estimates. The April 1 populations for the three cities were calculated as a proportion of the April 1 county population, with the proportion obtained from the July 1, 1980 city/county estimates. The April 1 population estimates for the counties containing the three cities were reduced by the estimated April 1 city populations.



1981-88 population estimates - National, state, and county estimates are U.S. Bureau of the Census intercensal estimates of the July 1 resident population.
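
The rounding issue described in item 2c can be seen with a small numerical example. The Python sketch below uses made-up subgroup populations and a made-up death count, and it assumes rates are expressed per 100,000 population. It is meant only to show why rounding before aggregation and rounding after aggregation can give slightly different rates.

```python
def round_to_thousand(n: int) -> int:
    """Round a non-negative count to the nearest 1,000 (halves round up)."""
    return (n + 500) // 1000 * 1000


def rate_per_100k(deaths: int, population: int) -> float:
    """Death rate per 100,000 population (an assumed convention)."""
    return deaths / population * 100_000


# Hypothetical unrounded subgroup populations and death count.
subgroups = [1_234_400, 2_345_400, 3_456_400]
deaths = 70_000

# Rounded before aggregation, as on the CMF for 1968-69 and 1971-79.
rounded_then_summed = sum(round_to_thousand(p) for p in subgroups)

# Rounded after aggregation, which is possible for 1981-88.
summed_then_rounded = round_to_thousand(sum(subgroups))

rate_from_rounded_parts = rate_per_100k(deaths, rounded_then_summed)
rate_from_rounded_total = rate_per_100k(deaths, summed_then_rounded)
```

In this example each subgroup loses 400 persons to rounding, so the sum of rounded parts (7,035,000) falls 1,000 short of the rounded total (7,036,000), and the two rates differ slightly.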



File Specifications for the Population Files



File name   Years     Number of records   Record length   Format

POP6878     1968-78             206,712             140   ASCII
POP7988     1979-88             189,966             140   ASCII


The files are sorted by locations 6-9, 1-5, 10.


Field               Item and
Location    Size    Code Outline                                Format

                    FIPS codes
                    (See Appendices E and F)
1-2         2       FIPS state code                             Numeric
3-5         3       FIPS county code                            Numeric

6-9         4       Year                                        Numeric

10          1       Race-sex                                    Numeric
                      1  White male
                      2  White female
                      3  Black male
                      4  Black female
                      5  Other male
                      6  Other female

11-18       8       Number of live births                       Numeric
19-26       8       Population in age group: 1-4 years          Numeric
27-34       8       Population in age group: 5-9 years          Numeric
35-42       8       Population in age group: 10-14 years        Numeric
43-50       8       Population in age group: 15-19 years        Numeric
51-58       8       Population in age group: 20-24 years        Numeric
59-66       8       Population in age group: 25-34 years        Numeric
67-74       8       Population in age group: 35-44 years        Numeric
75-82       8       Population in age group: 45-54 years        Numeric
83-90       8       Population in age group: 55-64 years        Numeric
91-98       8       Population in age group: 65-74 years        Numeric
99-106      8       Population in age group: 75-84 years        Numeric
107-114     8       Population in age group: 85+ years          Numeric

115-139     25      County name                                 Character
                    (See Appendix F in the Documentation)

140         1       Record type                                 Numeric
                      1  National population record
                      2  State population record
                      3  County population record
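
A 140-byte population record can be read with the same fixed-position approach. The Python sketch below is illustrative only: the names are hypothetical, the sample record is fabricated for the example, and the code assumes every 8-byte count field holds digits (optionally padded with blanks).

```python
AGE_LABELS = [
    "live births", "1-4", "5-9", "10-14", "15-19", "20-24", "25-34",
    "35-44", "45-54", "55-64", "65-74", "75-84", "85+",
]
RECORD_TYPES = {"1": "national", "2": "state", "3": "county"}


def parse_population_line(line: str) -> dict:
    # Remove only the line terminator; padding blanks are part of the record.
    record = line.rstrip("\r\n")
    if len(record) != 140:
        raise ValueError(f"expected 140 characters, got {len(record)}")
    # Thirteen consecutive 8-byte fields occupy locations 11-114.
    counts = {
        label: int(record[10 + 8 * i : 18 + 8 * i])
        for i, label in enumerate(AGE_LABELS)
    }
    return {
        "fips": record[0:5],                      # text, leading zeros kept
        "year": int(record[5:9]),
        "race_sex": int(record[9]),
        "counts": counts,
        "county_name": record[114:139].strip(),   # locations 115-139
        "level": RECORD_TYPES[record[139]],       # location 140
    }


# Fabricated national record: 13 made-up counts, each right-aligned in 8 bytes.
values = [1000 * (i + 1) for i in range(13)]
sample = (
    "00000" + "1980" + "1"
    + "".join(f"{v:8d}" for v in values)
    + "United States".ljust(25)
    + "1"
)
parsed = parse_population_line(sample)
```

The "live births" entry is the denominator for the four under-1-year age codes (01-04) on the mortality file, since live births replace the population under one year of age on the population file.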

Guidelines for Citation of Data



With the goal of mutual benefit, the National Center for Health Statistics (NCHS) requests that recipients of data files cooperate in certain actions related to their use. Any published material derived from the data should acknowledge NCHS as the original source. The suggested citation to appear at the bottom of all tables is as follows:



Source: National Center for Health Statistics (span of years used)



When cited in a bibliography, the citation should read:



National Center for Health Statistics (2000). Data File Documentation, Compressed Mortality File, 1968-88 (machine readable data file and documentation, CD-ROM Series 20, No. 2A), National Center for Health Statistics, Hyattsville, Maryland.



The published material should also include a disclaimer that credits any analyses, interpretations, or conclusions reached to the author (recipient of the data file) and not to NCHS, which is responsible only for the initial data. Consumers who wish to publish a technical description of the data should make an effort to ensure that the description is not inconsistent with that published by NCHS.