* Note: all.sas and test.sas are not available. ;

/*
CPS Guide

This documentation explains several programs written to adapt the March CPS
extract data prepared by Mare-Winship. The programs are used for applications
such as matching couple data within the same year and matching household and
individual data for two consecutive years.

MATCHING HUSBANDS AND WIVES

The all.sas program organizes and merges the Mare-Winship CPS data to create a
new data set that matches husband and wife data. Only married individuals,
identified by a value of 1, 2 or 3 for variable x60- Marital Status, are
included in the new data set. This data set of married individuals is then
split into two new data sets by variable x76- Sex, 1 for males and 2 for
females. Each of the personal variables in both data sets is renamed (e.g.,
variable x33- Age is renamed m33 for each male record and f33 for each female
record). The two data sets are then merged by variables x1- Household Number
and either x2 or x12, both of which measure the Family Number Within a
Household, to create one data set with matching husband and wife records. (The
use of x2 or x12 depends upon the year of the data set. Both variables have
missing values for certain years, but together the two cover the entire time
series.) Each new couple record contains one value for each of the household
and family variables and two values, one from the male record and one from the
female record, for each of the personal variables.

The test.sas program provides a sample application of the all.sas matched
couple data set. This program runs a PROC FREQ on the couple records using
variable x63- Normal Full Time Work. The resulting table shows all the
working/non-working combinations among couples and the percentage of couples
in each combination for the given year.
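Because all.sas and test.sas are not available, the steps described above can
only be sketched. The following is a minimal illustration, not the original
programs. The libref cps, the data set name march69, the macro variable famvar
and the choice to carry only x33 and x63 as personal variables are all
assumptions made for the example.

```sas
* Hypothetical sketch of the couple match. Not the original all.sas;
* Assumption: x2 is the family variable for this year, use x12 where x2 is missing;
%let famvar = x2;

* Keep married persons and split them by sex, renaming personal variables;
data husbands (keep=x1 &famvar x33 x63 rename=(x33=m33 x63=m63))
     wives    (keep=x1 &famvar x33 x63 rename=(x33=f33 x63=f63));
   set cps.march69;                    * assumed input data set;
   where x60 in (1, 2, 3);             * married individuals only;
   if x76 = 1 then output husbands;    * 1 = male;
   else if x76 = 2 then output wives;  * 2 = female;
run;

proc sort data=husbands; by x1 &famvar; run;
proc sort data=wives;    by x1 &famvar; run;

* One record per couple, keeping only families present in both data sets;
data couples;
   merge husbands (in=inm) wives (in=inf);
   by x1 &famvar;
   if inm and inf;
run;

* Sample application in the spirit of test.sas;
proc freq data=couples;
   tables m63*f63;
run;
```

The in= flags restrict the output to families that contribute both a male and
a female record. A family with more than one married person of the same sex
would need additional handling that this sketch does not attempt.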

MARCH-MARCH HOUSEHOLD MATCHES

The match.sas program matches March CPS data by household for any two
consecutive years. Each of the household variables in the year-one records is
renamed by replacing x- with y1- (e.g., x1- Household Counter becomes y11),
with a similar replacement of x- with y2- in year two. The two data sets are
merged by variables x7- Random Cluster Code and x10- Family Serial Number to
create a new data set containing matched household records for two
consecutive years. (The match is valid only from 1968 to 1975 because the
merge variables are missing for the rest of the time series.)

MARCH-MARCH INDIVIDUAL MATCHES

The IndMatch.sas program matches individual data for heads of households and
spouses of heads, identified by a value of 1 or 2 for variable x53- Household
Recode III. It merges two data sets, representing two consecutive years, by
variables x7- Random Cluster Code, x10- Family Serial Number, x33- Age, x72-
Race, and x76- Sex (match valid from 1968 to 1975 because of missing data).
The personal variables are renamed to represent year one and year two in the
same manner as in the match.sas data set.

QUALITY OF MATCH TESTS

Several tests were run to assess the quality of the match of the individual
records. The following variables were chosen as possible indicators of a poor
match. Each has an expected value from one year to the next, so an unexpected
change in year two in any of these variables is a sign of a possible poor
match among the individual records.

Code  Variable Name        Expected Value, Year 2
----  -------------        ----------------------
x33   Age                  Increase of one year
x38   Current Industry     No change or a logical move to a similar industry
x39   Current Occupation   No change or a logical move to a similar occupation
x49   High Grade Attended  No change or an increase of one higher grade attended
x79   Veteran Status       No change, or a logical change in status

PROC FREQ tables were generated for combinations of these variables. The
resulting tables were analyzed, with the results reported below, to determine
whether they corresponded to the hypothesized results of the tests. The
quality of match tests were done for the years 1969-1970.

In the individual PROC FREQ tables for variables x38- Current Industry and
x39- Current Occupation, nearly three-quarters of the individuals in the data
set kept the same occupation or industry from 1969 to 1970. The PROC FREQ
table that compares the change in industry and occupation for individuals of
the expected age in 1970 (age1969+1) also indicates a good match, with nearly
69% of the individuals remaining in the same industry and having the same
occupation in both years. It is important to note, however, that a change in
occupation and/or industry is not necessarily an indication of a bad match.
Some changes are logical moves to a similar occupation and/or industry. Each
applicable record would have to be examined to determine the possibility of a
poor match.

Only a small number of the matched individual records reported a change in
Veteran Status from 1969 to 1970, with 26601 of the 26847 total matched
records keeping the same status. This result is plausible given that the
Vietnam War falls within the time period in question. Again, each applicable
record would have to be examined to determine whether the change in Veteran
Status is logical.
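
The cross-tabulations described above could be produced along the following
lines. This is a hedged sketch rather than the code that was actually run. The
data set name matched is an assumption, and the variable names y133, y233 and
so on are inferred from the x- to y1-/y2- renaming convention described for
match.sas.

```sas
* Hypothetical sketch of the quality of match tabulations;
data checks;
   set matched;                   * assumed output of the individual match;
   agediff  = y233 - y133;        * expected value is 1;
   same_ind = (y238 = y138);      * 1 if industry unchanged;
   same_occ = (y239 = y139);      * 1 if occupation unchanged;
   same_grd = (y249 = y149);      * 1 if high grade attended unchanged;
   same_vet = (y279 = y179);      * 1 if veteran status unchanged;
run;

* One-way tables for the age difference and each indicator;
proc freq data=checks;
   tables agediff same_ind same_occ same_grd same_vet;
run;

* Industry by occupation for individuals of the expected age only;
proc freq data=checks;
   where agediff = 1;
   tables same_ind*same_occ;
run;
```

Note that SAS treats two missing values as equal in a comparison, so records
with the variable missing in both years are counted as unchanged by these
flags.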

The change in the highest grade attended also was small, with 25983 of the
26847 total matched records reporting the same level of education for the two
years.

When matching individuals by age, the IndMatch.sas program allowed for a match
of plus-or-minus one year from the age expected in the second year. Over 94%
of the matched individuals in this time period reported an age in the second
year one year older than the age reported in the previous year. When the age
variable was cross-tabulated in the PROC FREQ tables with the other
bad-match indicator variables, the highest percentage among the possible
combinations was always reported for the expected age (age1970=age1969+1)
combined with no change in the other chosen variable.

In an attempt to explain any change in the make-up of families or households,
the Indiv.sas program was written to report the combinations of ages reported
by matched couple data for two consecutive years. The table shows that, of all
the matched couples, 57 percent reported an increase of one year in age from
1969 to 1970 for both the head and the spouse. In 21 percent of the couples,
both members reported an age outside the range allowed for a match, which can
be explained by new families within a household in 1970. The remaining 22
percent is made up of various combinations in which one or both spouses are
plus-or-minus one year from the expected age in 1970, together with a small
number of cases in which one spouse is within the allowed age range and the
other is outside it, probably representing remarriages.

The results of the quality of match tests indicate that the matches of records
where individuals reported the expected age (year2=year1+1) produce the best
results.
Only a small percentage of the matched records report an age in the second
year equal to the age in the first year or two years greater than it. Among
these individuals, a larger share has unexpected changes in the other
variables (e.g., industry, occupation). These results suggest that matches
involving individuals whose reported age in year two is not the expected age
have a higher likelihood of being a poor match.

MISSING DATA

A PROC MEANS was run on all records for each year in the March CPS extract
files to determine the years in which each variable is missing data:

Code  Variable Name                     Years Missing
----  -------------                     -------------
x2    Family-in-Household               83, 87, 88
x4    Year                              90
x7    Random Cluster                    77-92
x8    Keyfitz Cluster                   77-92
x9    Noninterview Cluster              64-67, 77-92
x10   Family Serial Number              64-67, 80-88
x11   Family Description                89-92
x12   Family Position in Household      68-75
x13   Family Type C-recipiency          64, 65, 89-92
x15   Number of Persons in Family       89-92
x20   Household Serial/Segment Number   64-67, 89-92
x21   Household Type                    64-67
x22   Household Status                  64-76
x23   Number of Families in Household   64-67
x26   SMSA                              89-92
x27   SMSA-I                            89-92
x32   ADC Recipiency                    64, 65
x34   Alimony Recipiency                64-68
x35   Any Reason Could Not Take Job     64-67
x37   Complete High Grade Attended      92
x42   Family (Secondary) Membership     89-92
x43   Family Number                     64-67, 89-92
x46   Farm/Self-Employed Income         76-79
x54   Last Work Full Time               64-67
x55   Last Work Full Time For Pay       64-67
x57   Look For Full or Part Time Work   64-67
x61   Nonfarm Self-Employment Income    76-79
x62   Normal Full Time Job              64-67, 89-92
x65   Parents Presence                  64-75
x66   Person Sequence Number            64-67
x70   Public Assistance Amount          64-67
x71   Public Assistance Recipiency      64, 65
x73   Reason Not At Work Last Week      64-67
x77   Subfamily Membership Key          64-67, 89-92
x78   Unemployment Recipiency           64-68
x82   Weeks Looking for Work Last Year  64-75, 89-92
x83   Weeks Looking for Work Last Year  64-67
x84   Weeks Looking/Laid Off Work       64-75
x86   Weeks Worked Last Year-I          64-75
x87   Weeks Worked Last Year-II         89-92
x89   Why Look For Work                 64-66
x91   Person Serial Number              64-67, 80-92
x92   Poverty Cutoff Dollars            64-67
x93   Poverty Level                     64-67
x95   Spanish Ethnicity                 64-70
x97   Main Reason For Part-Year Work    64-67, 89-92
x99   Stretches of Unemployment         64-67, 76-79
x100  Weeks in Labor Force              76-92
x101  Family A Weight                   64-67, 77-92
x102  Family P Weight                   64-67, 77-92
x103  Family Weight Basic               76
x104  Household Weight                  64-76
x105  Person A Weight                   64-67, 76-92
x106  Person P Weight                   64-67, 76-92
x108  Basic CPS Weight                  68-88
x109  Type-A-Income                     64-67, 80-92
x110  Type-B-Income                     64-67, 89-92
x111  Type-C-Income                     64-67, 89-92
x112  Type-D-Income                     64-67, 89-92
x113  Type-E-Income                     64-67, 89-92
x114  Dividends and Interest            64-75
x115  Rental Income                     64-75
x116  Public-Assistance Income          64-75
x117  Supplemental Security Income      64-75
x118  CPI-Index                         89-92
x119  Version Number Major I.D.         64-92
x120  Version Number Minor I.D.         64-67, 80-92
x121  Presence of Own Children          68-92
x122  Own Children Under 6 (in family)  68-79
x123  Own Children Under 18             68-79
x124  Related Children Under 18         68-92
x125  Family Members Under 18           68-79
x126  Family Members Over 18            68-92
x127  Female Family Members 18+         68-92
x128  Labor Force Status                68-92
x129  Household Flag                    68-92
*/