log using match_one_year, text replace
set more off
set memory 500m

program define aef2
    use /homes/data/morg/annual/morg`2'
    local latest_year=2003
    **Keep above this line

    * Person Match
    display "Person Match"
    display "time is $S_TIME"

    *mym is the year and month of the file where matching observations would be
    display "Generate mym for minsamp 4"
    generate int mym = `1' * 10 + 18 if minsamp==4
    display "Generate mym for minsamp 8"
    replace mym =`1' * 10 -6 if minsamp==8
    label var mym "Match year and month-in-sample"

    display "Generate id"
    gen id = _n

    display "Generate Sorting variables"
    display "Sort vars"
    local sort mym intmonth state hhid hhnum
    local sortnf `sort' lineno
    local sortf `sort' famnum lineno
    local sort94 `sort' famnum lineno serial
    if ( `1' <= 1984 ) {
        local sort `sortnf'
    }
    if ( `1' > 1984 & `1' <= 1994 ) {
        local sort `sortf'
    }
    if ( `1' > 1994 ) {
        local sort `sort94'
    }
    display "sort `sort'"
    sort `sort' id

    display "Household/family-level match"
    ** WARNING: This merge will generate extra observations
    ** if obs with duplicate merge variables aren't eliminated.
    ** See http://www.stata.com/support/faqs/data/merge.html
    local prevyear=`1'-1

    ** Matching is not possible due to sample redesigns between
    **   Jul to Dec 1984 4s to their 1985 8s
    **   Jan to Sep 1985 4s to their 1986 8s
    **   Jun to Dec 1994 4s to their 1995 8s
    **   Jan to Aug 1995 4s to their 1996 8s
    display "Match minsamp 8 households to minsamp 4s"
    **( 79:8s don't have matches )
    if (`1' > 1979 ) {
        capture describe
        scalar obs_pre = r(N)
        display obs_pre " observations prior to the household match"
        *** match files are created using matchYYYY.do
        display "Merging `1' 8s to 4s using match`prevyear'4.dta "
        by `sort': gen dup = _n
        tab dup
        merge `sort' using /homes/data/morg/match/match`1'.dta
        display "Drop unmatched obs from using data"
        drop if _merge==2
        capture describe
        sort `sort' id
        by `sort' id : gen dup4 = _n
        tab dup4
        drop if dup4 > 1
        tab dup dup4
        scalar obs_post = r(N)
        display obs_post " observations after household match "
        assert obs_pre == obs_post
        drop _merge
    }

    ** Create dummy variables to verify that sex, race, and age match.
    ** A value of 1 indicates a match. Zero means no match.
    gen byte sexdif = sex == msex
    gen byte racedif = race == mrace
    ** Fix race coding scheme for 88:4 and 89:8
    if ( `1'==1988 & minsamp == 4 ) {
        replace racedif=1 if race==3 & mrace>3 & mrace<.
    }
    else if ( `1'==1989 & minsamp == 8 ) {
        replace racedif=1 if race>3 & race<. & mrace==3
    }
    ** In 2003, greatly expanded race categories were used.
    ** Over 98% of 2003:8 chose one race category.
    tab mrace if `1'==2003 & minsamp==8 & race > 3 & race < .
    gen byte age_mage=age-mage
    ** Ages count as matching when age-mage lies between -1 and 3 inclusive
    gen byte agedif=(age_mage>=-1 & age_mage<=3 )
    gen byte match=0
    replace match=1 if sexdif+racedif+agedif==3

    ** Make long personid string
    tostring intmonth, gen( pid_intmonth ) format( %02.0f )
    format hhid %015s
    tostring lineno, gen( pid_lineno ) format( %02.0f )
    ** Race and sex are the same, but the matching age should be included
    tostring mage, gen( pid_mage ) format( %02.0f )
    replace pid_mage = "99" if mage==.

    ** Match variables
    ** Create match variables for each time period
    ** A '_428' variable is used to match minsamp 4 to minsamp 8
    ** An '_824' variable is used to match minsamp 8 to minsamp 4
    local match_428 year minsamp pid_intmonth state hhid hhnum
    local match_824 mym pid_intmonth state hhid hhnum
    local mid79_428 `match_428' pid_lineno sex race age pid_mage
    local mid79_824 `match_824' pid_lineno sex race pid_mage age
    local mid85_428 `match_428' famnum pid_lineno sex race age pid_mage
    local mid85_824 `match_824' famnum pid_lineno sex race pid_mage age
    local mid89_824 `match_824' famnum pid_lineno sex pid_race pid_mage age
    local mid94_428 `match_428' famnum pid_lineno pid_ser sex race age pid_mage
    local mid94_824 `match_824' famnum pid_lineno pid_ser sex race pid_mage age

    ** Pre-1984 matches ( includes 84:8 which matches to 83:4 )
    if (`1' < 1984 | ( `1' == 1984 & minsamp==8 ) ) {
        local match_428 `match_428' `mid79_428'
        display "`2' match_428 is " `"`match_428'"'
        local match_824 `match_824' `mid79_824'
        display "`2' match_824 is " `"`match_824'"'
    }
    if (`1' == 1984 & minsamp==4 ) {
        local match_428 `mid85_428'
    }
    if ((`1'>1984 & `1'<1989)|( `1'==1989 & minsamp==8 )|(`1'>1989 & `1'<=1994)) {
        local match_428 `match_428' `mid85_428'
        local match_824 `match_824' `mid85_824'
    }

    ** 88:4's have a single "3 = Other" race code. Their matches,
    ** 89:8's, split into three race codes, American Indian, API, and Other
    if (`1' == 1989 & minsamp==8 ) {
        gen byte pid_race = race
        replace pid_race=3 if minsamp==8 & race>3 & race<. & mrace==3
        ** Use the 1989 list, the only one that includes pid_race
        local match_824 `match_824' `mid89_824'
    }

    ** Serial suffix (extra unit id) became available in 1994 for matching
    if (`1' > 1994 ) {
        ** Make uppercase and remove leading and trailing blanks
        replace serial = upper(trim( serial ))
        ** Make numeric version of serial
        egen pid_serial = group( serial )
        ** A=2 rather than 1 b/c the missing value -1 becomes 1. So, make A=1, etc.
        assert pid_ser == 2 if serial == "A"
        replace pid_ser = pid_ser - 1
        local match_428 `match_428' `mid94_428'
        local match_824 `match_824' `mid94_824'
    }

    egen str35 match_428 = concat(`match_428')
    egen str35 match_824= concat(`match_824')
    gen matchid = match_428
    replace matchid = match_824 if minsamp == 8
    drop pid_*
    display "time is $S_TIME"
end

*aef2 1979 79 79_83
*aef2 1980 80 79_83
*aef2 1981 81 79_83
*aef2 1982 82 79_83
aef2 1983 83 79_83
*aef2 1984 84 84_88
*aef2 1985 85 84_88
*aef2 1986 86 84_88
*aef2 1987 87 84_88
*aef2 1988 88 84_88
*aef2 1989 89 89_93
*aef2 1990 90 89_93
*aef2 1991 91 89_93
*aef2 1992 92 89_93
*aef2 1993 93 89_93
*aef2 1994 94 94_97
*aef2 1995 95 94_97
*aef2 1996 96 94_97
*aef2 1997 97 94_97
*aef2 1998 98 98
*aef2 1999 99 98
*aef2 2000 00 98
*aef2 2001 01 98
*aef2 2002 02 98
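
** ------------------------------------------------------------------
** Illustrative sketch, not part of the original matching program.
** It checks the mym arithmetic used in aef2 on two toy observations,
** assuming (as the formulas suggest) that mym encodes the partner
** record as year*10 + minsamp: a year-Y minsamp 4 points at (Y+1, 8)
** and a year-Y minsamp 8 points at (Y-1, 4).
** The program name mym_demo and its toy data are hypothetical.
** Nothing runs unless mym_demo is called explicitly, and calling it
** clears the data in memory.
** ------------------------------------------------------------------
capture program drop mym_demo
program define mym_demo
    clear
    set obs 2
    generate byte minsamp = 4 in 1
    replace minsamp = 8 in 2
    ** Same formulas as in aef2
    generate int mym = `1' * 10 + 18 if minsamp == 4
    replace mym = `1' * 10 - 6 if minsamp == 8
    ** Compare against the assumed year*10 + minsamp encoding
    assert mym == (`1' + 1) * 10 + 8 if minsamp == 4
    assert mym == (`1' - 1) * 10 + 4 if minsamp == 8
    list minsamp mym
end
** Example call, left commented out like the other years above:
*mym_demo 1983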