------------------------------------------------------------------------------
       log:  /bbkinghome/molitor/afink/baby boom/births_data_nber/4_births_dat
> a-uncleaned_stata/clean_natality_1940_1968.log
  log type:  text
 opened on:  12 Jun 2008, 12:15:50

. set more off

.
. *************************************************************************
. /* program to clean us vital statistics births data, 1940-1968 */
.
. * original data: http://www.nber.org/vital-stats-books/
. * fields: county births by residence, by race*urbanicity*attendant as availa
> ble
. * data documentation: see natality_documentation.xls
. * data entry: digital divide data (http://www.digitaldividedata.com/)
. * data development generously funded by nia grant number p30 ag012810, throu
> gh the nber
. * nb: do not re-sort natality`year'.dta datasets before running this code
. *************************************************************************
.
. **
. *1940 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1940_2.cv.pdf
. *table 7
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear

. use natality1940.dta

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      4885    146.4178    16.12847        119        174
       state |         0
      county |         0
city_balan~l |         0
births__to~l |      4882    1235.222    6792.624          1     196088
-------------+--------------------------------------------------------
births_of_~t |      4877    724.3279    4677.438          1     170351
births_of_~0 |      4870    411.4166    2259.372          1      62216
births_of_~1 |      2877    160.9746    1086.948          1      25984
births_of_~2 |      2435    12.73183    83.92497          1       3357

.
desc

Contains data from natality1940.dta
  obs:         4,885
 vars:             9
 size:       586,200 (94.4% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str26  %26s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 long   %12.0g
births_of_res~1 int    %8.0g
births_of_res~2 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(4885 real changes made)

. replace state=lower(state)
(4885 real changes made)

. replace city_balance_total=lower(city_balance_total)
(4885 real changes made)

.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
    3

. replace births__total=0 if births__total==.
(3 real changes made)

. count if births_of_residents_of_area__att==.
    8

. replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a
> tt==.
(8 real changes made)

. count if births_of_residents_of_area__at0==.
   15

. replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a
> t0==.
(15 real changes made)

. count if births_of_residents_of_area__at1==.
 2008

. replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a
> t1==.
(2008 real changes made)

. count if births_of_residents_of_area__at2==.
 2450

. replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a
> t2==.
(2450 real changes made)

.
. *check that all pdf pages appear to be in the data
. sort page__of_pdf_

. gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1]
(1 missing value generated)

. assert temp==0|temp==1|temp==.

. drop temp

.
.
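The five count/replace pairs and the page-continuity check above follow one pattern, so they could be written more compactly. A minimal sketch, assuming the raw variable names reported by desc for natality1940.dta. It was not part of the logged run, and it uses missing() where the log tests ==., which would also catch extended missing values (.a to .z) if any exist:

```stata
* sketch only: not part of the logged run
* assumes the raw natality1940.dta variables are in memory

* recode blank cells to zero, reporting how many were changed per column
foreach v of varlist births__total births_of_residents_of_area__at* {
    quietly count if missing(`v')
    display as text "`v': " as result r(N) as text " blanks recoded to 0"
    quietly replace `v' = 0 if missing(`v')
}

* page numbers must repeat or step up by exactly one once sorted
sort page__of_pdf_
assert inlist(page__of_pdf_ - page__of_pdf_[_n-1], 0, 1) if _n > 1
```

The wildcard births_of_residents_of_area__at* is meant to match the four attendant columns (att, at0, at1, at2); it is worth confirming with desc that no other variable shares that prefix before relying on it.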
*clean data entry errors . *none found as of yet . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total +--------------------------------------------------------------+ 1285. | page__~_ | state | county | city_b~l | births~l | births~t | | 133 | iowa | mills | total | 219 | 78 | |--------------------------------------------------------------| | births~0 | births~1 | births~2 | temp | | 41 | 0 | 0 | 119 | +--------------------------------------------------------------+ +--------------------------------------------------------------+ 1372. | page__~_ | state | county | city_b~l | births~l | births~t | | 134 | kansas | edwards | total | 106 | 42 | |--------------------------------------------------------------| | births~0 | births~1 | births~2 | temp | | 83 | 0 | 1 | 126 | +--------------------------------------------------------------+ . 
*checked .pdf, these are data errors not data entry errors
. drop temp

.
. *clean and label variables
. rename page__of_pdf_ page_of_pdf

. label var page_of_pdf "page of pdf"

. label var state "state"

. label var county "county"

. rename city_balance_total sub_county

. label var sub_county "city/balance/total"

. rename births__total births

. label var births "births by residence"

. rename births_of_residents_of_area__att births_h_p

. label var births_h "births by residence: physician (in hospital)"

. rename births_of_residents_of_area__at0 births_nh_p

. label var births_nh_p "births by residence: physician (not in hospital)"

. rename births_of_residents_of_area__at1 births_m

. label var births_m "births by residence: midwife"

. rename births_of_residents_of_area__at2 births_o

. label var births_o "births by residence: other and not specified"

.
. *generate year variable
. gen year=1940

. label var year "year"

.
. *check that observations are unique
. egen tag=tag(state county sub_county)

. assert tag==1

. drop tag

.
. list in 1/1

     +-----------------------------------------------------------------------+
  1. | page_o~f |   state | county | sub_co~y | births | bir~_h_p | bir~nh_p |
     |      119 | alabama |  total |    total |  62938 |    12971 |    29465 |
     |-------------------------------------------------+---------------------|
     |               births_m |               births_o |                year |
     |                  20038 |                    464 |                1940 |
     +-----------------------------------------------------------------------+

.
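The uniqueness check above works because egen's tag() marks only the first observation in each state/county/sub_county group, so tag==1 on every row means every group has exactly one row. A hedged sketch of an alternative using built-in commands, assuming the renamed variables from this step are in memory; it was not part of the logged run:

```stata
* sketch only: not part of the logged run
* isid exits with an error if the key does not uniquely identify observations;
* missok is added on the assumption that an empty-string key value
* should not by itself fail the check
isid state county sub_county, missok

* to inspect offending rows rather than stop at the first failure
duplicates report state county sub_county
duplicates list state county sub_county
```

isid gives a pass/fail answer in one line, while the duplicates commands are the more useful choice when a failure needs to be traced back to specific pages of the source pdf.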
desc Contains data from natality1940.dta obs: 4,885 vars: 10 size: 605,740 (94.2% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str26 %26s city/balance/total births long %12.0g births by residence births_h_p long %12.0g births by residence: physician (in hospital) births_nh_p long %12.0g births by residence: physician (not in hospital) births_m int %8.0g births by residence: midwife births_o int %8.0g births by residence: other and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf | 4885 146.4178 16.12847 119 174 state | 0 county | 0 sub_county | 0 births | 4885 1234.463 6790.606 0 196088 -------------+-------------------------------------------------------- births_h_p | 4885 723.1417 4673.697 0 170351 births_nh_p | 4885 410.1533 2256.015 0 62216 births_m | 4885 94.80532 837.8477 0 25984 births_o | 4885 6.346366 59.58772 0 3357 year | 4885 1940 0 1940 1940 . saveold clean_natality1940.dta, replace file clean_natality1940.dta saved . clear . . ** . *1941 data . ** . *http://cdc.gov/nchs/data/vsus/vsus_1941_2.pdf . *table 7 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1941.dta . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page__of_p~_ | 4885 152.6072 16.23427 125 181 state | 0 county | 0 city_balan~l | 0 births__to~l | 4882 1322.107 7297.707 1 210303 -------------+-------------------------------------------------------- births_of_~t | 4881 845.4882 5358.74 1 188386 births_of_~0 | 4870 377.2694 2096.331 1 55209 births_of_~1 | 2770 167.1588 1116.975 1 26118 births_of_~2 | 2354 11.6181 73.86958 1 2664 . desc Contains data from natality1941.dta obs: 4,885 vars: 9 size: 586,200 (94.4% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str26 %26s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 long %12.0g births_of_res~1 int %8.0g births_of_res~2 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (4885 real changes made) . replace state=lower(state) (4885 real changes made) . replace city_balance_total=lower(city_balance_total) (4885 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 3 . replace births__total=0 if births__total==. (3 real changes made) . count if births_of_residents_of_area__att==. 4 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (4 real changes made) . count if births_of_residents_of_area__at0==. 15 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (15 real changes made) . count if births_of_residents_of_area__at1==. 2115 . 
replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2115 real changes made) . count if births_of_residents_of_area__at2==. 2531 . replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a > t2==. (2531 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births_of_residents_of_area__att=185 if state=="georgia" & county==" > chatham" & city_balance_total=="outside city" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . 
egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total +----------------------------------------------------------------+ 3306. | page__~_ | state | county | city_b~l | births~l | births~t | | 163 | ohio | montgomery | total | 6039 | 5081 | |----------------------------------------------------------------| | births~0 | births~1 | births~2 | temp | | 958 | 0 | 2 | 6041 | +----------------------------------------------------------------+ . *checked .pdf, these are data errors not data entry errors . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . rename births__total births . label var births "births by residence" . 
rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . label var births_o "births by residence: other and not specified" . . *generate year variable . gen year=1941 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county) . assert tag==1 . drop tag . . list in 1/1 +-----------------------------------------------------------------------+ 1. | page_o~f | state | county | sub_co~y | births | bir~_h_p | bir~nh_p | | 125 | alabama | total | total | 64379 | 15984 | 28386 | |-------------------------------------------------+---------------------| | births_m | births_o | year | | 19590 | 419 | 1941 | +-----------------------------------------------------------------------+ . desc Contains data from natality1941.dta obs: 4,885 vars: 10 size: 605,740 (94.2% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str26 %26s city/balance/total births long %12.0g births by residence births_h_p long %12.0g births by residence: physician (in hospital) births_nh_p long %12.0g births by residence: physician (not in hospital) births_m int %8.0g births by residence: midwife births_o int %8.0g births by residence: other and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf Note: dataset has changed since last saved . 
sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf | 4885 152.6072 16.23427 125 181 state | 0 county | 0 sub_county | 0 births | 4885 1321.295 7295.539 0 210303 -------------+-------------------------------------------------------- births_h_p | 4885 844.8 5356.599 0 188386 births_nh_p | 4885 376.111 2093.213 0 55209 births_m | 4885 94.78608 845.1099 0 26118 births_o | 4885 5.598567 51.60066 0 2664 year | 4885 1941 0 1941 1941 . saveold clean_natality1941.dta, replace file clean_natality1941.dta saved . clear . . ** . *1942 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1942_2.cv.pdf . *table 11 (section b, counties) . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1942.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 4885 159.6565 7.851608 146 173 state | 0 county | 0 city_balan~l | 0 births__to~l | 4883 1490.05 8324.222 1 244802 -------------+-------------------------------------------------------- births_of_~t | 4883 1049.81 6587.745 1 225623 births_of_~0 | 4864 344.2632 1938.921 1 50997 births_of_~1 | 2653 168.4757 1100.394 1 26045 births_of_~2 | 2342 12.0538 74.82616 1 2661 . desc Contains data from natality1942.dta obs: 4,885 vars: 9 size: 586,200 (94.4% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str26 %26s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 long %12.0g births_of_res~1 int %8.0g births_of_res~2 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . 
replace county=lower(county) (4885 real changes made) . replace state=lower(state) (4885 real changes made) . replace city_balance_total=lower(city_balance_total) (4885 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 2 . replace births__total=0 if births__total==. (2 real changes made) . count if births_of_residents_of_area__att==. 2 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (2 real changes made) . count if births_of_residents_of_area__at0==. 21 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (21 real changes made) . count if births_of_residents_of_area__at1==. 2232 . replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2232 real changes made) . count if births_of_residents_of_area__at2==. 2543 . replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a > t2==. (2543 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births__total=354 if state=="texas" & county=="nolan" & city_balance > _total=="total" (1 real change made) . replace births_of_residents_of_area__at0=86 if state=="texas" & county=="nol > an" & city_balance_total=="total" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . 
replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . replace county="cambria" if county=="cambri's" & state=="pennsylvania" (3 real changes made) . replace county="pawnee" if county=="pawnac" & state=="oklahoma" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total . *no errors . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . 
label var births_o "births by residence: other and not specified" . . *generate year variable . gen year=1942 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county) . assert tag==1 . drop tag . . list in 1/1 +-----------------------------------------------------------------------+ 1. | page_o~f | state | county | sub_co~y | births | bir~_h_p | bir~nh_p | | 146 | alabama | total | total | 71136 | 21220 | 30048 | |-------------------------------------------------+---------------------| | births_m | births_o | year | | 18511 | 1357 | 1942 | +-----------------------------------------------------------------------+ . desc Contains data from natality1942.dta obs: 4,885 vars: 10 size: 605,740 (94.2% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str26 %26s city/balance/total births long %12.0g births by residence births_h_p long %12.0g births by residence: physician (in hospital) births_nh_p long %12.0g births by residence: physician (not in hospital) births_m int %8.0g births by residence: midwife births_o int %8.0g births by residence: other and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page_of_pdf | 4885 159.6565 7.851608 146 173 state | 0 county | 0 sub_county | 0 births | 4885 1489.444 8322.571 0 244802 -------------+-------------------------------------------------------- births_h_p | 4885 1049.38 6586.431 0 225623 births_nh_p | 4885 342.7873 1934.879 0 50997 births_m | 4885 91.49765 815.1944 0 26045 births_o | 4885 5.778915 52.15325 0 2661 year | 4885 1942 0 1942 1942 . saveold clean_natality1942.dta, replace file clean_natality1942.dta saved . clear . . ** . *1943 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1943_2.cv.pdf . *table 11 (section b, counties) . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1943.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 4885 160.155 8.916113 0 174 state | 0 county | 0 city_balan~l | 0 births__to~l | 4881 1559.23 8636.618 3 248627 -------------+-------------------------------------------------------- births_of_~t | 4881 1160.669 7052.628 2 232907 births_of_~0 | 4851 304.6834 1751.711 1 43780 births_of_~1 | 2710 163.2162 1078.723 1 25480 births_of_~2 | 2262 11.07118 74.35113 1 3055 . desc Contains data from natality1943.dta obs: 4,885 vars: 9 size: 586,200 (94.4% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str26 %26s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 long %12.0g births_of_res~1 int %8.0g births_of_res~2 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . 
replace county=lower(county) (4885 real changes made) . replace state=lower(state) (4885 real changes made) . replace city_balance_total=lower(city_balance_total) (4885 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 4 . replace births__total=0 if births__total==. (4 real changes made) . count if births_of_residents_of_area__att==. 4 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (4 real changes made) . count if births_of_residents_of_area__at0==. 34 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (34 real changes made) . count if births_of_residents_of_area__at1==. 2175 . replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2175 real changes made) . count if births_of_residents_of_area__at2==. 2623 . replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a > t2==. (2623 real changes made) . . *check that all pdf pages appear to be in the data . replace page__of_pdf_=168 if state=="south carolina" & county=="total" (1 real change made) . replace page__of_pdf_=168 if state=="south carolina" & county=="edgefield" (1 real change made) . replace page__of_pdf_=169 if state=="texas" & county=="armstrong" (1 real change made) . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace state="louisiana" if state=="louislana" (83 real changes made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . 
replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total . *no errors . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . 
label var births_o "births by residence: other and not specified"
. 
. *generate year variable
. gen year=1943
. label var year "year"
. 
. *check that observations are unique
. egen tag=tag(state county sub_county)
. assert tag==1
. drop tag
. 
. list in 1/1

     +-----------------------------------------------------------------------+
  1. | page_o~f |   state | county | sub_co~y | births | bir~_h_p | bir~nh_p |
     |      146 | alabama |  total |    total |  77535 |    26733 |    31082 |
     |-------------------------------------------------+---------------------|
     |        births_m       |        births_o       |          year         |
     |           19353       |             367       |          1943         |
     +-----------------------------------------------------------------------+

. desc

Contains data from natality1943.dta
  obs:         4,885
 vars:            10
 size:       605,740 (94.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf     int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str26  %26s                   city/balance/total
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                (in hospital)
births_nh_p     long   %12.0g                 births by residence: physician
                                                (not in hospital)
births_m        int    %8.0g                  births by residence: midwife
births_o        int    %8.0g                  births by residence: other and
                                                not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
 page_of_pdf |      4885    160.2583     7.985757       146        174
       state |         0
      county |         0
  sub_county |         0
      births |      4885    1557.954     8633.196         0     248627
-------------+--------------------------------------------------------
  births_h_p |      4885    1159.719     7049.818         0     232907
 births_nh_p |      4885    302.5627     1745.787         0      43780
    births_m |      4885    90.54575      807.476         0      25480
    births_o |      4885     5.12651     50.88867         0       3055
        year |      4885        1943            0      1943       1943

. saveold clean_natality1943.dta, replace
file clean_natality1943.dta saved

. clear

. 
. **
. *1944 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1944_2.cv.pdf
. *table 11 (section b, counties)
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear

. use natality1944.dta

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      4885    144.2583     7.985757       130        158
       state |         0
      county |         0
city_balan~l |         0
births__to~l |      4881     1483.01     8145.039         2     229534
-------------+--------------------------------------------------------
births_of_~t |      4881    1153.404     6825.157         1     218085
births_of_~0 |      4834    243.1752     1438.545         1      34276
births_of_~1 |      2604    157.5733     1023.425         1      23361
births_of_~2 |      2142    10.72876     63.50272         1       2355

. desc

Contains data from natality1944.dta
  obs:         4,885
 vars:             9
 size:       586,200 (94.4% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str26  %26s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 long   %12.0g
births_of_res~1 int    %8.0g
births_of_res~2 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

.
replace county=lower(county) (4885 real changes made) . replace state=lower(state) (4885 real changes made) . replace city_balance_total=lower(city_balance_total) (4885 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 4 . replace births__total=0 if births__total==. (4 real changes made) . count if births_of_residents_of_area__att==. 4 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (4 real changes made) . count if births_of_residents_of_area__at0==. 51 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (51 real changes made) . count if births_of_residents_of_area__at1==. 2281 . replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2281 real changes made) . count if births_of_residents_of_area__at2==. 2743 . replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a > t2==. (2743 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace state="indiana" if state=="illinois" & county=="total" & births__tot > al==71354 (1 real change made) . replace state="indiana" if state=="illinois" & county=="adams" & births__tot > al==443 (1 real change made) . replace state="indiana" if state=="illinois" & county=="allen" & (births__to > tal==3503 | births__total==2716 | births__total==787) (3 real changes made) . replace state="indiana" if state=="illinois" & county=="bartholomew" & (birt > hs__total==681 | births__total==356 | births__total==325) (3 real changes made) . 
replace state="indiana" if state=="illinois" & county=="benton" & births__to > tal==213 (1 real change made) . replace state="indiana" if state=="illinois" & county=="blackford" & births_ > _total==230 (1 real change made) . replace state="indiana" if state=="illinois" & county=="boone" & births__tot > al==450 (1 real change made) . replace state="indiana" if state=="illinois" & county=="brown" & births__tot > al==91 (1 real change made) . replace state="indiana" if state=="illinois" & county=="carroll" & births__t > otal==293 (1 real change made) . replace state="indiana" if state=="illinois" & county=="cass" & (births__tot > al==683 | births__total==405 | births__total==278) (3 real changes made) . replace state="indiana" if state=="illinois" & county=="clark" & (births__to > tal==894 | births__total==367 | births__total==527) (3 real changes made) . replace state="indiana" if state=="illinois" & county=="clay" & births__tota > l==403 (1 real change made) . replace state="indiana" if state=="illinois" & county=="clinton" & (births__ > total==550 | births__total==276 | births__total==274) (3 real changes made) . replace state="indiana" if state=="illinois" & county=="crawford" & births__ > total==164 (1 real change made) . replace state="indiana" if state=="illinois" & county=="daviess" & births__t > otal==582 (1 real change made) . replace state="indiana" if state=="illinois" & county=="dearborn" & births__ > total==414 (1 real change made) . replace state="indiana" if state=="illinois" & county=="decatur" & births__t > otal==358 (1 real change made) . replace state="indiana" if state=="illinois" & county=="de kalb" & births__t > otal==501 (1 real change made) . replace state="iowa" if state=="indiana" & county=="total" & births__total== > 46564 (1 real change made) . replace state="iowa" if state=="indiana" & county=="adair" & births__total== > 235 (1 real change made) . 
replace state="iowa" if state=="indiana" & county=="adams" & births__total== > 172 (1 real change made) . replace state="iowa" if state=="indiana" & county=="allamakee" & births__tot > al==309 (1 real change made) . replace state="iowa" if state=="indiana" & county=="appanoose" & births__tot > al==402 (1 real change made) . replace state="iowa" if state=="indiana" & county=="audubon" & births__total > ==214 (1 real change made) . replace state="iowa" if state=="indiana" & county=="benton" & births__total= > =397 (1 real change made) . replace state="iowa" if state=="indiana" & county=="black hawk" & (births__t > otal==1634 | births__total==1053 | births__total==581) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="boone" & (births__total= > =456 | births__total==198 | births__total==258) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="bremer" & births__total= > =310 (1 real change made) . replace state="iowa" if state=="indiana" & county=="buchanan" & births__tota > l==376 (1 real change made) . replace state="iowa" if state=="indiana" & county=="buena vista" & births__t > otal==324 (1 real change made) . replace state="iowa" if state=="indiana" & county=="butler" & births__total= > =285 (1 real change made) . replace state="iowa" if state=="indiana" & county=="calhoun" & births__total > ==342 (1 real change made) . replace state="iowa" if state=="indiana" & county=="carroll" & births__total > ==487 (1 real change made) . replace state="iowa" if state=="indiana" & county=="cass" & births__total==3 > 66 (1 real change made) . replace state="iowa" if state=="indiana" & county=="cedar" & births__total== > 278 (1 real change made) . replace state="iowa" if state=="indiana" & county=="cerro gordo" & (births__ > total==763 | births__total==461 | births__total==302) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="cherokee" & births__tota > l==308 (1 real change made) . 
replace state="iowa" if state=="indiana" & county=="chickasaw" & births__tot > al==274 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clarke" & births__total= > =148 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clay" & births__total==3 > 59 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clayton" & births__total > ==398 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clinton" & (births__tota > l==861 | births__total==529 | births__total==332) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="crawford" & births__tota > l==380 (1 real change made) . replace state="iowa" if state=="indiana" & county=="dallas" & births__total= > =412 (1 real change made) . replace state="iowa" if state=="indiana" & county=="davis" & births__total== > 164 (1 real change made) . replace state="iowa" if state=="indiana" & county=="decatur" & births__total > ==249 (1 real change made) . replace state="iowa" if state=="indiana" & county=="delaware" & births__tota > l==350 (1 real change made) . replace state="iowa" if state=="indiana" & county=="des moines" & (births__t > otal==786 | births__total==591 | births__total==195) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="dickinson" & births__tot > al==191 (1 real change made) . replace state="iowa" if state=="indiana" & county=="dubuque" & (births__tota > l==1245 | births__total==820 | births__total==425) (3 real changes made) . replace state="iowa" if state=="indiana" & county=="emmet" & births__total== > 282 (1 real change made) . replace state="iowa" if state=="indiana" & county=="fayette" & births__total > ==494 (1 real change made) . replace state="iowa" if state=="indiana" & county=="floyd" & births__total== > 420 (1 real change made) . replace state="iowa" if state=="indiana" & county=="franklin" & births__tota > l==302 (1 real change made) . 
replace state="iowa" if state=="indiana" & county=="fremont" & births__total > ==244 (1 real change made) . replace state="iowa" if state=="indiana" & county=="greene" & births__total= > =258 (1 real change made) . replace state="iowa" if state=="indiana" & county=="grundy" & births__total= > =226 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="total" & > births__total==51467 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="abbeville > " & births__total==425 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="aiken" & > births__total==1293 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="allendale > " & births__total==338 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="anderson" > & (births__total==1935 | births__total==645 | births__total==1290) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="bamberg" > & births__total==428 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="barnwell" > & births__total==518 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="beaufort" > & births__total==739 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="berkeley" > & births__total==845 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="calhoun" > & births__total==412 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="charlesto > n" & (births__total==5392 | births__total==2621 | births__total==2771) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="cherokee" > & births__total==786 (1 real change made) . 
replace state="south carolina" if state=="pennsylvania" & county=="chester" > & births__total==735 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="chesterfi > eld" & births__total==952 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="clarendon > " & births__total==784 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="colleton" > & births__total==791 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="darlingto > n" & births__total==1258 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="dillon" & > births__total==819 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="dorcheste > r" & births__total==684 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="edgefield > " & births__total==418 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="fairfield > " & births__total==590 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="florence" > & (births__total==2064 | births__total==652 | births__total==1412) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="georgetow > n" & births__total==840 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="greenvill > e" & (births__total==3382 | births__total==1483 | births__total==1899) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="greenwood > " & (births__total==884 | births__total==354 | births__total==530) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="hampton" > & births__total==506 (1 real change made) . 
replace state="south carolina" if state=="pennsylvania" & county=="horry" & > births__total==1513 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="jasper" & > births__total==331 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="kershaw" > & births__total==784 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="lancaster > " & births__total==801 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="laurens" > & births__total==775 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="lee" & bi > rths__total==666 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="lexington > " & births__total==957 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="mccormick > " & births__total==269 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="marion" & > births__total==849 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="marlboro" > & births__total==891 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="newberry" > & births__total==765 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="oconee" & > births__total==831 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="orangebur > g" & (births__total==1917 | births__total==373 | births__total==1544) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="pickens" > & births__total==864 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="richland" > & (births__total==3162 | births__total==2175 | births__total==987) (3 real changes made) . 
replace state="south carolina" if state=="pennsylvania" & county=="saluda" & > births__total==309 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="spartanbu > rg" & (births__total==3057 | births__total==929 | births__total==2128) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="sumter" & > (births__total==1427 | births__total==466 | births__total==961) (3 real changes made) . replace state="south carolina" if state=="pennsylvania" & county=="union" & > births__total==690 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="williamsb > urg" & births__total==1265 (1 real change made) . replace state="south carolina" if state=="pennsylvania" & county=="york" & ( > births__total==1526 | births__total==474 | births__total==1052) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="total" & birth > s__total==68272 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="anderson" & bi > rths__total==862 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="bedford" & bir > ths__total==515 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="benton" & birt > hs__total==230 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="bledsoe" & bir > ths__total==220 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="blount" & birt > hs__total==1349 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="bradley" & (bi > rths__total==737 | births__total==323 | births__total==414) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="campbell" & bi > rths__total==999 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="cannon" & birt > hs__total==236 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="carroll" & bir > ths__total==565 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="carter" & birt > hs__total==906 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="cheatham" & bi > rths__total==200 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="chester" & bir > ths__total==237 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="claiborne" & b > irths__total==623 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="clay" & births > __total==208 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="cocke" & birth > s__total==609 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="coffee" & birt > hs__total==539 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="crockett" & bi > rths__total==385 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="cumberland" & > births__total==466 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="davidson" & (b > irths__total==5543 | births__total==4411 | births__total==1132) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="decatur" & bir > ths__total==239 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="de kalb" & bir > ths__total==240 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="dickson" & bir > ths__total==448 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="dyer" & (birth > s__total==801 | births__total==215 | births__total==586) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="fayette" & bir > ths__total==793 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="fentress" & bi > rths__total==426 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="franklin" & bi > rths__total==595 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="gibson" & birt > hs__total==1074 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="giles" & birth > s__total==609 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="grainger" & bi > rths__total==294 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="greene" & birt > hs__total==834 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="grundy" & birt > hs__total==349 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="hamblen" & bir > ths__total==466 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="hamilton" & (b > irths__total==4247 | births__total==3219 | births__total==1028) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="hancock" & bir > ths__total==228 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="hardeman" & bi > rths__total==540 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="hardin" & birt > hs__total==380 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="hawkins" & bir > ths__total==726 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="haywood" & bir > ths__total==661 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="henderson" & b > irths__total==389 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="henry" & birth > s__total==485 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="hickman" & bir > ths__total==272 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="houston" & bir > ths__total==129 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="humphreys" & b > irths__total==236 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="jackson" & bir > ths__total==346 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="jefferson" & b > irths__total==426 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="johnson" & bir > ths__total==288 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="knox" & (birth > s__total==4359 | births__total==2794 | births__total==1565) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="lake" & births > __total==313 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="lauderdale" & > births__total==609 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="lawrence" & bi > rths__total==765 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="lewis" & birth > s__total==125 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="lincoln" & bir > ths__total==563 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="loudon" & birt > hs__total==548 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="mcminn" & birt > hs__total==745 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="mcnairy" & bir > ths__total==455 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="macon" & birth > s__total==289 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="madison" & (bi > rths__total==1255 | births__total==587 | births__total==668) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="marion" & birt > hs__total==530 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="marshall" & bi > rths__total==357 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="maury" & (birt > hs__total==831 | births__total==231 | births__total==600) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="meigs" & birth > s__total==148 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="monroe" & birt > hs__total==672 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="montgomery" & > (births__total==701 | births__total==259 | births__total==442) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="moore" & birth > s__total==71 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="morgan" & birt > hs__total==384 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="obion" & birth > s__total==556 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="overton" & bir > ths__total==397 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="perry" & birth > s__total==155 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="pickett" & bir > ths__total==104 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="polk" & births > __total==380 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="putnam" & birt > hs__total==565 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="rhea" & births > __total==421 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="roane" & birth > s__total==859 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="robertson" & b > irths__total==561 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="rutherford" & > births__total==785 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="scott" & birth > s__total==457 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="sequatchie" & > births__total==151 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="sevier" & birt > hs__total==610 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="shelby" & (bir > ths__total==8103 | births__total==6254 | births__total==1849) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="smith" & birth > s__total==251 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="stewart" & bir > ths__total==222 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="sullivan" & (b > irths__total==2058 | births__total==359 | births__total==689 | births__total > ==1010) (4 real changes made) . replace state="tennessee" if state=="south dakota" & county=="sumner" & birt > hs__total==595 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="tipton" & birt > hs__total==789 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="trousdale" & b > irths__total==132 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="unicoi" & birt > hs__total==379 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="union" & birth > s__total==178 (1 real change made) . 
replace state="tennessee" if state=="south dakota" & county=="van buren" & b > irths__total==91 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="warren" & birt > hs__total==455 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="washington" & > (births__total==1178 | births__total==504 | births__total==874) (3 real changes made) . replace state="tennessee" if state=="south dakota" & county=="wayne" & birth > s__total==316 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="weakley" & bir > ths__total==472 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="white" & birth > s__total==411 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="williamson" & > births__total==491 (1 real change made) . replace state="tennessee" if state=="south dakota" & county=="wilson" & birt > hs__total==480 (1 real change made) . replace state="texas" if state=="south dakota" & county=="total" & births__t > otal==165900 (1 real change made) . replace state="texas" if state=="south dakota" & county=="anderson" & (birth > s__total==664 | births__total==272 | births__total==392) (3 real changes made) . replace state="texas" if state=="south dakota" & county=="andrews" & births_ > _total==34 (1 real change made) . replace state="texas" if state=="south dakota" & county=="angelina" & births > __total==739 (1 real change made) . replace state="texas" if state=="south dakota" & county=="aransas" & births_ > _total==74 (1 real change made) . replace state="texas" if state=="south dakota" & county=="archer" & births__ > total==118 (1 real change made) . replace state="texas" if state=="south dakota" & county=="armstrong" & birth > s__total==51 (1 real change made) . replace state="texas" if state=="south dakota" & county=="atascosa" & births > __total==607 (1 real change made) . 
replace state="texas" if state=="south dakota" & county=="austin" & births__ > total==282 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bailey" & births__ > total==188 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bandera" & births_ > _total==46 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bastrop" & births_ > _total==575 (1 real change made) . replace state="texas" if state=="south dakota" & county=="baylor" & births__ > total==147 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bee" & births__tot > al==457 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bell" & (births__t > otal==1400 | births__total==551 | births__total==849) (3 real changes made) . replace state="texas" if state=="south dakota" & county=="bexar" & (births__ > total==11515 | births__total==10064 | births__total==1451) (3 real changes made) . replace state="texas" if state=="south dakota" & county=="blanco" & births__ > total==74 (1 real change made) . replace state="texas" if state=="south dakota" & county=="borden" & births__ > total==12 (1 real change made) . replace state="texas" if state=="south dakota" & county=="bosque" & births__ > total==248 (1 real change made) . replace state="rhode island" if state=="pennsylvania" & county=="total" & bi > rths__total==13754 (1 real change made) . replace state="rhode island" if state=="pennsylvania" & county=="bristol" & > (births__total==481 | births__total==212 | births__total==269) (3 real changes made) . replace state="rhode island" if state=="pennsylvania" & county=="kent" & (bi > rths__total==1290 | births__total==664 | births__total==359 | births__total= > =267) (4 real changes made) . replace state="rhode island" if state=="pennsylvania" & county=="newport" & > (births__total==1256 | births__total==792 | births__total==464) (3 real changes made) . 
replace state="rhode island" if state=="pennsylvania" & county=="providence" > & (births__total==9872 | births__total==461 | births__total==818 | births__ > total==178 | births__total==627 | births__total==211 | births__total==168 | > births__total==260 | births__total==1319| births__total==4510 | births__tota > l==894 | births__total==426) (12 real changes made) . replace state="rhode island" if state=="pennsylvania" & county=="washington" > & (births__total==855 | births__total==302 | births__total==553) (3 real changes made) . replace state="south dakota" if state=="pennsylvania" & county=="total" & bi > rths__total==12769 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="armstrong" > & births__total==0 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="aurora" & b > irths__total==96 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="beadle" & ( > births__total==370 | births__total==207 | births__total==163) (3 real changes made) . replace state="south dakota" if state=="pennsylvania" & county=="bennett" & > births__total==60 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="bon homme" > & births__total==148 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="brookings" > & births__total==304 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="brown" & (b > irths__total==590 | births__total==370 | births__total==220) (3 real changes made) . replace state="south dakota" if state=="pennsylvania" & county=="brule" & bi > rths__total==120 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="buffalo" & > births__total==64 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="butte" & bi > rths__total==168 (1 real change made) . 
replace state="south dakota" if state=="pennsylvania" & county=="campbell" & > births__total==91 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="charles mix > " & births__total==224 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="clark" & bi > rths__total==180 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="clay" & bir > ths__total==139 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="codington" > & (births__total==384 | births__total==237 | births__total==147) (3 real changes made) . replace state="south dakota" if state=="pennsylvania" & county=="corson" & b > irths__total==156 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="custer" & b > irths__total==84 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="davison" & > (births__total==280 | births__total==189 | births__total==91) (3 real changes made) . replace state="south dakota" if state=="pennsylvania" & county=="day" & birt > hs__total==299 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="deuel" & bi > rths__total==139 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="dewey" & bi > rths__total==117 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="douglas" & > births__total==139 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="edmunds" & > births__total==143 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="fall river" > & births__total==222 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="faulk" & bi > rths__total==84 (1 real change made) . 
replace state="south dakota" if state=="pennsylvania" & county=="grant" & bi > rths__total==213 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="gregory" & > births__total==199 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="haakon" & b > irths__total==83 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hamlin" & b > irths__total==119 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hand" & bir > ths__total==130 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hanson" & b > irths__total==110 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="harding" & > births__total==69 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hughes" & b > irths__total==149 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hutchinson" > & births__total==226 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="hyde" & bir > ths__total==53 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="jackson" & > births__total==27 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="jerauld" & > births__total==85 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="jones" & bi > rths__total==50 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="kingsbury" > & births__total==209 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="lake" & bir > ths__total==245 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="lawrence" & > births__total==241 (1 real change made) . 
replace state="south dakota" if state=="pennsylvania" & county=="lincoln" & > births__total==233 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="lyman" & bi > rths__total==93 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="mccook" & b > irths__total==172 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="mcpherson" > & births__total==126 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="marshall" & > births__total==148 (1 real change made) . replace state="south dakota" if state=="pennsylvania" & county=="meade" & bi > rths__total==170 (1 real change made) . replace births_of_residents_of_area__att=328 if state=="tennessee" & county= > ="washington" & city_balance_total=="balance of county" & births_of_resident > s_of_area__att==528 (1 real change made) . replace births__total=674 if state=="tennessee" & county=="washington" & cit > y_balance_total=="balance of county" & births__total==874 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . 
replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=76265 if state=="new jersey" & county=="total" & city_ > balance_total=="total" & births__total==76285 (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total +-----------------------------------------------------------------+ 2653. | page__~_ | state | county | city_b~l | births~l | births~t | | 145 | new jersey | total | total | 76265 | 70513 | |-----------------------------------------------------------------| | births~0 | births~1 | births~2 | temp | | 4745 | 973 | 54 | 76285 | +-----------------------------------------------------------------+ . *no errors . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . rename births__total births . label var births "births by residence" . 
rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . label var births_o "births by residence: other and not specified" . . *generate year variable . gen year=1944 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county) . assert tag==1 . drop tag . . list in 1/1 +-----------------------------------------------------------------------+ 1. | page_o~f | state | county | sub_co~y | births | bir~_h_p | bir~nh_p | | 130 | alabama | total | total | 74415 | 29269 | 26617 | |-------------------------------------------------+---------------------| | births_m | births_o | year | | 18306 | 223 | 1944 | +-----------------------------------------------------------------------+ . desc Contains data from natality1944.dta obs: 4,885 vars: 10 size: 605,740 (94.2% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str26 %26s city/balance/total births long %12.0g births by residence births_h_p long %12.0g births by residence: physician (in hospital) births_nh_p long %12.0g births by residence: physician (not in hospital) births_m int %8.0g births by residence: midwife births_o int %8.0g births by residence: other and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf Note: dataset has changed since last saved . 
sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf | 4885 144.2583 7.985757 130 158 state | 0 county | 0 sub_county | 0 births | 4885 1481.751 8141.779 0 229534 -------------+-------------------------------------------------------- births_h_p | 4885 1152.418 6822.445 0 218085 births_nh_p | 4885 240.6364 1431.228 0 34276 births_m | 4885 83.99611 751.2707 0 23361 births_o | 4885 4.704401 42.38062 0 2355 year | 4885 1944 0 1944 1944 . saveold clean_natality1944.dta, replace file clean_natality1944.dta saved . clear . . ** . *1945 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1945_2.pdf . *table 28 (section b, counties) . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1945.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 4885 477.5961 7.645452 464 491 state | 0 county | 0 city_balan~l | 0 births__to~l | 4882 1455.364 8120.335 1 234754 -------------+-------------------------------------------------------- births_of_~t | 4881 1174.145 6991.714 1 225644 births_of_~0 | 4808 198.8752 1195.39 1 28481 births_of_~1 | 2587 149.2462 971.2593 1 22032 births_of_~2 | 2184 9.837912 61.8076 1 2420 . desc Contains data from natality1945.dta obs: 4,885 vars: 9 size: 576,430 (94.5% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str26 %26s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g births_of_res~1 int %8.0g births_of_res~2 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . 
replace county=lower(county) (4885 real changes made) . replace state=lower(state) (4885 real changes made) . replace city_balance_total=lower(city_balance_total) (4885 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 3 . replace births__total=0 if births__total==. (3 real changes made) . count if births_of_residents_of_area__att==. 4 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (4 real changes made) . count if births_of_residents_of_area__at0==. 77 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (77 real changes made) . count if births_of_residents_of_area__at1==. 2298 . replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2298 real changes made) . count if births_of_residents_of_area__at2==. 2701 . replace births_of_residents_of_area__at2=0 if births_of_residents_of_area__a > t2==. (2701 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births_of_residents_of_area__at0=111 if state=="mississippi" & count > y=="lafayette" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at1=192 if state=="mississippi" & count > y=="lafayette" & city_balance_total=="total" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (53 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (1 real change made) . 
replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davie" if county=="davis" & state=="north carolina" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (1 real change made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (1 real change made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births_of_residents_of_area__att=203 if births_of_residents_of_area > __att==200 & state=="mississippi" & county=="alcorn" & city_balance_total==" > total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+births_of_residents_of_area__at0+b > irths_of_residents_of_area__at1+births_of_residents_of_area__at2 . list if temp!=births__total +---------------------------------------------------------------+ 2166. | page__~_ | state | county | city_bala~l | births~l | | 476 | mississippi | alcorn | total | 575 | |--------------------------------------+------------------------| | births~t | births~0 | births~1 | births~2 | temp | | 203 | 296 | 78 | 1 | 578 | +---------------------------------------------------------------+ +---------------------------------------------------------------+ 4048. 
| page__~_ | state | county | city_bala~l | births~l | | 486 | texas | total | total | 167915 | |--------------------------------------+------------------------| | births~t | births~0 | births~1 | births~2 | temp | | 108520 | 28481 | 18494 | 2420 | 157915 | +---------------------------------------------------------------+ +---------------------------------------------------------------+ 4630. | page__~_ | state | county | city_bala~l | births~l | | 489 | washington | walla walla | total | 786 | |--------------------------------------+------------------------| | births~t | births~0 | births~1 | births~2 | temp | | 777 | 8 | 0 | 0 | 785 | +---------------------------------------------------------------+ +---------------------------------------------------------------+ 4631. | page__~_ | state | county | city_bala~l | births~l | | 489 | washington | walla walla | walla walla | 559 | |--------------------------------------+------------------------| | births~t | births~0 | births~1 | births~2 | temp | | 554 | 5 | 0 | 1 | 560 | +---------------------------------------------------------------+ . *checked .pdf, these are data errors not data entry errors . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . label var births_o "births by residence: other and not specified" . . *generate year variable . 
gen year=1945
. label var year "year"
. 
. *check that observations are unique
. egen tag=tag(state county sub_county)
. assert tag==1
. drop tag
. 
. list in 1/1

     +-----------------------------------------------------------------------+
  1. | page_o~f |   state | county | sub_co~y | births | bir~_h_p | bir~nh_p |
     |      464 | alabama |  total |    total |  70321 |    30300 |    22179 |
     |-------------------------------------------------+---------------------|
     | births_m               | births_o               | year                |
     |    17686               |      156               | 1945                |
     +-----------------------------------------------------------------------+

. desc

Contains data from natality1945.dta
  obs:         4,885
 vars:            10
 size:       595,970 (94.3% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf     int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str26  %26s                   city/balance/total
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                (in hospital)
births_nh_p     int    %8.0g                  births by residence: physician
                                                (not in hospital)
births_m        int    %8.0g                  births by residence: midwife
births_o        int    %8.0g                  births by residence: other and
                                                not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
 page_of_pdf |      4885    477.5961    7.645452        464        491
       state |         0
      county |         0
  sub_county |         0
      births |      4885     1454.47    8117.921          0     234754
-------------+--------------------------------------------------------
  births_h_p |      4885    1173.185    6988.931          0     225644
 births_nh_p |      4885    195.7632    1186.185          0      28481
    births_m |      4885    79.07718    710.6599          0      22032
    births_o |      4885    4.398362    41.61051          0       2420
        year |      4885        1945           0       1945       1945

. saveold clean_natality1945.dta, replace
file clean_natality1945.dta saved
. clear
. 
. **
. *1946 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1946_2.cv.pdf
. *table 2 (section b, counties)
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1946.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      7513    65.61121    26.42683         20        111
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      7503    1393.216    8169.135          2     286546
births_of_~t |      7466    1161.149    7330.507          1     277629
births_of_~0 |      7379    160.9046    983.0692          1      30765
births_of_~1 |      4783    118.2848    728.3526          1      22600

. desc

Contains data from natality1946.dta
  obs:         7,513
 vars:             9
 size:       931,612 (91.1% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str26  %26s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(7512 real changes made)
. replace state=lower(state)
(7513 real changes made)
. replace city_balance_total=lower(city_balance_total)
(7513 real changes made)
. replace race=lower(race)
(7513 real changes made)
. 
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   10
. replace births__total=0 if births__total==.
(10 real changes made)
. 
count if births_of_residents_of_area__att==. 47 . replace births_of_residents_of_area__att=0 if births_of_residents_of_area__a > tt==. (47 real changes made) . count if births_of_residents_of_area__at0==. 134 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area__a > t0==. (134 real changes made) . count if births_of_residents_of_area__at1==. 2730 . replace births_of_residents_of_area__at1=0 if births_of_residents_of_area__a > t1==. (2730 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace county="oscoda" if county=="osceola" & state=="michigan" & births__t > otal==63 (1 real change made) . replace city_balance_total="atlanta (part)" if city_balance_total=="atlanta" > & county=="fulton" & state=="georgia" & race=="white" & births__total==6020 (1 real change made) . replace city_balance_total="atlanta (part)" if city_balance_total=="atlanta" > & county=="fulton" & state=="georgia" & race=="nonwhite" & births__total==2 > 582 (1 real change made) . replace city_balance_total="atlanta (total)" if city_balance_total=="atlanta > " & county=="fulton" & state=="georgia" & race=="white" & births__total==695 > 1 (1 real change made) . replace city_balance_total="atlanta (total)" if city_balance_total=="atlanta > " & county=="fulton" & state=="georgia" & race=="nonwhite" & births__total== > 2648 (1 real change made) . replace city_balance_total="lafayette" if city_balance_total=="total" & coun > ty=="lafayette" & state=="louisiana" & race=="white" & births__total==474 (1 real change made) . replace city_balance_total="lafayette" if city_balance_total=="total" & coun > ty=="lafayette" & state=="louisiana" & race=="nonwhite" & births__total==214 (1 real change made) . 
replace city_balance_total="rocky mount (part)" if city_balance_total=="tota > l" & county=="edgecombe" & state=="north carolina" & race=="white" & births_ > _total==160 (1 real change made) . replace city_balance_total="rocky mount (part)" if city_balance_total=="tota > l" & county=="edgecombe" & state=="north carolina" & race=="nonwhite" & birt > hs__total==132 (1 real change made) . replace city_balance_total="balance of county" if city_balance_total=="total > " & county=="gaston" & state=="north carolina" & race=="white" & births__tot > al==1686 (1 real change made) . replace city_balance_total="balance of county" if city_balance_total=="total > " & county=="gaston" & state=="north carolina" & race=="nonwhite" & births__ > total==228 (1 real change made) . replace city_balance_total="florence" if city_balance_total=="total" & count > y=="florence" & state=="south carolina" & race=="white" & births__total==365 (1 real change made) . replace city_balance_total="florence" if city_balance_total=="total" & count > y=="florence" & state=="south carolina" & race=="nonwhite" & births__total== > 214 (1 real change made) . replace city_balance_total="navarro" if city_balance_total=="total" & count > y=="navarro" & state=="texas" & race=="white" & births__total==339 (1 real change made) . replace city_balance_total="navarro" if city_balance_total=="total" & count > y=="navarro" & state=="texas" & race=="nonwhite" & births__total==138 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==4037 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==2778 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1259 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5272 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==3745 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==1527 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==1628 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="white" & births__total==1317 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="nonwhite" & births__total==311 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county, ".", "",.) (100 real changes made) . replace county=subinstr(county, "ste ", "st ",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . 
replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . replace county="clark" if county=="clarke" & state=="ohio" (5 real changes made) . replace county="cook" if county=="cooke" & state=="illinois" (23 real changes made) . replace county="delaware" if county=="delavare" & state=="oklahoma" (3 real changes made) . replace county="hancock" if county=="hencock" & state=="mississippi" (3 real changes made) . replace county="love" if county=="lowe" & state=="oklahoma" (1 real change made) . replace county="nobles" if county=="nohles" & state=="minnesota" (1 real change made) . replace county="orleans" if county=="orleane" & state=="vermont" (1 real change made) . replace county="otoe" if county=="otos" & state=="nebraska" (1 real change made) . replace county="penobscot" if county=="penchscot" & state=="maine" (3 real changes made) . replace county="pittsburg" if county=="pittaburg" & state=="oklahoma" (9 real changes made) . replace county="platte" if county=="platts" & state=="nebraska" (1 real change made) . replace county="sherburne" if county=="shorburne" & state=="minnesota" (3 real changes made) . replace county="texas" if county=="teras" & state=="oklahoma" (1 real change made) . replace county="wagoner" if county=="wegoner" & state=="oklahoma" (3 real changes made) . replace county="washington" if county=="weshington" & state=="oklahoma" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=1878 if births__total==1678 & state=="alabama" > & county=="talladega" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=418 if births__total==416 & state=="arkansas" > & county=="drew" & city_balance_total=="total" & race=="t > otal" (1 real change made) . 
replace births__total=13787 if births__total==13767 & state=="idaho" > & county=="total" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=482 if births__total==492 & state=="illinois" > & county=="perry" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=478 if births__total==476 & state=="iowa" > & county=="dallas" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=265 if births__total==285 & state=="kentucky" > & county=="green" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=379 if births__total==378 & state=="kentucky" > & county=="rowan" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=862 if births__total==962 & state=="michigan" > & county=="gratiot" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=2228 if births__total==2226 & state=="missouri" > & county=="greene" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=734 if births__total==754 & state=="new york" > & county=="tioga" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=1182 if births__total==1162 & state=="oregon" > & county=="klamath" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=887 if births__total==987 & state=="texas" > & county=="williamson" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=563 if births__total==1126 & state=="utah" > & county=="box elder" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=1296 if births__total==1295 & state=="wisconsin" > & county=="eau claire" & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births__total=86 if births__total==85 & state=="wisconsin" > & county=="florence" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=551 if births__total==651 & state=="wyoming" > & county=="sweetwater" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births_of_residents_of_area__att=138 if births_of_residents_of_ar > ea__att==139 & state=="florida" & county=="washington" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=123 if births_of_residents_of_ar > ea__att==125 & state=="georgia" & county=="mcduffie" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=288 if births_of_residents_of_ar > ea__att==298 & state=="georgia" & county=="wayne" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=428 if births_of_residents_of_ar > ea__att==426 & state=="iowa" & county=="dallas" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=258 if births_of_residents_of_ar > ea__att==259 & state=="iowa" & county=="humboldt" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=365 if births_of_residents_of_ar > ea__att==366 & state=="louisiana" & county=="franklin" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1328 if births_of_residents_of_ar > ea__att==1329 & state=="maryland" & county=="washington" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=7105 if births_of_residents_of_ar > ea__att==7106 & state=="massachusetts" & county=="norfolk" & city_b > alance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=318 if births_of_residents_of_ar > ea__att==319 & state=="minnesota" & county=="mille lacs" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=628 if births_of_residents_of_ar > ea__att==629 & state=="minnesota" & county=="morrison" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=66 if births_of_residents_of_ar > ea__att==86 & state=="montana" & county=="mccone" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=126 if births_of_residents_of_ar > ea__att==128 & state=="montana" & county=="phillips" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=175 if births_of_residents_of_ar > ea__att==176 & state=="montana" & county=="pondera" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=438 if births_of_residents_of_ar > ea__att==439 & state=="nebraska" & county=="madison" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=9448 if births_of_residents_of_ar > ea__att==9449 & state=="new jersey" & county=="bergen" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=37 if births_of_residents_of_ar > ea__att==57 & state=="new mexico" & county=="catron" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=119 if births_of_residents_of_ar > ea__att==118 & state=="new mexico" & county=="union" > & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=4135 if births_of_residents_of_ar > ea__att==4136 & state=="new york" & county=="broome" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=2423 if births_of_residents_of_ar > ea__att==2425 & state=="new york" & county=="chautauqua" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=386 if births_of_residents_of_ar > ea__att==366 & state=="north dakota" & county=="barnes" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=106 if births_of_residents_of_ar > ea__att==108 & state=="oklahoma" & county=="dewey" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=483 if births_of_residents_of_ar > ea__att==493 & state=="south carolina" & county=="lexington" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=580 if births_of_residents_of_ar > ea__att==590 & state=="south carolina" & county=="marion" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=5738 if births_of_residents_of_ar > ea__att==5758 & state=="tennessee" & county=="davidson" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=38 if births_of_residents_of_ar > ea__att==39 & state=="texas" & county=="kimble" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=561 if births_of_residents_of_ar > ea__att==563 & state=="utah" & county=="box elder" > & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=289 if births_of_residents_of_ar > ea__att==299 & state=="west virginia" & county=="greenbrier" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=338 if births_of_residents_of_ar > ea__att==339 & state=="wyoming" & county=="big horn" > & city_balance_total=="total" & race=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=228 if births__total==226 & state=="mississippi" & co > unty=="yalobusha" & race=="nonwhite" & city_balance_total=="total" (1 real change made) . replace births__total=105 if births__total==103 & state=="montana" > & county=="rosebud" & race=="white" & city_balance_total=="t > otal" (1 real change made) . replace births__total=368 if births__total==369 & state=="new jersey" & co > unty=="atlantic" & race=="nonwhite" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__att=56 if births_of_residents_of_ar > ea__att==66 & state=="louisiana" & county=="assumption" > & race=="nonwhite" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__att=268 if births_of_residents_of_ar > ea__att==269 & state=="south carolina" & county=="colleton" > & race=="white" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c > ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count > y=="new york" & state=="new york"|county=="queens" & state=="new york"|count > y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county= > ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"| > county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta > te=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf . label var page_of_pdf "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h "births by residence: physician (in hospital)" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician (not in hospital)" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1946 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
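Note (not part of the logged run): the uniqueness check above builds a tag variable and asserts on it. A minimal alternative sketch, assuming the renamed variables state, county, sub_county and race exist as in the 1946 block, uses Stata's built-in isid and duplicates commands. The variable name dup is hypothetical and chosen only for illustration.

```stata
* isid exits with an error if the listed variables do not jointly
* identify every observation (same intent as egen tag + assert tag==1)
isid state county sub_county race

* if the check fails, these commands help locate the offending rows
duplicates report state county sub_county race
duplicates tag state county sub_county race, gen(dup)
list state county sub_county race births if dup>0
drop dup
```

If isid fails it halts the do-file, so the duplicates commands are best run interactively when diagnosing a failure.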
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~f |   state | county | sub_co~y |  race | births | bir~_h_p |
     |       20 | alabama |  total |    total | total |  79863 |    39317 |
     |--------------------------------------------------------------------|
     | bir~nh_p             | births_m             | year                 |
     |    22310             |    18067             | 1946                 |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1946.dta
  obs:         7,513
 vars:            10
 size:       961,664 (90.8% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf     int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str26  %26s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                (in hospital)
births_nh_p     int    %8.0g                  births by residence: physician
                                                (not in hospital)
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
 page_of_pdf |      7513    65.61121    26.42683         20        111
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      7513    1391.274    8163.868          0     286546
  births_h_p |      7513     1153.87    7308.112          0     277629
 births_nh_p |      7513    158.0347    974.4944          0      30765
    births_m |      7513    75.30361    583.9023          0      22600
        year |      7513        1946           0       1946       1946

. saveold clean_natality1946.dta, replace
file clean_natality1946.dta saved
. clear
. 
. **
. *1947 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1947_2.cv.pdf
. *table 1
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1947.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      7606     47.5848    26.42533          2         93
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      7596    2033.006    11799.38       -786     323250
births_att~_ |      7573    1735.844    10799.37       -439     315008
births_att~t |      7449    197.9707    1202.921          1      29837
births_att~e |      4818    160.3412    1023.076          1      25675

. desc

Contains data from natality1947.dta
  obs:         7,606
 vars:             9
 size:       950,750 (90.9% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   byte   %8.0g
state           str24  %24s
county          str50  %50s
city_balance_~l str26  %26s
race            str8   %8s
births__total   long   %12.0g
births_attend~_ long   %12.0g
births_attend~t int    %8.0g
births_attend~e int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(7606 real changes made)
. replace state=lower(state)
(7606 real changes made)
. replace city_balance_total=lower(city_balance_total)
(7606 real changes made)
. replace race=lower(race)
(7606 real changes made)
. 
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   10
. replace births__total=0 if births__total==.
(10 real changes made)
. count if births_attended_by_physician_in_==.
   33
. replace births_attended_by_physician_in_=0 if births_attended_by_physician_i
> n_==.
(33 real changes made)
. count if births_attended_by_physician_not==.
  157
. 
replace births_attended_by_physician_not=0 if births_attended_by_physician_n > ot==. (157 real changes made) . count if births_attended_by_midwife==. 2788 . replace births_attended_by_midwife=0 if births_attended_by_midwife==. (2788 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . append using natality1947_append.dta . *one county that was not originally entered . replace county=lower(county) (3 real changes made) . replace state=lower(state) (3 real changes made) . replace city_balance_total=lower(city_balance_total) (3 real changes made) . replace race=lower(race) (3 real changes made) . replace county="oscoda" if county=="ouceola" & state=="michigan" & births__t > otal==84 (1 real change made) . replace state="virginia" if state=="vieginia" (24 real changes made) . replace state="virginia" if state=="virginia-cont." (1 real change made) . replace state="virginia" if state=="independent cities" (32 real changes made) . replace state="virginia" if state=="independent cities-cont." (31 real changes made) . replace state="west virginia" if state=="west vitginia" (36 real changes made) . replace city_balance_total="district 1511, center hill" if state=="georgia" > & county=="fulton" & city_balance_total=="total" & race=="white" & births__t > otal==0 (1 real change made) . replace city_balance_total="district 1511, center hill" if state=="georgia" > & county=="fulton" & city_balance_total=="total" & race=="nonwhite" & births > __total==0 (1 real change made) . replace city_balance_total="centralia" if city_balance_total=="centrulia" & > county=="clinton" & state=="illinois" (1 real change made) . replace births__total=834 if state=="massachusetts" & county=="middlesex" & > city_balance_total=="arlington (town)" & race=="total" (1 real change made) . 
replace births__total=786 if state=="california" & county=="san mateo" & cit
> y_balance_total=="san mateo" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=439 if state=="michigan" & county==
> "macomb" & city_balance_total=="st. clair shores" & race=="total"
(1 real change made)
. replace births__total=382 if state=="illinois" & county=="boone" & city_bala
> nce_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=866 if state=="indiana" & county=="
> bartholomew" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=332 if state=="iowa" & county=="chi
> ckasew" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=1832 if state=="michigan" & county=
> ="monroe" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_midwife=72 if state=="arizona" & county=="santa c
> ruz" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_midwife=5 if state=="arizona" & county=="yavapai"
> & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_not=38 if state=="michigan" & county=="
> kalkaska" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=4891 if state=="michigan" & county=
> ="kent" & city_balance_total=="grend rapids" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=266 if state=="ohio" & county=="mah
> oning" & city_balance_total=="struthers" & race=="total"
(1 real change made)
. replace births_attended_by_physician_not=40 if state=="oklahoma" & county=="
> kay" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births__total=820 if state=="illinois" & county=="kane" & city_balan
> ce_total=="elgin" & race=="total" & births__total==620
(1 real change made)
. replace births__total=1473 if state=="florida" & city_balance_total=="pensac
> ola" & race=="total" & births__total==1472
(1 real change made)
. replace births__total=738 if state=="louisiana" & county=="avoyelles" & city
> _balance_total=="total" & births__total==739
(1 real change made)
. replace births__total=1387 if state=="louisiana" & county=="caddo" & city_ba
> lance_total=="shreveport" & births__total==1397
(1 real change made)
. replace births__total=86 if state=="michigan" & county=="lake" & city_balanc
> e_total=="total" & race=="white" & births__total==96
(1 real change made)
. replace births__total=10539 if state=="ohio" & county=="hamilton" & city_bal
> ance_total=="cincinnati" & race=="white" & births__total==10639
(1 real change made)
. replace births__total=1288 if state=="south carolina" & county=="orangeburg"
> & city_balance_total=="balance" & race=="nonwhite" & births__total==1298
(1 real change made)
. replace births_attended_by_physician_in_=482 if county=="lawrence" & state==
> "alabama" & city_balance_total=="total" & race=="white"
(1 real change made)
. replace births__total=2318 if state=="california" & county=="fresno" & city_
> balance_total=="fresno" & race=="total"
(1 real change made)
. replace births__total=1080 if state=="california" & county=="los angeles" &
> city_balance_total=="south gate" & race=="total"
(1 real change made)
. replace births__total=1528 if state=="california" & county=="monterey" & cit
> y_balance_total=="balance" & race=="total"
(1 real change made)
. replace births__total=1143 if state=="colorado" & county=="weld" & city_bala
> nce_total=="balance" & race=="total"
(1 real change made)
. replace births__total=418 if state=="connecticut" & county=="new haven" & ci
> ty_balance_total=="ansonia" & race=="total"
(1 real change made)
. replace births__total=582 if state=="idaho" & county=="ada" & city_balance_t
> otal=="balance" & race=="total"
(1 real change made)
. replace births__total=865 if state=="illinois" & county=="champaign" & city_
> balance_total=="champaign" & race=="total"
(1 real change made)
. replace births__total=914 if state=="illinois" & county=="champaign" & city_
> balance_total=="balance" & race=="total"
(1 real change made)
. replace births__total=284 if state=="illinois" & county=="fulton" & city_bal
> ance_total=="canton" & race=="total"
(1 real change made)
. replace births__total=816 if state=="michigan" & county=="monroe" & city_bal
> ance_total=="monroe" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=803 if state=="michigan" & county==
> "monroe" & city_balance_total=="monroe" & race=="total"
(1 real change made)
. replace births__total=275 if state=="new jersey" & county=="passaic" & city_
> balance_total=="hawthorne" & race=="total"
(1 real change made)
. replace births__total=1739 if state=="new york" & county=="saratoga" & city_
> balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=468 if state=="california" & county
> =="los angeles" & city_balance_total=="total" & race=="nonwhite" & births_at
> tended_by_physician_in_==458
(1 real change made)
. replace births_attended_by_physician_in_=38 if state=="florida" & county=="p
> olk" & city_balance_total=="lakeland" & race=="nonwhite" & births_attended_b
> y_physician_in_==39
(1 real change made)
. replace births_attended_by_physician_in_=18 if state=="georgia" & county=="c
> olquitt" & city_balance_total=="total" & race=="nonwhite" & births_attended_
> by_physician_in_==16
(1 real change made)
. replace births_attended_by_physician_in_=486 if state=="georgia" & county=="
> colquitt" & city_balance_total=="balance" & race=="total" & births_attended_
> by_physician_in_==466
(1 real change made)
. replace births_attended_by_physician_in_=628 if state=="illinois" & county==
> "st. clair" & city_balance_total=="total" & race=="nonwhite" & births_attend
> ed_by_physician_in_==629
(1 real change made)
. replace births_attended_by_physician_in_=853 if state=="louisiana" & county=
> ="st. landry" & city_balance_total=="total" & race=="white" & births_attende
> d_by_physician_in_==653
(1 real change made)
. replace births_attended_by_physician_in_=218 if state=="mississippi" & count
> y=="coahoma" & city_balance_total=="clarksdele" & race=="total" & births_att
> ended_by_physician_in_==219
(1 real change made)
. replace births_attended_by_physician_in_=189 if state=="texas" & county=="an
> derson" & city_balance_total=="palestine" & race=="white" & births_attended_
> by_physician_in_==129
(1 real change made)
. replace births_attended_by_physician_in_=1086 if state=="california" & count
> y=="los angeles" & city_balance_total=="compton" & race=="total" & births_at
> tended_by_physician_in_==1096
(1 real change made)
. replace births_attended_by_physician_in_=502 if state=="california" & county
> =="los angeles" & city_balance_total=="san gabriel" & race=="total" & births
> _attended_by_physician_in_==602
(1 real change made)
. replace births_attended_by_physician_in_=2895 if state=="california" & count
> y=="san mateo" & city_balance_total=="balance" & race=="total" & births_atte
> nded_by_physician_in_==2695
(1 real change made)
. replace births_attended_by_physician_in_=611 if state=="illinois" & county==
> "jefferson" & city_balance_total=="total" & race=="total" & births_attended_
> by_physician_in_==811
(1 real change made)
. replace births_attended_by_physician_in_=1148 if state=="iowa" & county=="ce
> rro gordo" & city_balance_total=="total" & race=="total" & births_attended_b
> y_physician_in_==1149
(1 real change made)
.
replace births_attended_by_physician_in_=491 if state=="massachusetts" & cou
> nty=="middlesex" & city_balance_total=="wakefield (town)" & race=="total" &
> births_attended_by_physician_in_==493
(1 real change made)
. replace births_attended_by_physician_in_=889 if state=="michigan" & county==
> "kalamazoo" & city_balance_total=="balance" & race=="total" & births_attende
> d_by_physician_in_==899
(1 real change made)
. replace births_attended_by_physician_in_=4881 if state=="michigan" & county=
> ="kent" & city_balance_total=="grend rapids" & race=="total" & births_attend
> ed_by_physician_in_==4891
(1 real change made)
. replace births_attended_by_physician_in_=498 if state=="michigan" & county==
> "midlend" & city_balance_total=="balance" & race=="total" & births_attended_
> by_physician_in_==499
(1 real change made)
. replace births_attended_by_physician_in_=1888 if state=="michigan" & county=
> ="muskegon" & city_balance_total=="muskegon" & race=="total" & births_attend
> ed_by_physician_in_==1968
(1 real change made)
. replace births_attended_by_physician_in_=1584 if state=="minnesota" & county
> =="st. loufa" & city_balance_total=="balance" & race=="total" & births_atten
> ded_by_physician_in_==1684
(1 real change made)
. replace births_attended_by_physician_in_=158 if state=="nebraska" & county==
> "hall" & city_balance_total=="balance" & race=="total" & births_attended_by_
> physician_in_==159
(1 real change made)
. replace births_attended_by_physician_in_=319 if state=="south dakota" & coun
> ty=="davison" & city_balance_total=="mitohell" & race=="total" & births_atte
> nded_by_physician_in_==318
(1 real change made)
. replace births_attended_by_physician_in_=1468 if state=="washington" & count
> y=="snohomish" & city_balance_total=="balance" & race=="total" & births_atte
> nded_by_physician_in_==1469
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="pima" &
> births__total==147
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="pima" & bir
> ths__total==403
(1 real change made)
. replace city_balance_total="el centro" if race=="nonwhite" & county=="imperi
> al" & births__total==49
(1 real change made)
. replace city_balance_total="el centro" if race=="white" & county=="imperial"
> & births__total==337
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="imperial
> " & births__total==110
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="imperial" &
> births__total==1027
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="los ange
> les" & births__total==524
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="los angeles
> " & births__total==17919
(1 real change made)
. replace city_balance_total="los angeles" if race=="nonwhite" & county=="los 
> angeles" & births__total==4971
(1 real change made)
. replace city_balance_total="los angeles" if race=="white" & county=="los ang
> eles" & births__total==42064
(1 real change made)
. replace city_balance_total="moultric" if race=="nonwhite" & county=="colquit
> t" & births__total==133
(1 real change made)
. replace city_balance_total="moultric" if race=="white" & county=="colquitt"
> & births__total==261
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="colquitt
> " & births__total==120
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="colquitt" &
> births__total==563
(1 real change made)
. replace city_balance_total="atlanta (part)" if race=="white" & county=="de k
> alb" & births__total==859
(1 real change made)
. replace city_balance_total="atlanta (part)" if race=="nonwhite" & county=="d
> e kalb" & births__total==82
(1 real change made)
. replace city_balance_total="decatur" if race=="white" & county=="de kalb" &
> births__total==464
(1 real change made)
. replace city_balance_total="decatur" if race=="nonwhite" & county=="de kalb"
> & births__total==106
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="de kalb" &
> births__total==1355
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="de kalb"
> & births__total==251
(1 real change made)
. replace city_balance_total="albany" if race=="white" & county=="dougherty" &
> births__total==491
(1 real change made)
. replace city_balance_total="albany" if race=="nonwhite" & county=="dougherty
> " & births__total==338
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="dougherty"
> & births__total==90
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="doughert
> y" & births__total==111
(1 real change made)
. replace city_balance_total="rome" if race=="white" & county=="floyd" & birth
> s__total==801
(1 real change made)
. replace city_balance_total="rome" if race=="nonwhite" & county=="floyd" & bi
> rths__total==172
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="floyd" & bi
> rths__total==692
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="floyd" &
> births__total==49
(1 real change made)
. replace city_balance_total="atlanta (part)" if race=="white" & county=="fult
> on" & births__total==5204
(1 real change made)
. replace city_balance_total="atlanta (part)" if race=="nonwhite" & county=="f
> ulton" & births__total==3020
(1 real change made)
. replace city_balance_total="atlanta (total)" if race=="white" & county=="ful
> ton" & births__total==6063
(1 real change made)
. replace city_balance_total="atlanta (total)" if race=="nonwhite" & county=="
> fulton" & births__total==3102
(1 real change made)
. replace city_balance_total="east point" if race=="white" & county=="fulton"
> & births__total==434
(1 real change made)
. replace city_balance_total="east point" if race=="nonwhite" & county=="fulto
> n" & births__total==75
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="fulton" & b
> irths__total==3379
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="fulton"
> & births__total==639
(1 real change made)
. replace city_balance_total="brunswick" if race=="white" & county=="glynn" &
> births__total==415
(1 real change made)
. replace city_balance_total="brunswick" if race=="nonwhite" & county=="glynn"
> & births__total==217
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="glynn" & bi
> rths__total==158
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="glynn" &
> births__total==50
(1 real change made)
. replace city_balance_total="gainesville" if race=="white" & county=="hall" &
> births__total==302
(1 real change made)
. replace city_balance_total="gainesville" if race=="nonwhite" & county=="hall
> " & births__total==95
(1 real change made)
. replace city_balance_total="valdosta" if race=="white" & county=="lowndos" &
> births__total==369
(1 real change made)
. replace city_balance_total="valdosta" if race=="nonwhite" & county=="lowndos
> " & births__total==217
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="lowndos" &
> births__total==254
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="lowndos"
> & births__total==202
(1 real change made)
. replace city_balance_total="elgin (part)" if race=="total" & county=="kane"
> & births__total==820
(1 real change made)
. replace city_balance_total="elgin (total)" if race=="total" & county=="kane"
> & births__total==832
(1 real change made)
.
replace city_balance_total="centralia (part)" if race=="total" & county=="ma
> rion" & births__total==323
(1 real change made)
. replace city_balance_total="centralia (part)" if race=="total" & county=="cl
> inton" & births__total==6
(1 real change made)
. replace city_balance_total="centralia (total)" if race=="total" & county=="m
> arion" & births__total==329
(1 real change made)
. replace city_balance_total="monroe" if race=="white" & county=="ouachita" &
> births__total==589
(1 real change made)
. replace city_balance_total="monroe" if race=="nonwhite" & county=="ouachita"
> & births__total==378
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="ouachita" &
> births__total==615
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="ouachita
> " & births__total==303
(1 real change made)
. replace city_balance_total="alexandria" if race=="white" & county=="rapides"
> & births__total==663
(1 real change made)
. replace city_balance_total="alexandria" if race=="nonwhite" & county=="rapid
> es" & births__total==424
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="rapides" &
> births__total==1041
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="rapides"
> & births__total==405
(1 real change made)
. replace city_balance_total="st. cloud (part)" if race=="total" & county=="sh
> erburne" & births__total==37
(1 real change made)
. replace city_balance_total="st. cloud (part)" if race=="total" & county=="st
> earns" & births__total==727
(1 real change made)
. replace city_balance_total="st. cloud (part)" if race=="total" & county=="be
> nton" & births__total==80
(1 real change made)
. replace city_balance_total="st. cloud (total)" if race=="total" & county=="s
> tearns" & births__total==844
(1 real change made)
. replace city_balance_total="new rochelle" if race=="white" & county=="westch
> ester" & births__total==1070
(1 real change made)
. replace city_balance_total="new rochelle" if race=="nonwhite" & county=="wes
> tchester" & births__total==170
(1 real change made)
. replace city_balance_total="ossining" if race=="white" & county=="westcheste
> r" & births__total==283
(1 real change made)
. replace city_balance_total="ossining" if race=="nonwhite" & county=="westche
> ster" & births__total==28
(1 real change made)
. replace city_balance_total="galveston" if race=="white" & county=="galveston
> " & births__total==1524
(1 real change made)
. replace city_balance_total="galveston" if race=="nonwhite" & county=="galves
> ton" & births__total==503
(1 real change made)
. replace city_balance_total="denison" if race=="white" & county=="grayson" &
> births__total==501
(1 real change made)
. replace city_balance_total="denison" if race=="nonwhite" & county=="grayson"
> & births__total==73
(1 real change made)
. replace city_balance_total="sherman" if race=="white" & county=="grayson" &
> births__total==450
(1 real change made)
. replace city_balance_total="sherman" if race=="nonwhite" & county=="grayson"
> & births__total==48
(1 real change made)
. replace city_balance_total="longview" if race=="white" & county=="gregg" & b
> irths__total==521
(1 real change made)
. replace city_balance_total="longview" if race=="nonwhite" & county=="gregg"
> & births__total==126
(1 real change made)
. replace city_balance_total="salinas" if race=="white" & county=="monterey" &
> births__total==1042
(1 real change made)
. replace city_balance_total="salinas" if race=="nonwhite" & county=="monterey
> " & births__total==67
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="sacramento"
> & births__total==2838
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="sacramen
> to" & births__total==207
(1 real change made)
. replace city_balance_total="balance" if race=="white" & county=="san joaquin
> " & births__total==2713
(1 real change made)
. replace city_balance_total="balance" if race=="nonwhite" & county=="san joaq
> uin" & births__total==257
(1 real change made)
. replace city_balance_total="pleasantville" if race=="white" & county=="atlan
> tic" & births__total==253 & city_balance_total=="do"
(1 real change made)
. replace city_balance_total="pleasantville" if race=="nonwhite" & county=="at
> lantic" & births__total==47 & city_balance_total=="do"
(1 real change made)
. replace city_balance_total="hackensack" if race=="white" & county=="bergen"
> & births__total==535 & city_balance_total=="do"
(1 real change made)
. replace city_balance_total="hackensack" if race=="nonwhite" & county=="berge
> n" & births__total==63 & city_balance_total=="do"
(1 real change made)
. replace city_balance_total="greenville" if race=="white" & county=="pitt" &
> births__total==217
(1 real change made)
. replace city_balance_total="greenville" if race=="nonwhite" & county=="pitt"
> & births__total==157
(1 real change made)
. replace city_balance_total="balance" if city_balance_total=="balence"
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="total" & births__total==4442
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="white" & births__total==3000
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="nonwhite" & births__total==1442
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="total" & births__total==5771
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="white" & births__total==3985
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="nonwhite" & births__total==1786
(1 real change made)
.
. *fix county name misspellings (relative to county names in the 1970 census)
. replace county=subinstr(county,"st. ","st ",.)
(90 real changes made)
. replace county=subinstr(county,"ste. ", "st ",.)
(1 real change made)
. replace county="monroe" if county=="monros" & state=="alabama"
(3 real changes made)
. replace county="lassen" if county=="lasson" & state=="california"
(1 real change made)
. replace county="mendocino" if county=="mandocino" & state=="california"
(1 real change made)
. replace county="orange" if county=="orango" & state=="california"
(5 real changes made)
. replace county="san francisco" if county=="san francisco, coextensive with s
> an francisco" & state=="california"
(3 real changes made)
. replace county="san luis obispo" if county=="san luis obiapo" & state=="cali
> fornia"
(1 real change made)
. replace county="siskiyou" if county=="siekiyou" & state=="california"
(1 real change made)
. replace county="stanislaus" if county=="stenislaus" & state=="california"
(3 real changes made)
. replace county="baca" if county=="beca" & state=="colorado"
(1 real change made)
. replace county="denver" if county=="denver, coertensive with denver" & state
> =="colorado"
(1 real change made)
. replace county="hinsdale" if county=="hinodale" & state=="colorado"
(1 real change made)
. replace county="yuma" if county=="yume" & state=="colorado"
(1 real change made)
. replace county="sussex" if county=="susser" & state=="delaware"
(3 real changes made)
. replace county="citrus" if county=="citrue" & state=="florida"
(3 real changes made)
. replace county="desoto" if county=="de soto" & state=="florida"
(3 real changes made)
. replace county="escambia" if county=="escembia" & state=="florida"
(9 real changes made)
. replace county="gadsden" if county=="gadaden" & state=="florida"
(3 real changes made)
.
replace county="gilchrist" if county=="gilchriet" & state=="florida"
(3 real changes made)
. replace county="santa rosa" if county=="santa roan" & state=="florida"
(3 real changes made)
. replace county="sarasota" if county=="sarascta" & state=="florida"
(9 real changes made)
. replace county="dawson" if county=="davson" & state=="georgia"
(1 real change made)
. replace county="jenkins" if county=="jenkine" & state=="georgia"
(3 real changes made)
. replace county="lowndes" if county=="lowndos" & state=="georgia"
(9 real changes made)
. replace county="lumpkin" if county=="lampkin" & state=="georgia"
(1 real change made)
. replace county="meriwether" if county=="merivether" & state=="georgia"
(3 real changes made)
. replace county="putnam" if county=="putnem" & state=="georgia"
(3 real changes made)
. replace county="franklin" if county=="frenklin" & state=="idaho"
(1 real change made)
. replace county="cass" if county=="case" & state=="illinois"
(1 real change made)
. replace county="de witt" if county=="de hitt" & state=="illinois"
(1 real change made)
. replace county="du page" if county=="du pags" & state=="illinois"
(3 real changes made)
. replace county="gallatin" if county=="gallstin" & state=="illinois"
(1 real change made)
. replace county="lawrence" if county=="lewrence" & state=="illinois"
(1 real change made)
. replace county="mclean" if county=="mcleen" & state=="illinois"
(3 real changes made)
. replace county="grant" if county=="grent" & state=="indiana"
(3 real changes made)
. replace county="buchanan" if county=="buchanen" & state=="iowa"
(1 real change made)
. replace county="chickasaw" if county=="chickasew" & state=="iowa"
(1 real change made)
. replace county="hamilton" if county=="hemilton" & state=="iowa"
(1 real change made)
. replace county="jasper" if county=="jeaper" & state=="iowa"
(3 real changes made)
. replace county="marion" if county=="marison" & state=="iowa"
(1 real change made)
. replace county="osceola" if county=="oscaola" & state=="iowa"
(1 real change made)
. replace county="wabaunsee" if county=="webaunsee" & state=="kansas"
(1 real change made)
. replace county="marshall" if county=="marshell" & state=="kentucky"
(1 real change made)
. replace county="oldham" if county=="oldhem" & state=="kentucky"
(3 real changes made)
. replace county="owsley" if county=="owaley" & state=="kentucky"
(1 real change made)
. replace county="claiborne" if county=="claihorne" & state=="louisiana"
(3 real changes made)
. replace county="orleans" if county=="orleans, coertensive with new orleans"
> & state=="louisiana"
(3 real changes made)
. replace county="st helena" if county=="st helema" & state=="louisiana"
(3 real changes made)
. replace county="webster" if county=="webater" & state=="louisiana"
(3 real changes made)
. replace county="west feliciana" if county=="west feliciane" & state=="louisi
> ana"
(3 real changes made)
. replace county="baltimore city" if county=="baltimore (city)" & state=="mary
> land"
(3 real changes made)
. replace county="ingham" if county=="inghem" & state=="michigan"
(3 real changes made)
. replace county="manistee" if county=="menistee" & state=="michigan"
(1 real change made)
. replace county="marquette" if county=="merquette" & state=="michigan"
(3 real changes made)
. replace county="midland" if county=="midlend" & state=="michigan"
(3 real changes made)
. replace county="missaukee" if county=="missankee" & state=="michigan"
(1 real change made)
. replace county="newaygo" if county=="newaygb" & state=="michigan"
(1 real change made)
. replace county="ontonagon" if county=="ontanagon" & state=="michigan"
(1 real change made)
. replace county="osceola" if county=="ouceola" & state=="michigan"
(1 real change made)
. replace county="chisago" if county=="chicago" & state=="minnesota"
(1 real change made)
. replace county="isanti" if county=="isenti" & state=="minnesota"
(1 real change made)
. replace county="jackson" if county=="jeckson" & state=="minnesota"
(1 real change made)
. replace county="pipestone" if county=="pipeotone" & state=="minnesota"
(1 real change made)
. replace county="st louis" if county=="st loufa" & state=="minnesota"
(5 real changes made)
. replace county="traverse" if county=="traveroe" & state=="minnesota"
(1 real change made)
. replace county="wabasha" if county=="weheoha" & state=="minnesota"
(1 real change made)
. replace county="wadena" if county=="wedena" & state=="minnesota"
(1 real change made)
. replace county="waseca" if county=="weseca" & state=="minnesota"
(1 real change made)
. replace county="noxubee" if county=="norubee" & state=="mississippi"
(3 real changes made)
. replace county="st louis city" if county=="st louis (city)" & state=="missou
> ri"
(3 real changes made)
. replace county="deer lodge" if county=="deer lode" & state=="montana"
(3 real changes made)
. replace county="jefferson" if county=="kefferson" & state=="montana"
(1 real change made)
. replace county="holt" if county=="halt" & state=="nebraska"
(1 real change made)
. replace county="knox" if county=="knor" & state=="nebraska"
(1 real change made)
. replace county="morrill" if county=="norrill" & state=="nebraska"
(1 real change made)
. replace county="pawnee" if county=="paunee" & state=="nebraska"
(1 real change made)
. replace county="webster" if county=="webater" & state=="nebraska"
(1 real change made)
. replace county="lea" if county=="loa" & state=="new mexico"
(3 real changes made)
. replace county="chenango" if county=="chemango" & state=="new york"
(1 real change made)
. replace county="forsyth" if county=="foreyth" & state=="north carolina"
(9 real changes made)
. replace county="gates" if county=="getes" & state=="north carolina"
(3 real changes made)
. replace county="greene" if county=="greens" & state=="north carolina"
(3 real changes made)
. replace county="halifax" if county=="halifar" & state=="north carolina"
(3 real changes made)
. replace county="hyde" if county=="hydo" & state=="north carolina"
(3 real changes made)
. replace county="onslow" if county=="onslov" & state=="north carolina"
(3 real changes made)
. replace county="randolph" if county=="rendolph" & state=="north carolina"
(1 real change made)
. replace county="swain" if county=="svain" & state=="north carolina"
(3 real changes made)
. replace county="transylvania" if county=="trannylvania" & state=="north caro
> lina"
(1 real change made)
. replace county="cavalier" if county=="cavaliver" & state=="north dakota"
(1 real change made)
. replace county="bottineau" if county=="bottineeu" & state=="north dakota"
(1 real change made)
. replace county="adams" if county=="adame" & state=="ohio"
(1 real change made)
. replace county="clark" if county=="clarke" & state=="ohio"
(5 real changes made)
. replace county="ross" if county=="roso" & state=="ohio"
(3 real changes made)
. replace county="mcintosh" if county=="mcintoch" & state=="oklahoma"
(3 real changes made)
. replace county="pushmataha" if county=="puahmataha" & state=="oklahoma"
(1 real change made)
. replace county="linn" if county=="lim" & state=="oregon"
(1 real change made)
. replace county="marion" if county=="narion" & state=="oregon"
(3 real changes made)
. replace county="multnomah" if county=="multnomeh" & state=="oregon"
(3 real changes made)
. replace county="sherman" if county=="shermen" & state=="oregon"
(1 real change made)
. replace county="umatilla" if county=="umatille" & state=="oregon"
(1 real change made)
. replace county="wheeler" if county=="whecler" & state=="oregon"
(1 real change made)
. replace county="adams" if county=="adame" & state=="pennsylvania"
(1 real change made)
. replace county="chester" if county=="chenter" & state=="pennsylvania"
(9 real changes made)
. replace county="greene" if county=="greens" & state=="pennsylvania"
(1 real change made)
.
replace county="indiana" if county=="indiena" & state=="pennsylvania"
(3 real changes made)
. replace county="philadelphia" if county=="philadelphia, coextensive with phi
> ladelphia (city)" & state=="pennsylvania"
(3 real changes made)
. replace county="lancaster" if county=="lanceater" & state=="south carolina"
(3 real changes made)
. replace county="lexington" if county=="lerington" & state=="south carolina"
(3 real changes made)
. replace county="mccormick" if county=="mccomaick" & state=="south carolina"
(3 real changes made)
. replace county="oconee" if county=="oconoe" & state=="south carolina"
(3 real changes made)
. replace county="haakon" if county=="heakon" & state=="south dakota"
(1 real change made)
. replace county="lyman" if county=="lymen" & state=="south dakota"
(1 real change made)
. replace county="mcpherson" if county=="mcphernon" & state=="south dakota"
(1 real change made)
. replace county="mellette" if county=="melletts" & state=="south dakota"
(3 real changes made)
. replace county="minnehaha" if county=="minnehahe" & state=="south dakota"
(3 real changes made)
. replace county="decatur" if county=="decetur" & state=="tennessee"
(1 real change made)
. replace county="fentress" if county=="fentreas" & state=="tennessee"
(1 real change made)
. replace county="hamblen" if county=="hemblen" & state=="tennessee"
(1 real change made)
. replace county="sumner" if county=="summer" & state=="tennessee"
(3 real changes made)
. replace county="cass" if county=="cano" & state=="texas"
(3 real changes made)
. replace county="cochrane" if county=="cochran" & state=="texas"
(1 real change made)
. replace county="gillespie" if county=="gilleapie" & state=="texas"
(1 real change made)
. replace county="grimes" if county=="crimes" & state=="texas"
(3 real changes made)
. replace county="hutchinson" if county=="butchinson" & state=="texas"
(3 real changes made)
. replace county="kendall" if county=="kendell" & state=="texas"
(1 real change made)
. replace county="nacogdoches" if county=="nacogdochee" & state=="texas"
(3 real changes made)
. replace county="rains" if county=="raina" & state=="texas"
(1 real change made)
. replace county="randall" if county=="randell" & state=="texas"
(3 real changes made)
. replace county="roberts" if county=="roberto" & state=="texas"
(1 real change made)
. replace county="pleasants" if county=="pleasante" & state=="west virginia"
(1 real change made)
. replace county="raleigh" if county=="releigh" & state=="west virginia"
(9 real changes made)
. replace county="bayfield" if county=="beyfield" & state=="wisconsin"
(1 real change made)
. replace county="eau claire" if county=="eau clairs" & state=="wisconsin"
(3 real changes made)
. replace county="albany" if county=="albeny" & state=="wyoming"
(3 real changes made)
. replace county="campbell" if county=="compbell" & state=="wyoming"
(1 real change made)
. replace county="laramie" if county=="laremie" & state=="wyoming"
(3 real changes made)
.
. *correct data entry errors found while checking that county totals sum to st
> ate totals
. replace births__total=1367 if births__total==1357 & state=="al
> abama" & county=="marshall" & city_balance_total=="t
> otal" & race=="total"
(1 real change made)
. replace births__total=8468 if births__total==8469 & state=="ca
> lifornia" & county=="contra costa" & city_balance_total=="total" &
> race=="total"
(1 real change made)
. replace births__total=780 if births__total==790 & state=="co
> lorado" & county=="adams" & city_balance_t
> otal=="total" & race=="total"
(1 real change made)
. replace births__total=168 if births__total==166 & state=="ge
> orgia" & county=="charlton" & city_balance_total=="t
> otal" & race=="total"
(1 real change made)
. replace births__total=261 if births__total==251 & state=="ke
> ntucky" & county=="larue" & city_balance_t
> otal=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=458 if births_attended_by_physic
> ian_in_==459 & state=="illinois" & county=="shelby"
> & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=281 if births_attended_by_physic
> ian_in_==291 & state=="iowa" & county=="shelby"
> & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=187 if births_attended_by_physic
> ian_in_==197 & state=="iowa" & county=="van buren"
> & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=81 if births_attended_by_physic
> ian_in_==91 & state=="kentucky" & county=="carlisle" & city_b
> alance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=138 if births_attended_by_physic
> ian_in_==139 & state=="kentucky" & county=="crittenden"
> & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=228 if births_attended_by_physic
> ian_in_==229 & state=="louisiana" & county=="bienville" & city_b
> alance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=143 if births_attended_by_physic
> ian_in_==145 & state=="nebraska" & county=="johnson" & city_b
> alance_total=="total" & race=="total"
(1 real change made)
. replace births_attended_by_physician_in_=500 if births_attended_by_physic
> ian_in_==600 & state=="tennessee" & county=="hamblen" & city_b
> alance_total=="total" & race=="total"
(1 real change made)
.
. *correct data entry errors found while checking that white+nonwhite=total
. replace births__total=108 if births__total==106 & state=="georgia" & co
> unty=="charlton" & race=="white" & city_balance_total=="total"
(1 real change made)
.
replace births_attended_by_physician_in_=68 if births_attended_by_physician_
> in_==69 & state=="louisiana" & county=="bienville" & race=="nonwhit
> e" & city_balance_total=="total"
(1 real change made)
.
. *check that county names are consistent with 1970 census, except for known d
> eviations
. preserve
. egen countyr_state=concat(county state), punct("_")
. sort countyr_state
. merge countyr_state using original_census1970_counties
variable countyr_state does not uniquely identify observations in the master
    data
variable countyr_state does not uniquely identify observations in
    original_census1970_counties.dta
. assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington
> " & state=="south dakota"|county=="yellowstone national park (part)" & state
> =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell
> owstone national park (part)" & state=="montana"|county=="yellowstone nation
> al park (part)" & state=="wyoming"|county=="yellowstone national park (total
> )" & state=="wyoming"|county=="yellowstone national park" & state=="montana"
> |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone
>  national park, part" & state=="montana"|county=="yellowstone national park,
>  part" & state=="wyoming"|county=="yellowstone national park, part" & state=
> ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou
> nty=="park" & state=="montana"|county=="new york city" & state=="new york"|c
> ounty=="bronx" & state=="new york"|county=="kings" & state=="new york"|count
> y=="new york" & state=="new york"|county=="queens" & state=="new york"|count
> y=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|county=
> ="carson city" & state=="nevada"|county=="los alamos" & state=="new mexico"|
> county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska"|sta
> te=="hawaii"|state=="virginia"|state=="dc"
. restore
.
. *check for mistakes/misspellings in state names
. assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s
> tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"|
> state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa
> ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state=="
> kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland
> "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis
> sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"|
> state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y
> ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl
> ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s
> outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state
> =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west
>  virginia"|state=="wisconsin"|state=="wyoming"
.
. *data checks for columns summing to county total
. *not applicable; see footnote 1
.
. *clean and label variables
. rename page__of_pdf_ page_of_pdf
. label var page_of_pdf "page of pdf"
. label var state "state"
. label var county "county"
. rename city_balance_total sub_county
. label var sub_county "city/balance/total"
. label var race "race"
. rename births__total births
. label var births "births by residence"
. rename births_attended_by_physician_in_ births_h_p
. label var births_h "births by residence: physician in hospital"
. rename births_attended_by_physician_not births_nh_p
. label var births_nh_p "births by residence: physician not in hospital"
. rename births_attended_by_midwife births_m
. label var births_m "births by residence: midwife"
.
. *generate year variable
. gen year=1947
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~f |   state | county | sub_co~y |  race | births | bir~_h_p |
     |        2 | alabama |  total |    total | total |  88116 |    47817 |
     |--------------------------------------------------------------------|
     | bir~nh_p | births_m | year                                         |
     |    21387 |    18725 | 1947                                         |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1947.dta
  obs:         7,609
 vars:            10
 size:       981,561 (90.6% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf     byte   %8.0g                  page of pdf
state           str24  %24s                   state
county          str50  %50s                   county
sub_county      str26  %26s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                in hospital
births_nh_p     int    %8.0g                  births by residence: physician
                                                not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
 page_of_pdf |      7609     47.5964    26.42657          2         93
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      7609    2030.158    11789.49          0     323250
  births_h_p |      7609    1728.561    10774.44          0     315008
 births_nh_p |      7609    193.8549    1190.538          0      29837
    births_m |      7609    101.5746    817.7253          0      25675
        year |      7609        1947           0       1947       1947

. saveold clean_natality1947.dta, replace
file clean_natality1947.dta saved
. clear
.
. **
. *1948 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1948_2.cv.pdf
. *table 1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1948.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      7611    47.34227    26.56087          2         93
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      7522    1947.583    11189.22          1     301966
births_of_~t |      7521    1673.403    10264.68          1     295315
births_of_~0 |      7315    173.5858    1050.214          1      26067
births_of_~1 |      4673    146.0717    912.5144          1      21168

. desc

Contains data from natality1948.dta
  obs:         7,611
 vars:             9
 size:       936,153 (91.1% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   byte   %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str26  %26s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(7593 real changes made)
. replace state=lower(state)
(7611 real changes made)
. replace city_balance_total=lower(city_balance_total)
(7593 real changes made)
. replace race=lower(race)
(7593 real changes made)
.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   89
. replace births__total=0 if births__total ==.
(89 real changes made)
. count if births_of_residents_of_area__att==.
   90
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(90 real changes made)
. count if births_of_residents_of_area__at0==.
  296
. replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area
> __at0 ==.
(296 real changes made)
. count if births_of_residents_of_area__at1==.
 2938
. replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area
> __at1 ==.
(2938 real changes made)
.
. *check that all pdf pages appear to be in the data
. sort page__of_pdf_
. gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1]
(1 missing value generated)
. assert temp==0|temp==1|temp==.
. drop temp
.
. *clean data entry errors
. replace county="oscoda" if state=="michigan" & county=="osceola" & births__t
> otal==64
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="total" & births__total==4234
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="white" & births__total==2645
(1 real change made)
. replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia
> " & race=="nonwhite" & births__total==1589
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="total" & births__total==5415
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="white" & births__total==3425
(1 real change made)
. replace county="richmond (ind. city)" if county=="richmond" & state=="virgin
> ia" & race=="nonwhite" & births__total==1990
(1 real change made)
. replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia
> " & race=="total" & births__total==1816
(1 real change made)
. replace births__total=1241 if state=="ohio" & county=="lawrence" & city_bala
> nce_total=="total" & race=="total"
(1 real change made)
. replace births_of_residents_of_area__att=1102 if state=="ohio" & county=="la
> wrence" & city_balance_total=="total" & race=="total"
(1 real change made)
. replace births_of_residents_of_area__at0=125 if state=="ohio" & county=="law
> rence" & city_balance_total=="total" & race=="total"
(1 real change made)
.
replace births__total=364 if state=="ohio" & county=="lawrence" & city_balan > ce_total=="ironton" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=352 if state=="ohio" & county=="law > rence" & city_balance_total=="ironton" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=12 if state=="ohio" & county=="lawr > ence" & city_balance_total=="ironton" & race=="total" (1 real change made) . replace births__total=877 if state=="ohio" & county=="lawrence" & city_balan > ce_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=750 if state=="ohio" & county=="law > rence" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=113 if state=="ohio" & county=="law > rence" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=1508 if state=="ohio" & county=="licking" & city_balan > ce_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1442 if state=="ohio" & county=="li > cking" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=66 if state=="ohio" & county=="lick > ing" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=857 if state=="ohio" & county=="licking" & city_balanc > e_total=="newark" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=840 if state=="ohio" & county=="lic > king" & city_balance_total=="newark" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=17 if state=="ohio" & county=="lick > ing" & city_balance_total=="newark" & race=="total" (1 real change made) . replace births__total=651 if state=="ohio" & county=="licking" & city_balanc > e_total=="balance of county" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=602 if state=="ohio" & county=="lic > king" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=49 if state=="ohio" & county=="lick > ing" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=680 if state=="ohio" & county=="logan" & city_balance_ > total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=640 if state=="ohio" & county=="log > an" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=40 if state=="ohio" & county=="loga > n" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=3691 if state=="ohio" & county=="lorain" & city_balanc > e_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=3653 if state=="ohio" & county=="lo > rain" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=35 if state=="ohio" & county=="lora > in" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=816 if state=="ohio" & county=="lorain" & city_balance > _total=="elyria" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=807 if state=="ohio" & county=="lor > ain" & city_balance_total=="elyria" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=7 if state=="ohio" & county=="lorai > n" & city_balance_total=="elyria" & race=="total" (1 real change made) . replace births__total=1432 if state=="ohio" & county=="lorain" & city_balanc > e_total=="lorain" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1422 if state=="ohio" & county=="lo > rain" & city_balance_total=="lorain" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=10 if state=="ohio" & county=="lora > in" & city_balance_total=="lorain" & race=="total" (1 real change made) . replace births__total=1443 if state=="ohio" & county=="lorain" & city_balanc > e_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1424 if state=="ohio" & county=="lo > rain" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=18 if state=="ohio" & county=="lora > in" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=9250 if state=="ohio" & county=="lucas" & city_balance > _total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=8998 if state=="ohio" & county=="lu > cas" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=239 if state=="ohio" & county=="luc > as" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=8027 if state=="ohio" & county=="lucas" & city_balance > _total=="toledo" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=7865 if state=="ohio" & county=="lu > cas" & city_balance_total=="toledo" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=154 if state=="ohio" & county=="luc > as" & city_balance_total=="toledo" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=2 if state=="ohio" & county=="lucas > " & city_balance_total=="toledo" & race=="total" (1 real change made) . replace births__total=7325 if state=="ohio" & county=="lucas" & city_balance > _total=="toledo" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=7199 if state=="ohio" & county=="lu > cas" & city_balance_total=="toledo" & race=="white" (1 real change made) . 
replace births_of_residents_of_area__at0=118 if state=="ohio" & county=="luc > as" & city_balance_total=="toledo" & race=="white" (1 real change made) . replace births_of_residents_of_area__at1=2 if state=="ohio" & county=="lucas > " & city_balance_total=="toledo" & race=="white" (1 real change made) . replace births__total=702 if state=="ohio" & county=="lucas" & city_balance_ > total=="toledo" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=666 if state=="ohio" & county=="luc > as" & city_balance_total=="toledo" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=36 if state=="ohio" & county=="luca > s" & city_balance_total=="toledo" & race=="nonwhite" (1 real change made) . replace births__total=1223 if state=="ohio" & county=="lucas" & city_balance > _total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1133 if state=="ohio" & county=="lu > cas" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=85 if state=="ohio" & county=="luca > s" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=549 if state=="ohio" & county=="madison" & city_balanc > e_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=343 if state=="ohio" & county=="mad > ison" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=206 if state=="ohio" & county=="mad > ison" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=5894 if state=="ohio" & county=="mahoning" & city_bala > nce_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=5780 if state=="ohio" & county=="ma > honing" & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=106 if state=="ohio" & county=="mah > oning" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=4255 if state=="ohio" & county=="ma > honing" & city_balance_total=="youngstown (part)" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=67 if state=="ohio" & county=="maho > ning" & city_balance_total=="youngstown (part)" & race=="total" (1 real change made) . replace births__total=3772 if state=="ohio" & county=="mahoning" & city_bala > nce_total=="youngstown (part)" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=3756 if state=="ohio" & county=="ma > honing" & city_balance_total=="youngstown (part)" & race=="white" (1 real change made) . replace births_of_residents_of_area__at1=2 if state=="ohio" & county=="mahon > ing" & city_balance_total=="youngstown (part)" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=499 if state=="ohio" & county=="mah > oning" & city_balance_total=="youngstown (part)" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=55 if state=="ohio" & county=="maho > ning" & city_balance_total=="youngstown (part)" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=4272 if state=="ohio" & county=="ma > honing" & city_balance_total=="youngstown (total)" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=67 if state=="ohio" & county=="maho > ning" & city_balance_total=="youngstown (total)" & race=="total" (1 real change made) . replace births__total=3789 if state=="ohio" & county=="mahoning" & city_bala > nce_total=="youngstown (total)" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=3773 if state=="ohio" & county=="ma > honing" & city_balance_total=="youngstown (total)" & race=="white" (1 real change made) . 
replace births_of_residents_of_area__at0=12 if state=="ohio" & county=="maho > ning" & city_balance_total=="youngstown (total)" & race=="white" (1 real change made) . replace births_of_residents_of_area__at1=2 if state=="ohio" & county=="mahon > ing" & city_balance_total=="youngstown (total)" & race=="white" (1 real change made) . replace births__total=555 if state=="ohio" & county=="mahoning" & city_balan > ce_total=="youngstown (total)" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=499 if state=="ohio" & county=="mah > oning" & city_balance_total=="youngstown (total)" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=55 if state=="ohio" & county=="maho > ning" & city_balance_total=="youngstown (total)" & race=="nonwhite" (1 real change made) . replace births__total=271 if state=="ohio" & county=="mahoning" & city_balan > ce_total=="campbell" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=261 if state=="ohio" & county=="mah > oning" & city_balance_total=="campbell" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=8 if state=="ohio" & county=="mahon > ing" & city_balance_total=="campbell" & race=="total" (1 real change made) . replace births__total=239 if state=="ohio" & county=="mahoning" & city_balan > ce_total=="campbell" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=235 if state=="ohio" & county=="mah > oning" & city_balance_total=="campbell" & race=="white" (1 real change made) . replace births__total=32 if state=="ohio" & county=="mahoning" & city_balanc > e_total=="campbell" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=5 if state=="ohio" & county=="mahon > ing" & city_balance_total=="campbell" & race=="nonwhite" (1 real change made) . 
replace births__total=280 if state=="ohio" & county=="mahoning" & city_balan > ce_total=="struthers" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=5 if state=="ohio" & county=="mahon > ing" & city_balance_total=="struthers" & race=="total" (1 real change made) . replace births__total=1016 if state=="ohio" & county=="mahoning" & city_bala > nce_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=989 if state=="ohio" & county=="mah > oning" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=26 if state=="ohio" & county=="maho > ning" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=1213 if state=="ohio" & county=="marion" & city_balanc > e_total=="total" & race=="total" (1 real change made) . replace births__total=879 if state=="ohio" & county=="marion" & city_balance > _total=="marion" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=271 if state=="ohio" & county=="mei > gs" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=186 if state=="ohio" & county=="mei > gs" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=720 if state=="ohio" & county=="mercer" & city_balance > _total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=710 if state=="ohio" & county=="mer > cer" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=8 if state=="ohio" & county=="merce > r" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=1530 if state=="ohio" & county=="miami" & city_balance > _total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=149 if state=="ohio" & county=="mon > roe" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=10606 if state=="ohio" & county=="montgomery" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births__total=9009 if state=="ohio" & county=="montgomery" & city_ba > lance_total=="dayton" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=8901 if state=="ohio" & county=="mo > ntgomery" & city_balance_total=="dayton" & race=="total" (1 real change made) . replace births__total=7954 if state=="ohio" & county=="montgomery" & city_ba > lance_total=="dayton" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=7896 if state=="ohio" & county=="mo > ntgomery" & city_balance_total=="dayton" & race=="white" (1 real change made) . replace births__total=1055 if state=="ohio" & county=="montgomery" & city_ba > lance_total=="dayton" & race=="nonwhite" (1 real change made) . replace births__total=244 if state=="ohio" & county=="morgan" & city_balance > _total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=165 if state=="ohio" & county=="mor > gan" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=309 if state=="ohio" & county=="morrow" & city_balance > _total=="total" & race=="total" (1 real change made) . replace births__total=1053 if state=="ohio" & county=="muskingum" & city_bal > ance_total=="zanesville" & race=="total" (1 real change made) . replace births__total=693 if state=="ohio" & county=="muskingum" & city_bala > nce_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=229 if state=="ohio" & county=="pik > e" & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=74 if state=="ohio" & county=="rich > land" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=1210 if state=="ohio" & county=="ri > chland" & city_balance_total=="mansfield" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=55 if state=="ohio" & county=="rich > land" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total=531 if state=="ohio" & county=="ross" & city_balance_t > otal=="chillicothe" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=416 if state=="ohio" & county=="ros > s" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=35 if state=="ohio" & county=="sand > usky" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=910 if state=="south carolina" & county=="berkeley" & > city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=675 if state=="south carolina" & county=="berkeley" & > city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births__total=469 if state=="south carolina" & county=="calhoun" & c > ity_balance_total=="total" & race=="total" (1 real change made) . replace births__total=80 if state=="south carolina" & county=="calhoun" & ci > ty_balance_total=="total" & race=="white" (1 real change made) . replace births__total=389 if state=="south carolina" & county=="calhoun" & c > ity_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=735 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=60 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="total" & race=="white" (1 real change made) . 
replace births_of_residents_of_area__at1=40 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=657 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=675 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=1681 if state=="south carolina" & c > ounty=="charleston" & city_balance_total=="charleston" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=662 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="charleston" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=132 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="charleston" & race=="total" (1 real change made) . replace births__total=1447 if state=="south carolina" & county=="charleston" > & city_balance_total=="charleston" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=28 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="charleston" & race=="white" (1 real change made) . replace births__total=1028 if state=="south carolina" & county=="charleston" > & city_balance_total=="charleston" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=634 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="charleston" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at1=130 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="charleston" & race=="nonwhite" (1 real change made) . 
replace births__total=2380 if state=="south carolina" & county=="charleston" > & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1253 if state=="south carolina" & c > ounty=="charleston" & city_balance_total=="balance of county" & race=="total > " (1 real change made) . replace births_of_residents_of_area__at0=73 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=1053 if state=="south carolina" & c > ounty=="charleston" & city_balance_total=="balance of county" & race=="total > " (1 real change made) . replace births__total=930 if state=="south carolina" & county=="charleston" > & city_balance_total=="balance of county" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=860 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="balance of county" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=32 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="balance of county" & race=="white" (1 real change made) . replace births_of_residents_of_area__at1=38 if state=="south carolina" & cou > nty=="charleston" & city_balance_total=="balance of county" & race=="white" (1 real change made) . replace births__total=1450 if state=="south carolina" & county=="charleston" > & city_balance_total=="balance of county" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=393 if state=="south carolina" & co > unty=="charleston" & city_balance_total=="balance of county" & race=="nonwhi > te" (1 real change made) . replace births__total=851 if state=="south carolina" & county=="cherokee" & > city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=610 if state=="south carolina" & co > unty=="cherokee" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=208 if state=="south carolina" & co > unty=="cherokee" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=627 if state=="south carolina" & county=="cherokee" & > city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=492 if state=="south carolina" & co > unty=="cherokee" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=130 if state=="south carolina" & co > unty=="cherokee" & city_balance_total=="total" & race=="white" (1 real change made) . replace births__total=224 if state=="south carolina" & county=="cherokee" & > city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=118 if state=="south carolina" & co > unty=="cherokee" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=78 if state=="south carolina" & cou > nty=="cherokee" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at1=28 if state=="south carolina" & cou > nty=="cherokee" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births__total=831 if state=="south carolina" & county=="chester" & c > ity_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=393 if state=="south carolina" & co > unty=="chester" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=115 if state=="south carolina" & co > unty=="chester" & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births__total=411 if state=="south carolina" & county=="chester" & c > ity_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=365 if state=="south carolina" & co > unty=="chester" & city_balance_total=="total" & race=="white" (1 real change made) . replace births__total=420 if state=="south carolina" & county=="chester" & c > ity_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=28 if state=="south carolina" & cou > nty=="chester" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births__total=988 if state=="south carolina" & county=="chesterfield > " & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=287 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=568 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=545 if state=="south carolina" & county=="chesterfield > " & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=262 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=276 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="white" (1 real change made) . replace births__total=443 if state=="south carolina" & county=="chesterfield > " & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=25 if state=="south carolina" & cou > nty=="chesterfield" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . 
replace births_of_residents_of_area__at0=292 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at1=126 if state=="south carolina" & co > unty=="chesterfield" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births__total=929 if state=="south carolina" & county=="clarendon" & > city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=160 if state=="south carolina" & co > unty=="clarendon" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=71 if state=="south carolina" & cou > nty=="clarendon" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=697 if state=="south carolina" & co > unty=="clarendon" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=212 if state=="south carolina" & county=="clarendon" & > city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=140 if state=="south carolina" & co > unty=="clarendon" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=55 if state=="south carolina" & cou > nty=="clarendon" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__at1=17 if state=="south carolina" & cou > nty=="clarendon" & city_balance_total=="total" & race=="white" (1 real change made) . replace births__total=717 if state=="south carolina" & county=="clarendon" & > city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=20 if state=="south carolina" & cou > nty=="clarendon" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . 
replace births_of_residents_of_area__at0=16 if state=="south carolina" & cou > nty=="clarendon" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at1=680 if state=="south carolina" & co > unty=="clarendon" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace county="colleton" if _n==5825 (1 real change made) . replace city_balance_total="total" if _n==5825 (1 real change made) . replace race="total" if _n==5825 (1 real change made) . replace births__total=777 if _n==5825 (1 real change made) . replace births_of_residents_of_area__att=336 if _n==5825 (1 real change made) . replace births_of_residents_of_area__at0=74 if _n==5825 (1 real change made) . replace births_of_residents_of_area__at1=366 if _n==5825 (1 real change made) . replace county="colleton" if _n==5826 (1 real change made) . replace city_balance_total="total" if _n==5826 (1 real change made) . replace race="white" if _n==5826 (1 real change made) . replace births__total=335 if _n==5826 (1 real change made) . replace births_of_residents_of_area__att=298 if _n==5826 (1 real change made) . replace births_of_residents_of_area__at0=28 if _n==5826 (1 real change made) . replace births_of_residents_of_area__at1=9 if _n==5826 (1 real change made) . replace county="colleton" if _n==5827 (1 real change made) . replace city_balance_total="total" if _n==5827 (1 real change made) . replace race="nonwhite" if _n==5827 (1 real change made) . replace births__total=442 if _n==5827 (1 real change made) . replace births_of_residents_of_area__att=38 if _n==5827 (1 real change made) . replace births_of_residents_of_area__at0=46 if _n==5827 (1 real change made) . replace births_of_residents_of_area__at1=357 if _n==5827 (1 real change made) . replace county="darlington" if _n==5828 (1 real change made) . replace city_balance_total="total" if _n==5828 (1 real change made) . replace race="total" if _n==5828 (1 real change made) . 
replace births__total=1512 if _n==5828 (1 real change made) . replace births_of_residents_of_area__att=610 if _n==5828 (1 real change made) . replace births_of_residents_of_area__at0=487 if _n==5828 (1 real change made) . replace births_of_residents_of_area__at1=415 if _n==5828 (1 real change made) . replace county="darlington" if _n==5829 (1 real change made) . replace city_balance_total="total" if _n==5829 (1 real change made) . replace race="white" if _n==5829 (1 real change made) . replace births__total=774 if _n==5829 (1 real change made) . replace births_of_residents_of_area__att=533 if _n==5829 (1 real change made) . replace births_of_residents_of_area__at0=237 if _n==5829 (1 real change made) . replace births_of_residents_of_area__at1=4 if _n==5829 (1 real change made) . replace county="darlington" if _n==5830 (1 real change made) . replace city_balance_total="total" if _n==5830 (1 real change made) . replace race="nonwhite" if _n==5830 (1 real change made) . replace births__total=738 if _n==5830 (1 real change made) . replace births_of_residents_of_area__att=77 if _n==5830 (1 real change made) . replace births_of_residents_of_area__at0=250 if _n==5830 (1 real change made) . replace births_of_residents_of_area__at1=411 if _n==5830 (1 real change made) . replace county="dillon" if _n==5831 (1 real change made) . replace city_balance_total="total" if _n==5831 (1 real change made) . replace race="total" if _n==5831 (1 real change made) . replace births__total=1048 if _n==5831 (1 real change made) . replace births_of_residents_of_area__att=616 if _n==5831 (1 real change made) . replace births_of_residents_of_area__at0=72 if _n==5831 (1 real change made) . replace births_of_residents_of_area__at1=359 if _n==5831 (1 real change made) . replace county="dillon" if _n==5832 (1 real change made) . replace city_balance_total="total" if _n==5832 (1 real change made) . replace race="white" if _n==5832 (1 real change made) . 
replace births__total=519 if _n==5832 (1 real change made) . replace births_of_residents_of_area__att=461 if _n==5832 (1 real change made) . replace births_of_residents_of_area__at0=36 if _n==5832 (1 real change made) . replace births_of_residents_of_area__at1=22 if _n==5832 (1 real change made) . replace county="dillon" if _n==5833 (1 real change made) . replace city_balance_total="total" if _n==5833 (1 real change made) . replace race="nonwhite" if _n==5833 (1 real change made) . replace births__total=529 if _n==5833 (1 real change made) . replace births_of_residents_of_area__att=155 if _n==5833 (1 real change made) . replace births_of_residents_of_area__at0=36 if _n==5833 (1 real change made) . replace births_of_residents_of_area__at1=337 if _n==5833 (1 real change made) . replace county="dorchester" if _n==5834 (1 real change made) . replace city_balance_total="total" if _n==5834 (1 real change made) . replace race="total" if _n==5834 (1 real change made) . replace births__total=670 if _n==5834 (1 real change made) . replace births_of_residents_of_area__att=199 if _n==5834 (1 real change made) . replace births_of_residents_of_area__at0=111 if _n==5834 (1 real change made) . replace births_of_residents_of_area__at1=360 if _n==5834 (1 real change made) . replace county="dorchester" if _n==5835 (1 real change made) . replace city_balance_total="total" if _n==5835 (1 real change made) . replace race="white" if _n==5835 (1 real change made) . replace births__total=245 if _n==5835 (1 real change made) . replace births_of_residents_of_area__att=170 if _n==5835 (1 real change made) . replace births_of_residents_of_area__at0=51 if _n==5835 (1 real change made) . replace births_of_residents_of_area__at1=24 if _n==5835 (1 real change made) . replace county="dorchester" if _n==5836 (1 real change made) . replace city_balance_total="total" if _n==5836 (1 real change made) . replace race="nonwhite" if _n==5836 (1 real change made) . 
replace births__total=425 if _n==5836 (1 real change made) . replace births_of_residents_of_area__att=29 if _n==5836 (1 real change made) . replace births_of_residents_of_area__at0=60 if _n==5836 (1 real change made) . replace births_of_residents_of_area__at1=336 if _n==5836 (1 real change made) . replace county="edgefield" if _n==5837 (1 real change made) . replace city_balance_total="total" if _n==5837 (1 real change made) . replace race="total" if _n==5837 (1 real change made) . replace births__total=485 if _n==5837 (1 real change made) . replace births_of_residents_of_area__att=90 if _n==5837 (1 real change made) . replace births_of_residents_of_area__at0=291 if _n==5837 (1 real change made) . replace births_of_residents_of_area__at1=104 if _n==5837 (1 real change made) . replace county="edgefield" if _n==5838 (1 real change made) . replace city_balance_total="total" if _n==5838 (1 real change made) . replace race="white" if _n==5838 (1 real change made) . replace births__total=123 if _n==5838 (1 real change made) . replace births_of_residents_of_area__att=76 if _n==5838 (1 real change made) . replace births_of_residents_of_area__at0=45 if _n==5838 (1 real change made) . replace births_of_residents_of_area__at1=2 if _n==5838 (1 real change made) . replace county="edgefield" if _n==5839 (1 real change made) . replace city_balance_total="total" if _n==5839 (1 real change made) . replace race="nonwhite" if _n==5839 (1 real change made) . replace births__total=362 if _n==5839 (1 real change made) . replace births_of_residents_of_area__att=14 if _n==5839 (1 real change made) . replace births_of_residents_of_area__at0=246 if _n==5839 (1 real change made) . replace births_of_residents_of_area__at1=102 if _n==5839 (1 real change made) . replace county="fairfield" if _n==5840 (1 real change made) . replace city_balance_total="total" if _n==5840 (1 real change made) . replace race="total" if _n==5840 (1 real change made) . 
replace births__total=603 if _n==5840 (1 real change made) . replace births_of_residents_of_area__att=132 if _n==5840 (1 real change made) . replace births_of_residents_of_area__at0=211 if _n==5840 (1 real change made) . replace births_of_residents_of_area__at1=260 if _n==5840 (1 real change made) . replace county="fairfield" if _n==5841 (1 real change made) . replace city_balance_total="total" if _n==5841 (1 real change made) . replace race="white" if _n==5841 (1 real change made) . replace births__total=219 if _n==5841 (1 real change made) . replace births_of_residents_of_area__att=107 if _n==5841 (1 real change made) . replace births_of_residents_of_area__at0=109 if _n==5841 (1 real change made) . replace births_of_residents_of_area__at1=3 if _n==5841 (1 real change made) . replace county="fairfield" if _n==5842 (1 real change made) . replace city_balance_total="total" if _n==5842 (1 real change made) . replace race="nonwhite" if _n==5842 (1 real change made) . replace births__total=384 if _n==5842 (1 real change made) . replace births_of_residents_of_area__att=25 if _n==5842 (1 real change made) . replace births_of_residents_of_area__at0=102 if _n==5842 (1 real change made) . replace births_of_residents_of_area__at1=257 if _n==5842 (1 real change made) . replace births_of_residents_of_area__att=917 if state=="south carolina" & co > unty=="florence" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=1209 if state=="south carolina" & county=="florence" & > city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__att=831 if state=="south carolina" & co > unty=="florence" & city_balance_total=="total" & race=="white" (1 real change made) . replace births__total=802 if state=="south carolina" & county=="florence" & > city_balance_total=="balance of county" & race=="white" (1 real change made) . 
replace births_of_residents_of_area__at0=184 if state=="tennessee" & county= > ="macon" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at1=27 if state=="texas" & county=="uva > lde" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=6513 if state=="west virginia" & county=="kanawha" & c > ity_balance_total=="total" & race=="total" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (90 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="orleans" if county=="orleans, coartenaive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=1038 if births__total==1039 & state=="ke > ntucky" & county=="mccracken" & city_balance_total=="total" & > race=="total" (1 real change made) . 
replace births__total=187 if births__total==167 & state=="mi > ssouri" & county=="cedar" & city_balance_total=="t > otal" & race=="total" (1 real change made) . replace births__total=833 if births__total==633 & state=="mi > ssouri" & county=="st francois" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=2735 if births__total==2736 & state=="oh > io" & county=="clark" & city_balance_total=="t > otal" & race=="total" (1 real change made) . replace births__total=1869 if births__total==1669 & state=="te > nnessee" & county=="anderson" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=398 if births__total==396 & state=="te > nnessee" & county=="grundy" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=2528 if births__total==2529 & state=="wa > shington" & county=="snohomish" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births_of_residents_of_area__att=6309 if births_of_residents_of_ar > ea__att==6308 & state=="california" & county=="san bernardin > o" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=408 if births_of_residents_of_ar > ea__att==409 & state=="kansas" & county=="dicki > nson" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=80 if births_of_residen > ts_of_area__att==60 & state=="kentucky" & county=="lewis > " & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=859 if births_of_residents_of_ar > ea__att==659 & state=="louisiana" & county=="webster" > & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=2879 if births_of_residents_of_ar > ea__att==2679 & state=="massachusetts" & county=="berkshire" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=2137 if births_of_residents_of_ar > ea__att==2157 & state=="mississippi" & county=="hinds" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=686 if births_of_residents_of_ar > ea__att==688 & state=="north carolina" & county=="henderson" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1035 if births_of_residents_of_ar > ea__att==35 & state=="ohio" & county=="sandu > sky" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=10560 if births_of_residents_of_ar > ea__att==11560 & state=="oregon" & county=="multn > omah" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=968 if births_of_residents_of_ar > ea__att==868 & state=="texas" & county=="ector > " & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=849 if births_of_residents_of_ar > ea__att==649 & state=="west virginia" & county=="logan" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=468 if births_of_residents_of_ar > ea__att==469 & state=="wisconsin" & county=="door" > & city_balance_total=="total" & race=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births_of_residents_of_area__att=432 if births_of_residents_of_ar > ea__att==452 & state=="mississippi" & county=="hinds" > & race=="nonwhite" & city_balance_total=="total" (1 real change made) . 
replace births_of_residents_of_area__att=818 if births_of_residents_of_ar > ea__att==618 & state=="west virginia" & county=="logan" > & race=="white" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. 
park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . 
label var births "births by residence"
. rename births_of_residents_of_area__att births_h_p
. label var births_h_p "births by residence: physician in hospital"
. rename births_of_residents_of_area__at0 births_nh_p
. label var births_nh_p "births by residence: physician not in hospital"
. rename births_of_residents_of_area__at1 births_m
. label var births_m "births by residence: midwife"
.
. *generate year variable
. gen year=1948
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |        2 | alabama |  total |    total | total |  85372 |    48310 |
     |--------------------------------------------------------------------|
     |             bir~nh_p |             births_m |                 year |
     |                17838 |                18798 |                 1948 |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1948.dta
  obs:         7,611
 vars:            10
 size:       966,597 (90.8% of memory free)
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
page_of_pdf_    byte   %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str26  %26s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
Sorted by:  page_of_pdf_
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      7611    47.34227    26.56087          2         93
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      7611    1941.497    11126.39          0     301966
  births_h_p |      7611    1666.047    10206.29          0     295315
 births_nh_p |      7611    168.2745    1030.209          0      26067
    births_m |      7611    90.56786    718.7125          0      21168
        year |      7611        1948           0       1948       1948

. saveold clean_natality1948.dta,replace
file clean_natality1948.dta saved
. clear
.
. **
. *1949 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1949_2.cv.pdf
. *table 1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1949.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      7612    46.70809    26.40633          2         93
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      7603    1960.413    11184.21          1     301287
births_of_~t |      7588    1706.802    10352.17          1     295656
births_of_~0 |      7368    156.6223    939.4865          1      23322
births_of_~1 |      4872    151.4421    977.1157          1      23327
births_of_~2 |      4274        14.5    86.36937          1       2127

. desc

Contains data from natality1949.dta
  obs:         7,612
 vars:            10
 size:       951,500 (90.9% of memory free)
------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
page__of_pdf_   byte   %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str26  %26s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
births_of_res~2 int    %8.0g
------------------------------------------------------------------------------
Sorted by:

. replace county=lower(county)
(7612 real changes made)
.
replace state=lower(state)
(7612 real changes made)
. replace city_balance_total=lower(city_balance_total)
(7612 real changes made)
. replace race=lower(race)
(7612 real changes made)
.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
9
. replace births__total=0 if births__total ==.
(9 real changes made)
. count if births_of_residents_of_area__att==.
24
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area__att ==.
(24 real changes made)
. count if births_of_residents_of_area__at0==.
244
. replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area__at0 ==.
(244 real changes made)
. count if births_of_residents_of_area__at1==.
2740
. replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area__at1 ==.
(2740 real changes made)
. count if births_of_residents_of_area__at2==.
3338
. replace births_of_residents_of_area__at2 =0 if births_of_residents_of_area__at2 ==.
(3338 real changes made)
.
. *check that all pdf pages appear to be in the data
. sort page__of_pdf_
. gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1]
(1 missing value generated)
. assert temp==0|temp==1|temp==.
. drop temp
.
. *clean data entry errors
. replace county="oscoda" if county=="osceola" & state=="michigan" & births__total==69
(1 real change made)
. replace births_of_residents_of_area__att=889 if county=="tolland" & state=="connecticut" & births_of_residents_of_area__att==869
(1 real change made)
. replace births_of_residents_of_area__at2=1 if county=="bay" & state=="florida" & city_balance_total=="panama city" & race=="total" & births_of_residents_of_area__at2==0
(1 real change made)
.
replace births_of_residents_of_area__at2=1 if county=="bay" & state=="florid > a" & city_balance_total=="panama city" & race=="nonwhite" & births_of_reside > nts_of_area__at2==0 (1 real change made) . replace births__total=448 if county=="sullivan" & state=="indiana" & births_ > _total==440 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==5051 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==3337 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1714 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5280 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==3282 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==1998 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==1939 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="white" & births__total==1517 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="nonwhite" & births__total==422 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (90 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . 
replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="orleans" if county=="orleans, coartenaive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=345 if births__total==63 & state=="arkansas" > & county=="hot spring" & race=="white" & city_balance_total=="t > otal" (1 real change made) . replace births__total=63 if births__total==334 & state=="arkansas" > & county=="hot spring" & race=="nonwhite" & city_balance_total=="t > otal" (1 real change made) . replace births_of_residents_of_area__att=282 if births_of_residents_of_area_ > _att==3 & state=="arkansas" & county=="hot spring" & race== > "white" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__att=3 if births_of_residents_of_a > rea__att==221 & state=="arkansas" & county=="hot spring" & race== > "nonwhite" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at0=61 if births_of_residents_of_area_ > _at0==16 & state=="arkansas" & county=="hot spring" & race=="white" > & city_balance_total=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=16 if births_of_residents_of_area_ > _at0==71 & state=="arkansas" & county=="hot spring" & race=="nonwhit > e" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at1=1 if births_of_residents_of_area_ > _at1==42 & state=="arkansas" & county=="hot spring" & race=="white" > & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at1=42 if births_of_residents_of_area_ > _at1==35 & state=="arkansas" & county=="hot spring" & race=="nonwhit > e" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at2=1 if births_of_residents_of_area_ > _at2==2 & state=="arkansas" & county=="hot spring" & race== > "white" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__at2=2 if births_of_residents_of_a > rea__at2==7 & state=="arkansas" & county=="hot spring" & race== > "nonwhite" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0 > + births_of_residents_of_area__at1 + births_of_residents_of_area__at2 . assert temp==births__total . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . rename births_of_residents_of_area__at2 births_o . 
label var births_o "births by residence: other and not specified"
. 
. *generate year variable
. gen year=1949
. label var year "year"
. 
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
. 
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |        2 | alabama |  total |    total | total |  84418 |    49301 |
     |--------------------------------------------------------------------|
     |    bir~nh_p    |    births_m    |    births_o    |      year       |
     |       15926    |       18673    |         518    |      1949       |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1949.dta
  obs:         7,612
 vars:            11
 size:       981,948 (90.6% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    byte   %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str26  %26s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
births_o        int    %8.0g                  births by residence: other and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf_
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      7612    46.70809    26.40633          2         93
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      7612    1958.098     11177.8          0     301287
  births_h_p |      7612    1701.431    10336.28          0     295656
 births_nh_p |      7612    151.6005    924.7162          0      23322
    births_m |      7612    96.92486    785.0627          0      23327
    births_o |      7612    8.140962    65.11387          0       2127
-------------+--------------------------------------------------------
        year |      7612        1949           0       1949       1949

. saveold clean_natality1949.dta,replace
file clean_natality1949.dta saved
. clear
. 
. **
. *1950 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1950_2.cv.pdf
. *table 13
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1950.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      8255    86.22399    29.18137         36        137
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      8246    1975.315     10873.8          1     301043
births_of_~t |      8239     1755.31    10165.24          1     296061
births_of_~0 |      7893    131.0215    796.4763          1      21034
births_of_~1 |      5170     140.107    913.0164          1      23096

. desc

Contains data from natality1950.dta
  obs:         8,255
 vars:             9
 size:     1,073,150 (89.8% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str32  %32s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. 
replace county=lower(county) (8255 real changes made) . replace state=lower(state) (8255 real changes made) . replace city_balance_total=lower(city_balance_total) (8255 real changes made) . replace race=lower(race) (8255 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 9 . replace births__total=0 if births__total ==. (9 real changes made) . count if births_of_residents_of_area__att==. 16 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (16 real changes made) . count if births_of_residents_of_area__at0==. 362 . replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (362 real changes made) . count if births_of_residents_of_area__at1==. 3085 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (3085 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==5041 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==3349 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1692 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4937 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2919 (1 real change made) . 
replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2018 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2019 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="de kalb" if county=="dekalb" & state=="tennessee" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=288 if births__total==289 & state=="arizona" > & county=="santa cruz" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=289 if births__total==299 & state=="idaho" > & county=="payette" & city_balance_total=="total" & race=="t > otal" (1 real change made) . 
replace births__total=64 if births__total==84 & state=="kentucky" > & county=="robertson" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=568 if births__total==566 & state=="michigan" > & county=="gogebic" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=2935 if births__total==2936 & state=="new york" > & county=="orange" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=402 if births__total==102 & state=="ohio" > & county=="harrison" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=388 if births__total==386 & state=="ohio" > & county=="hocking" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=688 if births__total==686 & state=="oklahoma" > & county=="mccurtain" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=162 if births__total==134 & state=="oregon" > & county=="wallowa" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=305 if births__total==306 & state=="texas" > & county=="erath" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=373 if births__total==375 & state=="texas" > & county=="frio" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=686 if births__total==696 & state=="texas" > & county=="nacogdoches" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=293 if births_of_residents_of_ar > ea__att==295 & state=="illinois" & county=="wayne" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1736 if births_of_residents_of_ar > ea__att==1738 & state=="new york" & county=="oswego" > & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=161 if births_of_residents_of_ar > ea__att==133 & state=="oregon" & county=="wallowa" > & city_balance_total=="total" & race=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=235 if births__total==236 & state=="alabama" > & county=="lauderdale" & race=="nonwhite" & city_balance_t > otal=="total" (1 real change made) . replace births__total=388 if births__total==386 & state=="kentucky" > & county=="boyle" & race=="white" & city_balance_t > otal=="total" (1 real change made) . replace births__total=88296 if births__total==86296 & state=="new jersey > " & county=="total" & race=="white" & city_balance_t > otal=="total" (1 real change made) . replace births__total=613 if births__total==513 & state=="north caro > lina" & county=="columbus" & race=="nonwhite" & city_balance_total=="t > otal" (1 real change made) . replace births__total=264 if births__total==284 & state=="texas" > & county=="robertson" & race=="nonwhite" & city_balance_t > otal=="total" (1 real change made) . replace births_of_residents_of_area__att=87 if births_of_residents_of_ar > ea__att==67 & state=="georgia" & county=="lamar > " & race=="white" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__att=246 if births_of_residents_of_ar > ea__att==245 & state=="north carolina" & county=="bladen" > & race=="white" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1950 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |       36 | alabama |  total |    total | total |  82616 |    49401 |
     |--------------------------------------------------------------------|
     |       bir~nh_p       |       births_m       |         year         |
     |          14204       |          18024       |         1950         |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1950.dta
  obs:         8,255
 vars:            10
 size:     1,106,170 (89.5% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str32  %32s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf_
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      8255    86.22399    29.18137         36        137
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      8255    1973.448    10869.96          0     301043
  births_h_p |      8255    1751.914    10155.68          0     296061
 births_nh_p |      8255     125.276    779.2767          0      21034
    births_m |      8255    87.74718    725.6924          0      23096
        year |      8255        1950           0       1950       1950

. saveold clean_natality1950.dta,replace
file clean_natality1950.dta saved
. clear
. 
. **
. *1951 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1951_1.cv.pdf
. *table 17
. 
. 
*hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1951.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      5146    162.7919    18.41237        131        194
       state |         0
      county |         0
        race |         0
births__to~l |      5140    2594.938    14304.84          2     316580
-------------+--------------------------------------------------------
births_of_~t |      5126    2334.671    13538.64          2     312506
births_of_~0 |      4795    150.1468    814.0767          2      17024
births_of_~1 |      3244     179.812    1087.646          2      22416
births_of_~2 |      2661    25.25066    138.5861          2       3784

. desc

Contains data from natality1951.dta
  obs:         5,146
 vars:             9
 size:       416,826 (96.0% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str33  %33s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
births_of_res~2 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(5146 real changes made)
. replace state=lower(state)
(5146 real changes made)
. replace race=lower(race)
(5146 real changes made)
. 
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
    6
. replace births__total=0 if births__total ==.
(6 real changes made)
. count if births_of_residents_of_area__att==.
   20
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(20 real changes made)
. count if births_of_residents_of_area__at0==.
  351
. 
replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (351 real changes made) . count if births_of_residents_of_area__at1==. 1902 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (1902 real changes made) . count if births_of_residents_of_area__at2==. 2485 . replace births_of_residents_of_area__at2 =0 if births_of_residents_of_area > __at2 ==. (2485 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==5684 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==3754 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1930 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5084 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2938 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2146 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2210 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (60 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . 
replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. 
park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0 > + births_of_residents_of_area__at1 + births_of_residents_of_area__at2 . assert temp==births__total . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . 
label var race "race"
. rename births__total births
. label var births "births by residence"
. rename births_of_residents_of_area__att births_h_p
. label var births_h_p "births by residence: physician in hospital"
. rename births_of_residents_of_area__at0 births_nh_p
. label var births_nh_p "births by residence: physician not in hospital"
. rename births_of_residents_of_area__at1 births_m
. label var births_m "births by residence: midwife"
. rename births_of_residents_of_area__at2 births_o
. label var births_o "births by residence: other and not specified"
. 
. *generate year variable
. gen year=1951
. label var year "year"
. 
. *check that observations are unique
. egen tag=tag(state county race)
. assert tag==1
. drop tag
. 
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county |  race | births | bir~_h_p | bir~nh_p |
     |      131 | alabama |  total | total |  83736 |    54552 |    11102 |
     |--------------------------------------------------------------------|
     |       births_m       |       births_o       |         year         |
     |          16978       |           1104       |         1951         |
     +--------------------------------------------------------------------+

. 
desc

Contains data from natality1951.dta
  obs:         5,146
 vars:            10
 size:       437,410 (95.8% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str33  %33s                   county
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
births_o        int    %8.0g                  births by residence: other and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf_
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      5146    162.7919    18.41237        131        194
       state |         0
      county |         0
        race |         0
      births |      5146    2591.912    14296.77          0     316580
-------------+--------------------------------------------------------
  births_h_p |      5146    2325.597    13513.08          0     312506
 births_nh_p |      5146    139.9056    786.7287          0      17024
    births_m |      5146    113.3521    867.8642          0      22416
    births_o |      5146    13.05713    100.4437          0       3784
        year |      5146        1951           0       1951       1951

. saveold clean_natality1951.dta,replace
file clean_natality1951.dta saved
. clear
. 
. **
. *1952 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1952_1.cv.pdf
. *table 18
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1952.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      5143    146.4359    17.21438        117        176
       state |         0
      county |         0
        race |         0
births__to~l |      5141     2663.77    14755.15          2     326310
-------------+--------------------------------------------------------
births_of_~t |      5136    2437.879    14094.51          2     322366
births_of_~0 |      4672    122.7949     651.632          2      14896
births_of_~1 |      3127    172.2894    1032.493          2      20226

. desc

Contains data from natality1952.dta
  obs:         5,143
 vars:             8
 size:       406,297 (96.1% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str33  %33s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(5143 real changes made)
. replace state=lower(state)
(5143 real changes made)
. replace race=lower(race)
(5143 real changes made)
. 
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
    2
. replace births__total=0 if births__total ==.
(2 real changes made)
. count if births_of_residents_of_area__att==.
    7
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(7 real changes made)
. count if births_of_residents_of_area__at0==.
  471
. replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area
> __at0 ==.
(471 real changes made)
. count if births_of_residents_of_area__at1==.
 2016
. replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area
> __at1 ==.
(2016 real changes made)
. 
. 
*check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__==6434 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==4514 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1920 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5050 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2872 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2178 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2312 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (60 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="baltimore city" if county=="baltimore (independent city)" & > state=="maryland" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="schuyler" if county=="shuyler" & state=="missouri" (1 real change made) . 
replace county="st louis city" if county=="st louis (independent city)" & st > ate=="missouri" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=368 if births__total==366 & state=="colorado" > & county=="montrose" & race=="total" (1 real change made) . replace births__total=1448 if births__total==1446 & state=="oregon" > & county=="coos" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=834 if births_of_residents_of_ar > ea__att==934 & state=="louisiana" & county=="vernon" & race== > "total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. 
park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . label var race "race" . rename births__total births . label var births "births by residence" . 
rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1952 . label var year "year" . . *check that observations are unique . egen tag=tag(state county race) . assert tag==1 . drop tag . . list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | race | births | bir~_h_p | bir~nh_p | | 117 | alabama | total | total | 83140 | 57256 | 8752 | |--------------------------------------------------------------------| | births_m | year | | 16204 | 1952 | +--------------------------------------------------------------------+ . desc Contains data from natality1952.dta obs: 5,143 vars: 9 size: 426,869 (95.9% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ int %8.0g page of pdf state str20 %20s state county str33 %33s county race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_p int %8.0g births by residence: physician not in hospital births_m int %8.0g births by residence: midwife year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf_ Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 5143 146.4359 17.21438 117 176 state | 0 county | 0 race | 0 births | 5143 2662.735 14752.37 0 326310 -------------+-------------------------------------------------------- births_h_p | 5143 2434.542 14085.2 0 322366 births_nh_p | 5143 111.5493 622.0803 0 14896 births_m | 5143 104.7538 809.4197 0 20226 year | 5143 1952 0 1952 1952 . saveold clean_natality1952.dta,replace file clean_natality1952.dta saved . clear . . ** . *1953 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1953_1.pdf . *table 18 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1953.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 8137 129.5459 28.25759 81 179 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 8128 2185.491 12114.15 2 325278 births_of_~t | 8124 2029.526 11622.14 2 321608 births_of_~0 | 6998 82.49814 457.5203 2 13796 births_of_~1 | 4360 136.2149 839.1393 2 19298 . desc Contains data from natality1953.dta obs: 8,137 vars: 9 size: 1,057,810 (89.9% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str32 %32s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g births_of_res~1 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (8137 real changes made) . replace state=lower(state) (8137 real changes made) . 
replace city_balance_total=lower(city_balance_total) (8137 real changes made) . replace race=lower(race) (8137 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 9 . replace births__total=0 if births__total ==. (9 real changes made) . count if births_of_residents_of_area__att==. 13 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (13 real changes made) . count if births_of_residents_of_area__at0==. 1139 . replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (1139 real changes made) . count if births_of_residents_of_area__at1==. 3777 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (3777 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace city_balance_total="jersey city" if state=="new jersey" & county=="h > udson" & race=="white" & births__total==5532 (1 real change made) . replace city_balance_total="jersey city" if state=="new jersey" & county=="h > udson" & race=="nonwhite" & births__total==802 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==6668 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==4722 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==1946 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5040 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2828 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2212 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2288 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . 
replace births__total=1086 if births__total==1096 & state=="missouri" > & county=="dunklin" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=360 if births_of_residents_of_ar > ea__att==380 & state=="west virginia" & county=="putnam" > & city_balance_total=="total" & race=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=402 if births__total==420 & state=="kentucky" > & county=="logan" & race=="white" & city_balance_total=="total" (1 real change made) . replace births_of_residents_of_area__att=50084 if births_of_residents_of_ar > ea__att==5084 & state=="louisiana" & county=="total" & race== > "white" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . *the additional contradictions are counties in massachusetts, since data fo > r massachusetts is only shown for the state as a whole in the year 1953, as > specified in footnote 8 on page 179 . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc"|state=="mass > achusetts" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1953 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | bir~_h_p | | 81 | alabama | total | total | total | 82648 | 59804 | |--------------------------------------------------------------------| | bir~nh_p | births_m | year | | 7158 | 14964 | 1953 | +--------------------------------------------------------------------+ . desc Contains data from natality1953.dta obs: 8,137 vars: 10 size: 1,090,358 (89.6% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str32 %32s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_p int %8.0g births by residence: physician not in hospital births_m int %8.0g births by residence: midwife year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf_ Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 8137 129.5459 28.25759 81 179 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 8137 2183.07 12107.67 0 325278 births_h_p | 8137 2031.811 11625.3 0 321608 births_nh_p | 8137 70.95023 425.2527 0 13796 births_m | 8137 72.98722 617.9629 0 19298 year | 8137 1953 0 1953 1953 . saveold clean_natality1953.dta,replace file clean_natality1953.dta saved . clear . . ** . *1954 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1954_1.cv.pdf . *table 18 . . 
*hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1954.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 8137 129.5459 28.25759 81 179 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 8128 2253.965 12488.75 2 335510 births_of_~t | 8126 2114.307 12048.1 2 331644 births_of_~0 | 6815 74.719 405.5955 2 11816 births_of_~1 | 4290 134.5636 822.8453 2 18982 . desc Contains data from natality1954.dta obs: 8,137 vars: 9 size: 1,057,810 (89.9% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str32 %32s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g births_of_res~1 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (8137 real changes made) . replace state=lower(state) (8137 real changes made) . replace city_balance_total=lower(city_balance_total) (8137 real changes made) . replace race=lower(race) (8137 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 9 . replace births__total=0 if births__total ==. (9 real changes made) . count if births_of_residents_of_area__att==. 11 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (11 real changes made) . count if births_of_residents_of_area__at0==. 1322 . 
replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (1322 real changes made) . count if births_of_residents_of_area__at1==. 3847 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (3847 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace city_balance_total="jersey city" if state=="new jersey" & county=="h > udson" & race=="white" & births__total==5562 (1 real change made) . replace city_balance_total="jersey city" if state=="new jersey" & county=="h > udson" & race=="nonwhite" & births__total==914 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==6688 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==4614 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2074 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5178 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2622 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2556 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2226 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . 
replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="chisago" if county=="chicago" & state=="minnesota" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . . *check that county names are consistent with 1970 census, except for known d > eviations . *the additional contradictions are counties in massachusetts, since data for > massachusetts is only shown for the state as a whole in the year 1954, as s > pecified in footnote 8 on page 179 . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc"|state=="mass > achusetts" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1954 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | bir~_h_p | | 81 | alabama | total | total | total | 82458 | 60940 | |--------------------------------------------------------------------| | bir~nh_p | births_m | year | | 6108 | 14840 | 1954 | +--------------------------------------------------------------------+ . desc Contains data from natality1954.dta obs: 8,137 vars: 10 size: 1,090,358 (89.6% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str32 %32s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_p int %8.0g births by residence: physician not in hospital births_m int %8.0g births by residence: midwife year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf_ Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 8137 129.5459 28.25759 81 179 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 8137 2251.472 12482.06 0 335510 births_h_p | 8137 2111.449 12040.21 0 331644 births_nh_p | 8137 62.57957 372.2056 0 11816 births_m | 8137 70.94482 601.201 0 18982 year | 8137 1954 0 1954 1954 . saveold clean_natality1954.dta,replace file clean_natality1954.dta saved . clear . . ** . *1955 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1955_1.cv.pdf . *table 19 . . 
*hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1955.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      8259    132.4628    28.87426         83        183
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      8252    2273.767    12522.66          1     342566
births_of_~t |      8251    2150.823    12129.66          1     338709
births_of_~0 |      7381    59.88741    336.9972          1      10216
births_of_~1 |      4601    115.2615     741.628          1      18308

. desc

Contains data from natality1955.dta
  obs:         8,259
 vars:             9
 size:     1,073,670 (89.8% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str32  %32s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(8259 real changes made)

. replace state=lower(state)
(8259 real changes made)

. replace city_balance_total=lower(city_balance_total)
(8259 real changes made)

. replace race=lower(race)
(8259 real changes made)

.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
    7

. replace births__total=0 if births__total ==.
(7 real changes made)

. count if births_of_residents_of_area__att==.
    8

. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(8 real changes made)

. count if births_of_residents_of_area__at0==.
  878

.
replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (878 real changes made) . count if births_of_residents_of_area__at1==. 3658 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (3658 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births_of_residents_of_area__att=299 if state=="illinois" & county== > "monroe" & city_balance_total=="total" & race=="total" (1 real change made) . replace race="white" if state=="arkansas" & county=="union" & births__total= > =333 (1 real change made) . replace race="nonwhite" if state=="arkansas" & county=="union" & births__tot > al==270 (1 real change made) . replace city_balance_total="total" if state=="arkansas" & county=="van buren > " (1 real change made) . replace race="total" if state=="arkansas" & county=="van buren" (1 real change made) . replace city_balance_total="total" if state=="arkansas" & county=="washingto > n" & births__total==1046 (1 real change made) . replace race="total" if state=="arkansas" & county=="washington" & city_bala > nce_total=="total" (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8325 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==5985 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2340 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5133 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2579 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2554 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2321 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="de kalb" if county=="dekalb" & state=="tennessee" (1 real change made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births__total=2196 if births__total==2195 & state=="new york" > & county=="ulster" & city_balance_total=="total" & > race=="total" (1 real change made) . 
replace births__total=275 if births__total==273 & state=="north dako > ta" & county=="emmons" & city_balance_total=="total" & race=="t > otal" (1 real change made) . replace births__total=316 if births__total==318 & state=="utah" > & county=="sanpete" & city_balance_total=="total" & > race=="total" (1 real change made) . replace births__total=348 if births__total==349 & state=="west virgi > nia" & county=="barbour" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total=881 if births__total==681 & state=="wisconsin" > & county=="columbia" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=221 if births_of_residents_of_ar > ea__att==220 & state=="florida" & county=="holmes" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=279 if births_of_residents_of_ar > ea__att==379 & state=="indiana" & county=="orange" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=482 if births_of_residents_of_ar > ea__att==282 & state=="iowa" & county=="buchanan" & city_b > alance_total=="total" & race=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1955 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |       83 | alabama |  total |    total | total |  81811 |    62190 |
     |--------------------------------------------------------------------|
     |             bir~nh_p |             births_m |                 year |
     |                 4941 |                14295 |                 1955 |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1955.dta
  obs:         8,259
 vars:            10
 size:     1,106,706 (89.4% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str32  %32s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  page_of_pdf_
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      8259    132.4628    28.87426         83        183
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      8259    2271.865    12517.52          0     342566
  births_h_p |      8259    2148.788    12123.96          0     338709
 births_nh_p |      8259    53.52089    319.1133          0      10216
    births_m |      8259    64.21092    556.4663          0      18308
        year |      8259        1955           0       1955       1955

. saveold clean_natality1955.dta,replace
file clean_natality1955.dta saved

. clear
.
. **
. *1956 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1956_1.cv.pdf
. *table 19
.
.
*hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1956.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      8143     133.695    29.52136         83        185
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      8130      2344.8    13003.93          2     346938
births_of_~t |      8126    2233.916    12655.18          2     343274
births_of_~0 |      6497    58.19055    308.7752          2       9042
births_of_~1 |      3909     125.979    757.1551          2      17108

. desc

Contains data from natality1956.dta
  obs:         8,143
 vars:             9
 size:     1,058,590 (89.9% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str32  %32s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(8143 real changes made)

. replace state=lower(state)
(8143 real changes made)

. replace city_balance_total=lower(city_balance_total)
(8143 real changes made)

. replace race=lower(race)
(8143 real changes made)

.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   13

. replace births__total=0 if births__total ==.
(13 real changes made)

. count if births_of_residents_of_area__att==.
   17

. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(17 real changes made)

. count if births_of_residents_of_area__at0==.
 1646

.
replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (1646 real changes made) . count if births_of_residents_of_area__at1==. 4234 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (4234 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8320 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==5844 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2476 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5128 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2470 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2658 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2368 (1 real change made) . replace city_balance_total="cairo" if county=="cairo" & births__total==230 (1 real change made) . replace city_balance_total="cairo" if county=="cairo" & births__total==140 & > births_of_residents_of_area__att==130 (1 real change made) . replace city_balance_total="cairo" if county=="cairo" & births__total==90 (1 real change made) . replace county="alexander" if city_balance_total=="cairo" (3 real changes made) . 
replace county="alexander" if state=="illinois" & race=="total" & births__to > tal==140 & births_of_residents_of_area__att==88 (1 real change made) . replace county="alexander" if state=="illinois" & race=="white" & births__to > tal==80 (1 real change made) . replace county="alexander" if state=="illinois" & race=="nonwhite" & births_ > _total==60 (1 real change made) . replace county="liberty" if state=="florida" & city_balance_total =="total" > & race =="white" & births__total ==64 & births_of_residents_of_area__att > ==56 (1 real change made) . replace county="liberty" if state=="florida" & city_balance_total =="total" > & race =="nonwhite" & births__total ==20 & births_of_residents_of_area__ > att ==6 (1 real change made) . replace county="liberty" if state=="florida" & city_balance_total =="total" > & race =="total" & births__total ==84 & births_of_residents_of_area__att > ==62 (1 real change made) . replace state="montana" if state=="missouri" & county=="liberty" (1 real change made) . replace state="montana" if state=="missouri" & county=="park" (1 real change made) . replace county="thomas" if state=="georgia" & county=="thomasville" & city_b > alance_total=="balance of county" (3 real changes made) . replace county="thomas" if state=="georgia" & county=="thomasville" & city_b > alance_total=="total" (3 real changes made) . replace city_balance_total="thomasville" if state=="georgia" & county=="thom > as" & city_balance_total=="total" & race=="total" & births__total==536 (1 real change made) . replace city_balance_total="thomasville" if state=="georgia" & county=="thom > as" & city_balance_total=="total" & race=="white" & births__total==280 (1 real change made) . replace city_balance_total="thomasville" if state=="georgia" & county=="thom > as" & city_balance_total=="total" & race=="nonwhite" & births__total==256 (1 real change made) . 
replace city_balance_total="indianapolis" if state=="indiana" & county=="mar > ion" & city_balance_total=="total" & births__total==10084 (1 real change made) . replace city_balance_total="indianapolis" if state=="indiana" & county=="mar > ion" & city_balance_total=="total" & births__total==3072 (1 real change made) . replace city_balance_total="independence" if state=="kansas" & county=="mont > gomery" & city_balance_total=="total" & race=="white" & births__total==210 (1 real change made) . replace city_balance_total="independence" if state=="kansas" & county=="mont > gomery" & city_balance_total=="total" & race=="nonwhite" & births__total==48 (1 real change made) . replace city_balance_total="balance of county" if state=="maryland" & county > =="montgomery" & city_balance_total=="total" & race=="white" & births__total > ==6366 (1 real change made) . replace city_balance_total="balance of county" if state=="maryland" & county > =="montgomery" & city_balance_total=="total" & race=="nonwhite" & births__to > tal==396 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="adams" & city_balance_total=="total" & race=="white" & births__total== > 172 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="adams" & city_balance_total=="total" & race=="nonwhite" & births__tota > l==208 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="coahoma" & city_balance_total=="total" & race=="white" & births__total > ==124 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="coahoma" & city_balance_total=="total" & race=="nonwhite" & births__to > tal==912 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="forrest" & city_balance_total=="total" & race=="white" & births__total > ==338 (1 real change made) . 
replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="forrest" & city_balance_total=="total" & race=="nonwhite" & births__to > tal==138 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="hinds" & city_balance_total=="total" & race=="white" & births__total== > 962 (1 real change made) . replace city_balance_total="balance of county" if state=="mississippi" & cou > nty=="hinds" & city_balance_total=="total" & race=="nonwhite" & births__tota > l==892 (1 real change made) . replace state="montana" if state=="missouri" & county=="total" & city_balanc > e_total=="total" & race=="total" & births__total==17732 (1 real change made) . replace state="montana" if state=="missouri" & county=="total" & city_balanc > e_total=="total" & race=="white" & births__total==16702 (1 real change made) . replace state="montana" if state=="missouri" & county=="total" & city_balanc > e_total=="total" & race=="nonwhite" & births__total==1030 (1 real change made) . replace state="montana" if state=="missouri" & county=="carter" & page__of_p > df_==135 (1 real change made) . replace state="montana" if state=="missouri" & county=="jefferson" & city_ba > lance_total=="total" & race=="total" & births__total==66 (1 real change made) . replace state="montana" if state=="missouri" & county=="lincoln" & city_bala > nce_total=="total" & race=="total" & births__total==360 (1 real change made) . replace state="montana" if state=="missouri" & county=="madison" & city_bala > nce_total=="total" & race=="total" & births__total==96 (1 real change made) . replace city_balance_total="roselle" if state=="new jersey" & county=="union > " & city_balance_total=="total" & race=="white" & births__total==360 (1 real change made) . replace city_balance_total="roselle" if state=="new jersey" & county=="union > " & city_balance_total=="total" & race=="nonwhite" & births__total==74 (1 real change made) . 
replace county="alexander" if state=="north carolina" & county=="alamance" & > city_balance_total=="total" & race=="total" & births__total==340 (1 real change made) . replace county="onslow" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="total" & births__total==3264 (1 real change made) . replace county="onslow" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="white" & births__total==2868 (1 real change made) . replace county="onslow" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="nonwhite" & births__total==396 (1 real change made) . replace county="orange" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="total" & births__total==974 (1 real change made) . replace county="orange" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="white" & births__total==638 (1 real change made) . replace county="orange" if state=="north carolina" & county=="northampton" & > city_balance_total=="total" & race=="nonwhite" & births__total==336 (1 real change made) . replace county="pamlico" if state=="north carolina" & county=="northampton" > & city_balance_total=="total" & race=="total" & births__total==232 (1 real change made) . replace county="pamlico" if state=="north carolina" & county=="northampton" > & city_balance_total=="total" & race=="white" & births__total==98 (1 real change made) . replace county="pamlico" if state=="north carolina" & county=="northampton" > & city_balance_total=="total" & race=="nonwhite" & births__total==134 (1 real change made) . replace county="union" if state=="north carolina" & county=="tyrrell" & city > _balance_total=="total" & race=="total" & births__total==1142 (1 real change made) . 
replace county="union" if state=="north carolina" & county=="tyrrell" & city > _balance_total=="total" & race=="white" & births__total==800 (1 real change made) . replace county="union" if state=="north carolina" & county=="tyrrell" & city > _balance_total=="total" & race=="nonwhite" & births__total==342 (1 real change made) . replace city_balance_total="coffeyville" if state=="kansas" & county=="montg > omery" & city_balance_total=="total" & race=="white" & births__total= > =368 (1 real change made) . replace city_balance_total="coffeyville" if state=="kansas" & county=="montg > omery" & city_balance_total=="total" & race=="nonwhite" & births__total==58 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="acadia" if county=="acadie" & state=="louisiana" (9 real changes made) . 
replace county="adams" if county=="adems" & state=="illinois" (3 real changes made) . replace county="allegheny" if county=="allegbeny" & state=="pennsylvania" (37 real changes made) . replace county="emanuel" if county=="eeanuel" & state=="georgia" (3 real changes made) . replace county="manatee" if county=="manstee" & state=="florida" (9 real changes made) . replace county="maricopa" if county=="mari copa" & state=="arizona" (8 real changes made) . replace county="mississippi" if county=="missicsippi" & state=="missouri" (3 real changes made) . replace county="o'brien" if county=="obrien" & state=="iowa" (1 real change made) . replace county="pulaski" if county=="pulacki" & state=="indiana" (1 real change made) . replace county="sangamon" if county=="sangemon" & state=="illinois" (3 real changes made) . replace county="santa rosa" if county=="senta rosa" & state=="florida" (1 real change made) . replace county="suwannee" if county=="suwennee" & state=="florida" (3 real changes made) . replace county="tallahatchie" if county=="tallahatchfe" & state=="mississipp > i" (3 real changes made) . replace county="umatilla" if county=="umstilla" & state=="oregon" (3 real changes made) . replace county="wabash" if county=="wabach" & state=="illinois" (1 real change made) . replace county="yamhill" if county=="yambill" & state=="oregon" (1 real change made) . replace county="vernon" if county=="vernoh" & state=="louisiana" (3 real changes made) . replace state="montana" if county=="beaverhead" & state=="missouri" (1 real change made) . replace state="montana" if county=="big horn" & state=="missouri" (3 real changes made) . replace state="montana" if county=="blaine" & state=="missouri" (3 real changes made) . replace state="montana" if county=="broadwater" & state=="missouri" (1 real change made) . replace state="montana" if county=="carbon" & state=="missouri" (1 real change made) . replace state="montana" if county=="cascade" & state=="missouri" (3 real changes made) . 
replace state="montana" if county=="chouteau" & state=="missouri" (1 real change made) . replace state="montana" if county=="custer" & state=="missouri" (1 real change made) . replace state="montana" if county=="daniels" & state=="missouri" (1 real change made) . replace state="montana" if county=="dawson" & state=="missouri" (1 real change made) . replace state="montana" if county=="deer lodge" & state=="missouri" (3 real changes made) . replace state="montana" if county=="fallon" & state=="missouri" (1 real change made) . replace state="montana" if county=="fergus" & state=="missouri" (1 real change made) . replace state="montana" if county=="flathead" & state=="missouri" (1 real change made) . replace state="montana" if county=="gallatin" & state=="missouri" (3 real changes made) . replace state="montana" if county=="garfield" & state=="missouri" (1 real change made) . replace state="montana" if county=="glacier" & state=="missouri" (3 real changes made) . replace state="montana" if county=="golden valley" & state=="missouri" (1 real change made) . replace state="montana" if county=="granite" & state=="missouri" (1 real change made) . replace state="montana" if county=="hill" & state=="missouri" (1 real change made) . replace state="montana" if county=="judith basin" & state=="missouri" (1 real change made) . replace state="montana" if county=="lake" & state=="missouri" (3 real changes made) . replace state="montana" if county=="lewis and clark" & state=="missouri" (3 real changes made) . replace state="montana" if county=="mccone" & state=="missouri" (1 real change made) . replace state="montana" if county=="meagher" & state=="missouri" (1 real change made) . replace state="montana" if county=="mineral" & state=="missouri" (1 real change made) . replace state="montana" if county=="missoula" & state=="missouri" (3 real changes made) . replace state="montana" if county=="musselshell" & state=="missouri" (1 real change made) . . 
*correct data entry errors found while checking that county totals sum to st > ate totals . replace births_of_residents_of_area__att=408 if births_of_residents_of_ar > ea__att==409 & state=="georgia" & county=="gordon" & city_b > alance_total=="total" & race=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . *the additional contradictions are counties in massachusetts, since data for > massachusetts is only shown for the state as a whole in the year 1956, as s > pecified in footnote 8 on page 185 . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. 
park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc"|state=="mass > achusetts" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . 
label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1956 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | bir~_h_p | | 83 | alabama | total | total | total | 84062 | 66318 | |--------------------------------------------------------------------| | bir~nh_p | births_m | year | | 3720 | 13742 | 1956 | +--------------------------------------------------------------------+ . desc Contains data from natality1956.dta obs: 8,143 vars: 10 size: 1,091,162 (89.6% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ int %8.0g page of pdf state str20 %20s state county str52 %52s county sub_county str32 %32s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_p int %8.0g births by residence: physician not in hospital births_m int %8.0g births by residence: midwife year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf_ Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 8143 133.695 29.52136 83 185 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 8143 2341.056 12993.88 0 346938 births_h_p | 8143 2229.252 12642.37 0 343274 births_nh_p | 8143 46.4281 276.792 0 9042 births_m | 8143 60.4755 528.3245 0 17108 year | 8143 1956 0 1956 1956 . saveold clean_natality1956.dta,replace file clean_natality1956.dta saved . clear . . ** . *1957 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1957_1.cv.pdf . *table 24 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1957.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 8144 149.6983 29.5211 99 201 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 8108 2402.789 13403.85 2 359240 births_of_~t | 8110 2301.473 13078.25 2 355778 births_of_~0 | 6329 52.87739 278.1449 2 8122 births_of_~1 | 3849 118.1663 704.7255 2 15900 . desc Contains data from natality1957.dta obs: 8,145 vars: 9 size: 1,050,705 (90.0% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str51 %51s city_balance_~l str32 %32s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g births_of_res~1 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (8144 real changes made) . replace state=lower(state) (8144 real changes made) . 
replace city_balance_total=lower(city_balance_total) (8145 real changes made) . replace race=lower(race) (8145 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 37 . replace births__total=0 if births__total ==. (37 real changes made) . count if births_of_residents_of_area__att==. 35 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (35 real changes made) . count if births_of_residents_of_area__at0==. 1816 . replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (1816 real changes made) . count if births_of_residents_of_area__at1==. 4296 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (4296 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (2 missing values generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births__total=196 if state=="georgia" & county=="lee" & city_balance > _total=="total" & race=="total" (1 real change made) . replace births__total=44 if state=="georgia" & county=="lee" & city_balance_ > total=="total" & race=="white" (1 real change made) . replace births__total=3466 if state=="indiana" & county=="lake" & city_balan > ce_total=="balance of county" & race=="total" (1 real change made) . replace births__total=2164 if state=="indiana" & county=="la porte" & city_b > alance_total=="total" & race=="total" (1 real change made) . replace births__total=796 if state=="indiana" & county=="la porte" & city_ba > lance_total=="michigan city" & race=="total" (1 real change made) . replace births__total=550 if state=="indiana" & county=="la porte" & city_ba > lance_total=="la porte" & race=="total" (1 real change made) . 
replace births__total=332 if state=="indiana" & county=="lawrence" & city_ba > lance_total=="bedford" & race=="total" (1 real change made) . replace births__total=554 if state=="indiana" & county=="lawrence" & city_ba > lance_total=="balance of county" & race=="total" (1 real change made) . drop if state=="" (1 observation deleted) . *drops one empty observation . replace county="ellis" if county=="ell?" (1 real change made) . duplicates report Duplicates in terms of all variables -------------------------------------- copies | observations surplus ----------+--------------------------- 1 | 8142 0 2 | 2 1 -------------------------------------- . duplicates list Duplicates in terms of all variables +----------------------------------------------------------------+ | obs: | page__~_ | state | county | city_b~l | race | | 6245 | 177 | south carolina | barnwell | total | white | |----------------------------------------------------------------| | births~l | births~t | births~0 | births~1 | | 274 | 268 | 4 | 2 | +----------------------------------------------------------------+ +----------------------------------------------------------------+ | obs: | page__~_ | state | county | city_b~l | race | | 6246 | 177 | south carolina | barnwell | total | white | |----------------------------------------------------------------| | births~l | births~t | births~0 | births~1 | | 274 | 268 | 4 | 2 | +----------------------------------------------------------------+ . duplicates drop Duplicates in terms of all variables (1 observation deleted) . *drops one observation which appears to have been entered twice . replace city_balance_total="heyward" if state=="california" & county=="alame > da" & city_balance_total=="albany" & births__total==2546 (1 real change made) . replace race="nonwhite" if state=="mississippi" & county=="claiborne" & city > _balance_total=="total" & race=="total" & births__total==252 (1 real change made) . 
replace race="white" if state=="mississippi" & county=="claiborne" & city_ba > lance_total=="total" & race=="total" & births__total==48 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==5550 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5106 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8198 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2732 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2374 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2648 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2520 (1 real change made) . replace city_balance_total="milwaukee" if state=="wisconsin" & county=="milw > aukee" & city_balance_total=="total" & race=="white" & births__total==17318 (1 real change made) . replace city_balance_total="milwaukee" if state=="wisconsin" & county=="milw > aukee" & city_balance_total=="total" & race=="total" & births__total==19850 (1 real change made) . replace city_balance_total="milwaukee" if state=="wisconsin" & county=="milw > aukee" & city_balance_total=="total" & race=="nonwhite" & births__total==253 > 2 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (107 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . 
replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="calhoun" if state=="illinois" & county=="calboun" (1 real change made) . replace county="cross" if state=="arkansas" & county=="crosa" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="duval" if state=="texas" & county=="duvel" (1 real change made) . replace county="hubbard" if state=="minnesota" & county=="houbbard" (1 real change made) . replace county="st john the baptist" if state=="louisiana" & county=="john t > he baptist" (3 real changes made) . replace county="merrick" if state=="nebraska" & county=="marrick" (1 real change made) . replace county="mcintosh" if state=="north dakota" & county=="mcintoch" (1 real change made) . replace county="nemaha" if state=="kansas" & county=="nemaba" (1 real change made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="rooks" if state=="kansas" & county=="roocks" (1 real change made) . replace county="san francisco" if county=="san francisco coextensive with sa > n francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="upshur" if state=="texas" & county=="upahur" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . 
replace city_balance_total="total" if state=="california" & county=="san fra > ncisco" & city_balance_total=="totol" (3 real changes made) . replace births__total=162 if births__total==138 & state=="georgia" > & county=="lincoln" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=146 if births_of_residents_of_ar > ea__att==148 & state=="colorado" & county=="eagle" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=80 if births_of_residen > ts_of_area__att==90 & state=="idaho" & county=="valle > y" & city_balance_total=="total" & race=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=272 if births__total==172 & state=="georgia" > & county=="liberty" & race=="white" & city_balance_total=="t > otal" (1 real change made) . replace births__total=214 if births__total==819 & state=="georgia" > & county=="liberty" & race=="nonwhite" & city_balance_total=="t > otal" (1 real change made) . replace births__total=6748 if births__total==6743 & state=="oklahoma" > & county=="total" & race=="nonwhite" & city_balance_total=="t > otal" (1 real change made) . replace births_of_residents_of_area__att=266 if births_of_residents_of_ar > ea__att==286 & state=="north carolina" & county=="hertford" > & race=="nonwhite" & city_balance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . *the additional contradictions are counties in massachusetts, since data for > massachusetts is only shown for the state as a whole in the year 1957, as s > pecified in footnote 7 on page 201 . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . 
merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc"|state=="mass > achusetts" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1957 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | bir~_h_p | | 99 | alabama | total | total | total | 84052 | 67882 | |--------------------------------------------------------------------| | bir~nh_p | births_m | year | | 3052 | 12914 | 1957 | +--------------------------------------------------------------------+ . desc Contains data from natality1957.dta obs: 8,143 vars: 10 size: 1,083,019 (89.7% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ int %8.0g page of pdf state str20 %20s state county str51 %51s county sub_county str32 %32s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_p int %8.0g births by residence: physician not in hospital births_m int %8.0g births by residence: midwife year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: page_of_pdf_ Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 8143 149.695 29.52136 99 201 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 8143 2393.344 13375.85 0 359240 births_h_p | 8143 2292.109 13052.55 0 355778 births_nh_p | 8143 41.09751 246.1957 0 8122 births_m | 8143 55.85411 488.0549 0 15900 year | 8143 1957 0 1957 1957 . saveold clean_natality1957.dta,replace file clean_natality1957.dta saved . clear . . ** . *1958 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1958_1.cv.pdf . *table 25 . . 
*hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1958.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 8141 165.6922 29.52162 115 217 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 8128 2371.972 13212.24 2 360662 births_of_~t | 8125 2279.975 12920.23 2 357186 births_of_~0 | 6070 47.8201 244.1332 2 6018 births_of_~1 | 3801 112.5593 668.3932 2 14960 . desc Contains data from natality1958.dta obs: 8,141 vars: 9 size: 1,058,330 (89.9% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ int %8.0g state str20 %20s county str52 %52s city_balance_~l str32 %32s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g births_of_res~1 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (8141 real changes made) . replace state=lower(state) (8141 real changes made) . replace city_balance_total=lower(city_balance_total) (8138 real changes made) . replace race=lower(race) (8141 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 13 . replace births__total=0 if births__total ==. (13 real changes made) . count if births_of_residents_of_area__att==. 16 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (16 real changes made) . count if births_of_residents_of_area__at0==. 2071 . 
replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (2071 real changes made) . count if births_of_residents_of_area__at1==. 4340 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (4340 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . replace births_of_residents_of_area__att=3016 if state=="kansas" & county==" > johnson" & city_balance_total=="total" & race=="total" (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==7760 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==5314 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2446 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==5062 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2306 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2756 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2372 (1 real change made) . replace city_balance_total="total" if state=="california" & county=="santa c > lara" & births__total==14320 (1 real change made) . replace city_balance_total="balance of city" if state=="north carolina" & co > unty=="rockingham" & births__total==992 (1 real change made) . 
replace city_balance_total="balance of city" if state=="north carolina" & co > unty=="rockingham" & births__total==300 (1 real change made) . // add in observation for nonwhite, Woodford, KY, which was completely left > out . local onemore = _N+1 . set obs `onemore' obs was 8141, now 8142 . replace page__of_pdf_ = 150 if _n==_N (1 real change made) . replace state = "kentucky" if _ > n==_N (1 real change made) . replace county = "woodford" if _n==_N (1 real change made) . replace city_balance_total = "total" if _n==_N (1 real change made) . replace race = "nonwhite" if _ > n==_N (1 real change made) . replace births__total = 50 if _ > n==_N (1 real change made) . replace births_of_residents_of_area__att = 48 if _n==_N (1 real change made) . replace births_of_residents_of_area__at0 = 2 if _n==_N (1 real change made) . replace births_of_residents_of_area__at1 = 0 if _n==_N (1 real change made) . gen order_var = _n . sort state order_var, stable . drop order_var . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (110 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="appanoose" if state=="iowa" & county=="appanocse" (1 real change made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="cherokee" if state=="iowa" & county=="cherckee" (1 real change made) . replace county="gloucester" if state=="new jersey" & county=="cloucester" (5 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coextensive with denver (city)" > & state=="colorado" (1 real change made) . replace county="franklin" if state=="idaho" & county=="frenklin" (1 real change made) . 
replace county="lincoln" if state=="north carolina" & county=="idncoln" (3 real changes made) . replace county="oconee" if state=="georgia" & county=="oconec" (3 real changes made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coextensive with phi > ladelphia (city)" & state=="pennsylvania" (3 real changes made) . replace county="richardson" if state=="nebraska" & county=="richerdson" (1 real change made) . replace county="rooks" if state=="kansas" & county=="rocks" (1 real change made) . replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="unicoi" if state=="tennessee" & county=="unici" (1 real change made) . replace county="washita" if state=="oklahoma" & county=="washite" (1 real change made) . replace county="yakima" if state=="washington" & county=="yakina" (3 real changes made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==192 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==406 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==2018 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==418 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==480 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==74 (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" & birth > s__total==524 (1 real change made) . 
replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==5242 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==2664 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==2578 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==3242 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==1506 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==1736 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==2000 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==1158 (1 real change made) . replace county="hinds" if state=="mississippi" & county=="harrison" & births > __total==842 (1 real change made) . replace county="madison" if state=="louisiana" & county=="livingston" & city > _balance_total=="lison" (3 real changes made) . replace city_balance_total="total" if state=="louisiana" & county=="madison" (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace births_of_residents_of_area__att=186 if births_of_residents_of_ar > ea__att==185 & state=="georgia" & county=="wilkinson" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=126 if births_of_residents_of_ar > ea__att==128 & state=="kansas" & county=="gove" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=886 if births_of_residents_of_ar > ea__att==986 & state=="north dakota" & county=="burleigh" & city_b > alance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__att=66 if births_of_residen > ts_of_area__att==86 & state=="north dakota" & county=="slope" > & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1268 if births_of_residents_of_ar > ea__att==1269 & state=="washington" & county=="cowlitz" > & city_balance_total=="total" & race=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . *the additional contradictions are counties in massachusetts, since data for > massachusetts is only shown for the state as a whole in the year 1958, as s > pecified in footnote 7 on page 217 . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. 
park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc"|state=="mass > achusetts" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . 
label var births "births by residence"
. rename births_of_residents_of_area__att births_h_p
. label var births_h_p "births by residence: physician in hospital"
. rename births_of_residents_of_area__at0 births_nh_p
. label var births_nh_p "births by residence: physician not in hospital"
. rename births_of_residents_of_area__at1 births_m
. label var births_m "births by residence: midwife"
. 
. *generate year variable
. gen year=1958
. label var year "year"
. 
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
. 
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |      115 | alabama |  total |    total | total |  82428 |    66576 |
     |--------------------------------------------------------------------|
     |       bir~nh_p       |       births_m       |         year         |
     |         3060         |        12516         |         1958         |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1958.dta
  obs:         8,142
 vars:            10
 size:     1,091,028 (89.6% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str32  %32s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:  state
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      8142    165.6902    29.52032        115        217
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      8142    2367.899    13201.24          0     360662
  births_h_p |      8142    2275.576    12907.13          0     357186
 births_nh_p |      8142    35.65095     211.815          0       6018
    births_m |      8142    52.54704    460.0918          0      14960
        year |      8142        1958           0       1958       1958

. saveold clean_natality1958.dta,replace
file clean_natality1958.dta saved

. clear
. 
. **
. *1959 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1959_1.cv.pdf
. *table 25
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
. 
. clear
. use natality1959.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |      8420    145.8451    29.51185         95        197
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      8298    2367.334    13160.52          2     360820
births_of_~t |      8295    2283.836    12894.61          2     357556
births_of_~0 |      5896    43.52714    220.7442          1       4982
births_of_~1 |      3646    111.4772    656.9089          2      14720

. desc

Contains data from natality1959.dta
  obs:         8,420
 vars:             9
 size:     1,094,600 (89.6% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   int    %8.0g
state           str20  %20s
county          str52  %52s
city_balance_~l str32  %32s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
births_of_res~1 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(8420 real changes made)
. replace state=lower(state)
(8337 real changes made)
. 
replace city_balance_total=lower(city_balance_total) (8420 real changes made) . replace race=lower(race) (8420 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 122 . replace births__total=0 if births__total ==. (122 real changes made) . count if births_of_residents_of_area__att==. 125 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (125 real changes made) . count if births_of_residents_of_area__at0==. 2524 . replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (2524 real changes made) . count if births_of_residents_of_area__at1==. 4774 . replace births_of_residents_of_area__at1 =0 if births_of_residents_of_area > __at1 ==. (4774 real changes made) . . *check that all pdf pages appear to be in the data . sort page__of_pdf_ . gen temp=page__of_pdf_[_n]-page__of_pdf_[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp . . *clean data entry errors . append using natality1959_append.dta (note: births_of_residents_of_area__at1 is str1 in using data but will be int > now) . *one county that was not originally entered . replace county=lower(county) (1 real change made) . replace state=lower(state) (1 real change made) . replace city_balance_total=lower(city_balance_total) (1 real change made) . replace race=lower(race) (1 real change made) . replace births_of_residents_of_area__at0=6 if state=="texas" & county=="gre > gg" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . duplicates report Duplicates in terms of all variables -------------------------------------- copies | observations surplus ----------+--------------------------- 1 | 8415 0 2 | 6 3 -------------------------------------- . 
duplicates list

Duplicates in terms of all variables

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      1 | 8028 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |  nonwhite  |       174  |       144  |         2  |        28  |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      1 | 8031 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |  nonwhite  |       174  |       144  |         2  |        28  |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      2 | 8026 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |     total  |       598  |       568  |         2  |        28  |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      2 | 8029 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |     total  |       598  |       568  |         2  |        28  |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      3 | 8027 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |     white  |       424  |       424  |         0  |         0  |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | group: | obs: | page__~_ |    state |        county | city_b~l |
  |      3 | 8030 |      192 | virginia | south norfolk |    total |
  |----------------------------------------------------------------|
  |      race  |  births~l  |  births~t  |  births~0  |  births~1  |
  |     white  |       424  |       424  |         0  |         0  |
  +----------------------------------------------------------------+

. duplicates drop

Duplicates in terms of all variables

(3 observations deleted)
. *drops three observations which appear to have been entered twice
. replace state="iowa" if state=="indiana" & county=="adair"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="allamakee"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="appanoose"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="audubon"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="black hawk"
(4 real changes made)
. replace state="iowa" if state=="indiana" & county=="bremer"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="buchanan"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="buena vista"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="butler"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="calhoun"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="cedar"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="cerro gordo"
(3 real changes made)
. replace state="iowa" if state=="indiana" & county=="cherokee"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="chickasaw"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="clarke"
(1 real change made)
. replace state="iowa" if state=="indiana" & county=="clayton"
(1 real change made)
. 
replace state="iowa" if state=="indiana" & county=="dallas" (1 real change made) . replace state="iowa" if state=="indiana" & county=="davis" (1 real change made) . replace state="iowa" if state=="indiana" & county=="des moines" (3 real changes made) . replace state="iowa" if state=="indiana" & county=="dickinson" (1 real change made) . replace state="iowa" if state=="indiana" & county=="dubuque" (3 real changes made) . replace state="iowa" if state=="indiana" & county=="emmet" (1 real change made) . replace state="iowa" if state=="indiana" & county=="fremont" (1 real change made) . replace state="iowa" if state=="indiana" & county=="adams" & births__total== > 142 (1 real change made) . replace state="iowa" if state=="indiana" & county=="benton" & births__total= > =512 (1 real change made) . replace state="iowa" if state=="indiana" & county=="boone" & births__total== > 494 (1 real change made) . replace state="iowa" if state=="indiana" & county=="boone" & births__total== > 252 (1 real change made) . replace state="iowa" if state=="indiana" & county=="boone" & births__total== > 242 (1 real change made) . replace state="iowa" if state=="indiana" & county=="cass" & births__total==3 > 86 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clay" & births__total==4 > 30 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clinton" & births__total > ==1264 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clinton" & births__total > ==744 (1 real change made) . replace state="iowa" if state=="indiana" & county=="clinton" & births__total > ==520 (1 real change made) . replace state="iowa" if state=="indiana" & county=="crawford" & births__tota > l==436 (1 real change made) . replace state="iowa" if state=="indiana" & county=="decatur" & births__total > ==192 (1 real change made) . replace state="iowa" if state=="indiana" & county=="delaware" & births__tota > l==532 (1 real change made) . 
replace state="iowa" if state=="indiana" & county=="fayette" & births__total > ==628 (1 real change made) . replace state="iowa" if state=="indiana" & county=="floyd" & births__total== > 492 (1 real change made) . replace state="iowa" if state=="indiana" & county=="floyd" & births__total== > 210 (1 real change made) . replace state="iowa" if state=="indiana" & county=="floyd" & births__total== > 282 (1 real change made) . replace state="iowa" if state=="indiana" & county=="franklin" & births__tota > l==316 (1 real change made) . replace state="iowa" if state=="indiana" & county=="greene" & births__total= > =312 (1 real change made) . replace state="iowa" if state=="indiana" & county=="carroll" & births__total > ==684 (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="adams" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="ashland" (5 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="barron" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="bayfield" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="brown" (3 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="buffalo" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="burnett" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="calumet" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="chippewa" (4 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="clark" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="columbia" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="crawford" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="dane" (3 real changes made) . 
replace state="wisconsin" if state=="west virginia" & county=="dodge" (4 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="door" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="douglas" (3 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="dunn" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="eau claire" (4 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="florence" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="fond du lac" (3 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="forest" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="green lake" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="green" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="iowa" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="iron" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="juneau" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="kenosha" (3 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="kewaunee" (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="la crosse" (3 real changes made) . replace state="wisconsin" if state=="west virginia" & county=="grant" & birt > hs__total==1194 (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="jackson" & bi > rths__total==320 (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="jefferson" & > births__total==1134 (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="jefferson" & > births__total==234 (1 real change made) . 
replace state="wisconsin" if state=="west virginia" & county=="jefferson" & > births__total==288 (1 real change made) . replace state="wisconsin" if state=="west virginia" & county=="jefferson" & > births__total==900 (1 real change made) . replace state="louisiana" if state=="kentucky" & county=="acadia" (9 real changes made) . replace state="louisiana" if state=="kentucky" & county=="ascension" (3 real changes made) . replace state="louisiana" if state=="kentucky" & county=="allen" & births__t > otal==580 (1 real change made) . replace state="louisiana" if state=="kentucky" & county=="allen" & births__t > otal==436 (1 real change made) . replace state="louisiana" if state=="kentucky" & county=="allen" & births__t > otal==144 (1 real change made) . replace county="mclennan" if state=="texas" & county=="waco" (6 real changes made) . replace city_balance_total="waco" if state=="texas" & county=="mclennan" & b > irths__total==544 (1 real change made) . replace city_balance_total="waco" if state=="texas" & county=="mclennan" & b > irths__total==1970 (1 real change made) . replace city_balance_total="waco" if state=="texas" & county=="mclennan" & b > irths__total==2514 (1 real change made) . replace city_balance_total="san diego" if state=="california" & county=="san > diego" & city_balance_total=="total" & births__total==13198 (1 real change made) . replace city_balance_total="san diego" if state=="california" & county=="san > diego" & city_balance_total=="total" & births__total==1460 (1 real change made) . replace state="iowa" if state=="indiana" & births__total==64616 (1 real change made) . replace state="iowa" if state=="indiana" & births__total==63632 (1 real change made) . replace state="iowa" if state=="indiana" & births__total==984 (1 real change made) . replace state="louisiana" if state=="kentucky" & births__total==90968 (1 real change made) . replace state="louisiana" if state=="kentucky" & births__total==55296 (1 real change made) . 
replace state="louisiana" if state=="kentucky" & births__total==35672 (1 real change made) . replace city_balance_total="total" if state=="new jersey" & county=="essex" > & city_balance_total=="balance of county" & births__total==20028 (1 real change made) . replace city_balance_total="total" if state=="new jersey" & county=="essex" > & city_balance_total=="balance of county" & births__total==14102 (1 real change made) . replace city_balance_total="total" if state=="new jersey" & county=="essex" > & city_balance_total=="balance of county" & births__total==5926 (1 real change made) . replace city_balance_total="greenville" if state=="texas" & county=="hunt" & > births__total==334 (1 real change made) . replace city_balance_total="greenville" if state=="texas" & county=="hunt" & > births__total==64 (1 real change made) . replace city_balance_total="balance of county" if state=="texas" & county==" > hunt" & births__total==300 (1 real change made) . replace city_balance_total="balance of county" if state=="texas" & county==" > hunt" & births__total==52 (1 real change made) . replace city_balance_total="beaumont" if state=="texas" & county=="jefferson > " & births__total==1952 (1 real change made) . replace city_balance_total="beaumont" if state=="texas" & county=="jefferson > " & births__total==1022 (1 real change made) . replace city_balance_total="port arthur" if state=="texas" & county=="jeffer > son" & births__total==1006 (1 real change made) . replace city_balance_total="port arthur" if state=="texas" & county=="jeffer > son" & births__total==522 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==9300 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==6610 (1 real change made) . replace county="norfolk (ind. 
city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2690 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4994 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2220 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2774 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2020 (1 real change made) . replace state="wisconsin" if state=="west virginia" & births__total==94934 (1 real change made) . replace state="wisconsin" if state=="west virginia" & births__total==3698 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (90 real changes made) . replace county="baltimore city" if county=="baltimore (city)" & state=="mary > land" (3 real changes made) . replace county="brooke" if state=="west virginia" & county=="brocke" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="dickinson" if state=="kansas" & county=="dickincon" (1 real change made) . replace county="du page" if state=="illinois" & county=="dul page" (5 real changes made) . replace county="morgan" if state=="west virginia" & county=="morgun" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mciean" (1 real change made) . replace county="neosho" if state=="kansas" & county=="necsho" (3 real changes made) . replace county="orleans" if county=="orleans, coextensive with new orleans" > & state=="louisiana" (3 real changes made) . 
replace county="san francisco" if county=="san francisco, coextensive with s > an francisco (city)" & state=="california" (3 real changes made) . replace county="somerset" if state=="maryland" & county=="scmerset" (3 real changes made) . replace county="st louis city" if county=="st louis (city)" & state=="missou > ri" (3 real changes made) . replace county="st genevieve" if state=="missouri" & county=="ste genevieve" (1 real change made) . replace county="philadelphia" if county=="philadelphia coextensive with phil > adelphia (city)" & state=="pennsylvania" (3 real changes made) . . *correct data entry errors found while checking that white+nonwhite=total . replace births__total=1898 if births__total==1899 & state=="florida" > & county=="hillsborough" & race=="nonwhite" & city_b > alance_total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park" & state=="idaho"|county=="yell > owstone national park (part)" & state=="montana"|county=="yellowstone nation > al park (part)" & state=="wyoming"|county=="yellowstone national park (total > )" & state=="wyoming"|county=="yellowstone national park" & state=="montana" > |county=="park (excl yell nat park)" & state=="montana"|county=="yellowstone > national park, part" & state=="montana"|county=="yellowstone national park, > part" & state=="wyoming"|county=="yellowstone national park, part" & state= > ="montana"|county=="yellowstone national park, total" & state=="wyoming"|cou > nty=="park" & state=="montana"|county=="yellowstone nat. park (part)" & stat > e=="idaho"|county=="yellowstone nat. park (part)" & state=="wyoming"|county= > ="yellowstone nat. park (total)" & state=="wyoming"|county=="new york city" > & state=="new york"|county=="bronx" & state=="new york"|county=="kings" & st > ate=="new york"|county=="new york" & state=="new york"|county=="queens" & st > ate=="new york"|county=="richmond" & state=="new york"|county=="ormsby" & st > ate=="nevada"|county=="carson city" & state=="nevada"|county=="los alamos" & > state=="new mexico"|county=="menominee" & state=="wisconsin"|county=="total > "|state=="alaska"|state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . *not applicable; see footnote 1 . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_p . label var births_nh_p "births by residence: physician not in hospital" . rename births_of_residents_of_area__at1 births_m . label var births_m "births by residence: midwife" . . *generate year variable . gen year=1959 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | bir~_h_p |
     |       95 | alabama |  total |    total | total |  82328 |    66722 |
     |--------------------------------------------------------------------|
     |       bir~nh_p       |       births_m       |         year         |
     |         2900         |        12496         |         1959         |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1959.dta
  obs:         8,418
 vars:            10
 size:     1,128,012 (89.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    int    %8.0g                  page of pdf
state           str20  %20s                   state
county          str52  %52s                   county
sub_county      str32  %32s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician in hospital
births_nh_p     int    %8.0g                  births by residence: physician not in hospital
births_m        int    %8.0g                  births by residence: midwife
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |      8418    145.8288    29.50249         95        197
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      8418    2333.469     13069.4          0     360820
  births_h_p |      8418    2250.353    12802.99          0     357556
 births_nh_p |      8418      30.488    185.8093          0       4982
    births_m |      8417    48.28205    435.8314          0      14720
        year |      8418        1959           0       1959       1959

. saveold clean_natality1959.dta,replace
file clean_natality1959.dta saved

. clear
. 
. **
. *1960 data
. **
. *http://nber15.nber.org/vital-stats-books/nat60_1.cv.pdf
. *table 3-1
. 
. *hand-checked excel file for ?'s (difficult-to-read characters)
. 
*corrected below if any
. 
. clear
. use natality1960.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      9293    2205.571    12486.93          2     372210
births_of_~t |      9291    2133.921    12247.29          2     368864
births_of_~0 |      7428    90.30479    585.1939          2      17186

. desc

Contains data from natality1960.dta
  obs:         9,312
 vars:             8
 size:     1,070,880 (89.8% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str17  %17s
county          str44  %44s
city_balance_~l str28  %28s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9310 real changes made)
. replace state=lower(state)
(9312 real changes made)
. replace city_balance_total=lower(city_balance_total)
(9312 real changes made)
. replace race=lower(race)
(9312 real changes made)
. 
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   19
. replace births__total=0 if births__total ==.
(19 real changes made)
. count if births_of_residents_of_area__att==.
   21
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(21 real changes made)
. count if births_of_residents_of_area__at0==.
 1884
. replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area
> __at0 ==.
(1884 real changes made)
. 
. *check that all pdf pages appear to be in the data
. 
gen temp=strpos(page__of_pdf_, "-") . gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace births__total=314 if state=="georgia" & county=="terrell" & city_bal > ance_total=="total" & race=="nonwhite" (1 real change made) . replace county="parmer" if state=="texas" & county=="parker" & births__total > ==242 (1 real change made) . replace county="white" if state=="illinois" & county=="wayne" & births__tota > l==370 (1 real change made) . replace county="white" if state=="indiana" & county=="wells" & births__total > ==464 (1 real change made) . replace births_of_residents_of_area__at0=196 if state=="florida" & county==" > escambia" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=134 if state=="florida" & county==" > putnam" & city_balance_total =="palatka" & race =="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=188 if state=="georgia" & county==" > grady" & city_balance_total =="total" & race =="white" (1 real change made) . replace births__total =312 if state=="georgia" & county=="terrell" & city_ba > lance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=2 if state=="illinois" & county=="c > ook" & city_balance_total =="evergreen park" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=66 if state=="louisiana" & county== > "natchitoches" & city_balance_total == "natchitoches" & race=="total" (1 real change made) . replace births__total =320 if state=="louisiana" & county=="tangipahoa" & c > ity_balance_total =="hammond" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=62 if state=="maryland" & county==" > anne arundel" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=382 if state=="nevada" & county=="w > ashoe" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total =6286 if state=="new york" & county=="monroe" & city_b > alance_total=="rochester" & race=="white" (1 real change made) . replace births__total =108 if state=="south dakota" & county=="faulk" & cit > y_balance_total =="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=188 if state=="texas" & county=="an > gelina" & city_balance_total =="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=468 if state=="texas" & county=="br > own" & city_balance_total=="total" & race=="total" (1 real change made) . replace births__total =168 if state=="texas" & county=="panola" & city_balan > ce_total=="total" & race=="white" (1 real change made) . replace births__total=9398 if state=="vermont" & county=="total" & city_bala > nce_total=="total" & race=="white" (1 real change made) . replace births__total =1146 if state=="vermont" & county=="chittenden" & cit > y_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=1138 if state=="vermont" & county== > "chittenden" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=8 if state=="vermont" & county=="ch > ittenden" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births__total =380 if state=="virginia" & county=="fauquier" & city_ > balance_total=="total" & race=="white" (1 real change made) . 
replace births_of_residents_of_area__att=6416 if state=="south carolina" & c > ounty=="charleston" & city_balance_total=="total" & race=="total" (1 real change made) . replace county="essex" if state=="vermont" & county=="essx" (1 real change made) . replace births__total =160 if state=="vermont" & county=="essex" & city_bal > ance_total =="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=156 if state=="vermont" & county==" > essex" & city_balance_total =="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=4 if state=="vermont" & county=="es > sex" & city_balance_total =="total" & race=="total" (0 real changes made) . replace county="fairfax" if state=="virginia" & county=="fsirfis" (8 real changes made) . replace births__total=318 if state=="virginia" & county=="fairfax" & city_ba > lance_total=="balance of county" & race=="nonwhite" (1 real change made) . replace city_balance_total="balance of county" if state=="texas" & county==" > collin" & city_balance_total=="total" & race=="white" & births__total==518 (1 real change made) . replace city_balance_total="balance of county" if state=="texas" & county==" > collin" & city_balance_total=="total" & race=="nonwhite" & births__total==72 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8554 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==6030 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2524 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4830 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2234 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2596 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==2028 (1 real change made) . replace city_balance_total="sandusky" if state=="ohio" & county=="erie" & ci > ty_balance_total=="total" & race=="white" & births__total==690 (1 real change made) . replace city_balance_total="sandusky" if state=="ohio" & county=="erie" & ci > ty_balance_total=="total" & race=="nonwhite" & births__total==136 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="okfuskee" if state=="oklahoma" & births__total==76 & births_ > of_residents_of_area__att==52 (1 real change made) . replace county="okfuskee" if state=="oklahoma" & births__total==140 & births > _of_residents_of_area__att==136 (1 real change made) . replace county="adams" if state=="colorado" & county=="adahs" (6 real changes made) . replace county="baldwin" if state=="georgia" & county=="balowin" (9 real changes made) . replace county="baylor" if state=="texas" & county=="bayldr" (1 real change made) . replace county="boone" if state=="west virginia" & county=="bcone" (1 real change made) . replace county="beauregard" if state=="louisiana" & county=="bealregard" (3 real changes made) . replace county="bladen" if state=="north carolina" & county=="blanden" (3 real changes made) . replace county="bosque" if state=="texas" & county=="bosgue" (1 real change made) . replace county="bowie" if state=="texas" & county=="bowte" (9 real changes made) . 
replace county="catoosa" if state=="georgia" & county=="catodsa" (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="colquitt" if state=="georgia" & county=="colouitt" (9 real changes made) . replace county="dane" if state=="wisconsin" & county=="dame" (3 real changes made) . replace county="de witt" if state=="texas" & county=="de sitt" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="floyd" if state=="texas" & county=="floyo" (1 real change made) . replace county="fort bend" if state=="texas" & county=="fort bead" (3 real changes made) . replace county="edwards" if state=="texas" & county=="edxaeds" (1 real change made) . replace county="franklin" if state=="vermont" & county=="franslin" (1 real change made) . replace county="hardeman" if state=="texas" & county=="haroeman" (3 real changes made) . replace county="hillsdale" if state=="michigan" & county=="hillsoale" (1 real change made) . replace county="hopkins" if state=="kentucky" & county=="hopins" (5 real changes made) . replace county="iroquois" if state=="illinois" & county=="iroguois" (1 real change made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . 
replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . 
replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="milam" if state=="texas" & county=="milan" (3 real changes made) . replace county="newton" if state=="texas" & county=="nexton" (3 real changes made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="parker" if state=="texas" & county=="parkcr" (1 real change made) . replace county="philadelphia" if county=="philadelphia, coex. with philadelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city" & state=="california" (3 real changes made) . replace county="schoolcraft" if state=="michigan" & county=="school craft" (1 real change made) . replace county="seminole" if state=="oklahoma" & county=="semincle" (7 real changes made) . replace county="somerset" if state=="maine" & county=="someret" (1 real change made) . replace county="valencia" if state=="new mexico" & county=="valengia" (7 real changes made) . replace county="wyoming" if state=="west virginia" & county=="wydming" (1 real change made) . replace county="erath" if state=="texas" & county=="esath" (1 real change made) . replace county="fisher" if state=="texas" & county=="fishee" (1 real change made) . replace state="district of columbia" if state=="dist. of columbia" state was str17 now str20 (3 real changes made) . replace county="franklin" if state=="massachusetts" & county=="frankl in" (1 real change made) . . 
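* sketch only, not part of the original run: the "mc <name>" fixes above all
* follow one pattern, so they could be collapsed into a single pass.
* assumption: every county name beginning with "mc " should lose that space
* in every state; the original code applies the fix one state/county pair at
* a time, which is safer if any exception exists.

```stata
* count the rows that still carry the spaced "mc " prefix
count if substr(county, 1, 3) == "mc "

* drop the space after the prefix, e.g. "mc lean" -> "mclean"
replace county = "mc" + substr(county, 4, .) if substr(county, 1, 3) == "mc "

* confirm that no spaced prefix remains
assert substr(county, 1, 3) != "mc "
```

* run at this point in the log, the count would be zero, because the
* line-by-line replacements have already been applied; the sketch is an
* alternative to those lines, not an extra step.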
*correct data entry errors found while checking that county totals sum to st > ate totals . replace race = "total" if state=="illinois" & county=="white" & city_balance > _total=="total" (1 real change made) . replace race = "total" if state=="indiana" & county=="white" & city_balance_ > total=="total" (1 real change made) . // changes for essex county, vermont, were made above (they had been changed > , but incorrectly, above) . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park" & state=="montana"|county=="park (e > xcl yell nat park)" & state=="montana"|county=="yellowstone national park, p > art" & state=="montana"|county=="yellowstone national park, part" & state==" > wyoming"|county=="yellowstone national park, part" & state=="montana"|county > =="yellowstone national park, total" & state=="wyoming"|county=="park" & sta > te=="montana"|county=="yellowstone nat. park (part)" & state=="idaho"|county > =="yellowstone nat. park (part)" & state=="wyoming"|county=="yellowstone nat > . 
park (total)" & state=="wyoming"|county=="new york city" & state=="new yor > k"|county=="bronx" & state=="new york"|county=="kings" & state=="new york"|c > ounty=="new york" & state=="new york"|county=="queens" & state=="new york"|c > ounty=="richmond" & state=="new york"|county=="ormsby" & state=="nevada"|cou > nty=="carson city" & state=="nevada"|county=="los alamos" & state=="new mexi > co"|county=="menominee" & state=="wisconsin"|county=="total"|state=="alaska" > |state=="hawaii"|state=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0 . assert temp==births__total . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . 
label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_ns . label var births_nh_ns "births by residence: attendant not in hospital and n > ot specified" . . *generate year variable . gen year=1960 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | births~p | | 3-1 | alabama | total | total | total | 80846 | 66272 | |--------------------------------------------------------------------| | births~s | year | | 14574 | 1960 | +--------------------------------------------------------------------+ . desc Contains data from natality1960.dta obs: 9,312 vars: 9 size: 1,136,064 (89.2% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ str4 %4s page of pdf state str20 %20s state county str44 %44s county sub_county str28 %28s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_ns int %8.0g births by residence: attendant not in hospital and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 0 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 9312 2201.072 12474.59 0 372210 births_h_p | 9312 2129.037 12233.89 0 368864 births_nh_ns | 9312 72.03565 523.9044 0 17186 year | 9312 1960 0 1960 1960 . saveold clean_natality1960.dta,replace file clean_natality1960.dta saved . clear . . ** . *1961 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1961_1.pdf . *table 3-1 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1961.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 0 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 9297 2208.689 12528.19 2 381606 births_of_~t | 9297 2140.993 12311.48 2 378464 births_of_~0 | 7186 86.76858 562.0893 2 16302 . desc Contains data from natality1961.dta obs: 9,313 vars: 8 size: 1,052,369 (90.0% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ str4 %4s state str17 %17s county str43 %43s city_balance_~l str27 %27s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (9313 real changes made) . replace state=lower(state) (9313 real changes made) . replace city_balance_total=lower(city_balance_total) (9313 real changes made) . replace race=lower(race) (9313 real changes made) . . *missing observations in the original data are zeros . 
*replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 16 . replace births__total=0 if births__total ==. (16 real changes made) . count if births_of_residents_of_area__att==. 16 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (16 real changes made) . count if births_of_residents_of_area__at0==. 2127 . replace births_of_residents_of_area__at0=0 if births_of_residents_of_area_ > _at0 ==. (2127 real changes made) . . *check that all pdf pages appear to be in the data . gen temp=strpos(page__of_pdf_,"-") . gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace city_balance_total="balance of city" if state=="texas" & county=="gr > egg" & births__total==272 (1 real change made) . replace city_balance_total="balance of city" if state=="texas" & county=="gr > egg" & births__total==172 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8770 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==6234 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2536 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4836 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2156 (1 real change made) . 
replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2680 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==1932 (1 real change made) . replace births_of_residents_of_area__at0=730 if state=="alabama" & county==" > dallas" & city_balance_total =="total" & race=="total" (1 real change made) . replace births__total=464 if state=="arkansas" & county=="jefferson" & city_ > balance_total=="pine bluff" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=18 if state=="colorado" & county==" > jefferson" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=132 if state=="florida" & county==" > leon" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=58 if state=="georgia" & county=="s > palding" & city_balance_total=="balance of county" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=266 if state=="georgia" & county==" > sumter" & city_balance_total=="total" & race=="nonwhite" (1 real change made) . replace births__total=6246 if state=="hawaii" & county=="honolulu" & city_ba > lance_total=="honolulu" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=6234 if state=="hawaii" & county==" > honolulu" & city_balance_total=="honolulu" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=12 if state=="hawaii" & county=="ho > nolulu" & city_balance_total=="honolulu" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__at0=18 if state=="indiana" & county=="m > adison" & city_balance_total=="total" & race=="total" (1 real change made) . 
replace births_of_residents_of_area__at0=218 if state=="louisiana" & county= > ="total" & city_balance_total=="total" & race=="white" (1 real change made) . replace births_of_residents_of_area__at0=8 if state=="maine" & county=="cumb > erland" & city_balance_total=="westbrook" & race=="total" (1 real change made) . replace births__total =248 if state=="maryland" & county=="calvert" & city_b > alance_total=="total" & race=="nonwhite" (1 real change made) . replace births_of_residents_of_area__att=3138 if state=="michigan" & county= > ="jackson" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__at0=6 if state=="missouri" & county=="t > exas" & city_balance_total=="total" & race=="total" (1 real change made) . replace births_of_residents_of_area__att=558 if state=="south carolina" & co > unty=="greenville" & city_balance_total=="balance of county" & race=="nonwhi > te" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (188 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="st louis" if state=="missouri" & county=="bellefontaine" (18 real changes made) . replace city_balance_total="bellefontaine neighbors" if state=="missouri" & > city_balance_total=="neighbors" (1 real change made) . replace county="cherokee" if state=="iowa" & county=="cheroxee" (1 real change made) . replace county="catoosa" if state=="georgia" & county=="catodsa" (1 real change made) . replace county="clearfield" if state=="pennsylvania" & county=="clearfielo" (3 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="colquitt" if state=="georgia" & county=="colouitt" (9 real changes made) . replace county="crawford" if state=="iowa" & county=="crawforo" (1 real change made) . 
replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="dickinson" if state=="iowa" & county=="dicxinson" (1 real change made) . replace county="flathead" if state=="montana" & county=="fuathead" (3 real changes made) . replace county="george" if state=="mississippi" & county=="gcorge" (3 real changes made) . replace county="guilford" if state=="north carolina" & county=="guilforo" (12 real changes made) . replace county="izard" if state=="arkansas" & county=="izaro" (1 real change made) . replace county="jackson" if state=="iowa" & county=="jacxson" (1 real change made) . replace county="kalkaska" if state=="michigan" & county=="kalxaska" (1 real change made) . replace county="kidder" if state=="north dakota" & county=="kioder" (1 real change made) . replace county="koochiching" if state=="minnesota" & county=="kodchiching" (1 real change made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . 
replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . 
replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philadelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="polk" if state=="iowa" & county=="polx" (8 real changes made) . replace county="pontotoc" if state=="mississippi" & county=="pontotdc" (3 real changes made) . replace county="presque isle" if state=="michigan" & county=="presoue isle" (1 real change made) . replace county="poweshiek" if state=="iowa" & county=="powesriek" (1 real change made) . replace county="stoddard" if state=="missouri" & county=="stoddaro" (1 real change made) . replace county="montgomery" if state=="tennessee" & county=="tennessee" (6 real changes made) . replace county="winona" if state=="minnesota" & county=="windna" (3 real changes made) . replace county="denver" if county=="denver, coex, with denver city" & state= > ="colorado" (3 real changes made) . replace county="floyd" if state=="iowa" & county=="floyo" (1 real change made) . replace county="jersey" if state=="illinois" & county=="jersev" (1 real change made) . replace county="san francisco" if county=="san francisco, coex with san fran > cisco city" & state=="california" (3 real changes made) . replace county="bayfield" if state=="wisconsin" & county=="bayfielo" (1 real change made) . replace county="parke" if state=="indiana" & county=="parxe" (1 real change made) . replace county="white" if state=="georgia" & county=="wheeler" & births__tot > al==166 (1 real change made) . replace state="district of columbia" if state=="dist. of columbia" state was str17 now str20 (3 real changes made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . 
replace race = "total" if state=="georgia" & county=="white" & city_balance_ > total=="total" (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. 
park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0 . assert temp==births__total . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . 
label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_ns . label var births_nh_ns "births by residence: attendant not in hospital and n > ot specified" . . *generate year variable . gen year=1961 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | births~p | | 3-2 | alabama | total | total | total | 80690 | 66568 | |--------------------------------------------------------------------| | births~s | year | | 14122 | 1961 | +--------------------------------------------------------------------+ . desc Contains data from natality1961.dta obs: 9,313 vars: 9 size: 1,117,560 (89.3% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ str4 %4s page of pdf state str20 %20s state county str43 %43s county sub_county str27 %27s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_ns int %8.0g births by residence: attendant not in hospital and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 0 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 9313 2204.897 12517.76 0 381606 births_h_p | 9313 2137.939 12301.28 0 378464 births_nh_ns | 9313 66.95802 495.0834 0 16302 year | 9313 1961 0 1961 1961 . saveold clean_natality1961.dta,replace file clean_natality1961.dta saved . clear . . ** . *1962 data . ** . *http://nber15.nber.org/vital-stats-books/vsus_1962_1.pdf . *table 2-1 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1962.dta . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page__of_p~_ | 0 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births__to~l | 9219 2144.055 12198.75 2 378880 births_of_~t | 9219 2083.257 11999.97 2 376168 births_of_~0 | 6826 81.23586 524.183 2 15196 . desc Contains data from natality1962.dta obs: 9,325 vars: 8 size: 1,091,025 (89.6% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ str5 %5s state str20 %20s county str44 %44s city_balance_~l str26 %26s race str8 %8s births__total long %12.0g births_of_res~t long %12.0g births_of_res~0 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (9323 real changes made) . replace state=lower(state) (9323 real changes made) . replace city_balance_total=lower(city_balance_total) (9325 real changes made) . replace race=lower(race) (9325 real changes made) . . *missing observations in the original data are zeros . 
*replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births__total==. 106 . replace births__total=0 if births__total ==. (106 real changes made) . count if births_of_residents_of_area__att==. 106 . replace births_of_residents_of_area__att =0 if births_of_residents_of_area > __att ==. (106 real changes made) . count if births_of_residents_of_area__at0==. 2499 . replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area > __at0 ==. (2499 real changes made) . . *check that all pdf pages appear to be in the data . gen temp=strpos(page__of_pdf_, "-") . gen newpagenumber=substr(page__of_pdf_,temp+1,.) (2 missing values generated) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte (2 missing values generated) . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (3 missing values generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . drop if state=="" (2 observations deleted) . *drops two empty observations . duplicates report Duplicates in terms of all variables -------------------------------------- copies | observations surplus ----------+--------------------------- 1 | 9321 0 2 | 2 1 -------------------------------------- . 
duplicates list

Duplicates in terms of all variables

  +----------------------------------------------------------------+
  | obs: | page__~_ | state | county | city_b~l |  race | births~l |
  | 8203 |     2-52 | texas | karnes |    total | total |      364 |
  |----------------------------------------------------------------|
  |                      births~t |                       births~0 |
  |                           314 |                             50 |
  +----------------------------------------------------------------+

  +----------------------------------------------------------------+
  | obs: | page__~_ | state | county | city_b~l |  race | births~l |
  | 8204 |     2-52 | texas | karnes |    total | total |      364 |
  |----------------------------------------------------------------|
  |                      births~t |                       births~0 |
  |                           314 |                             50 |
  +----------------------------------------------------------------+

. duplicates drop

Duplicates in terms of all variables

(1 observation deleted)

. *drops one observation which appears to have been entered twice
. replace state="north carolina" if state=="n. carolina"
(309 real changes made)

. replace state="north carolina" if state=="n.carolina"
(139 real changes made)

. replace births_of_residents_of_area__att=134 if state=="georgia" & county==
> "henry" & city_balance_total =="total" & race=="nonwhite"
(1 real change made)

. replace births__total=3198 if state=="missouri" & county=="jackson" & city_b
> alance_total=="kansas city, total" & race=="nonwhite"
(1 real change made)

. replace births_of_residents_of_area__att=52 if state=="virginia" & county==
> "craig" & city_balance_total=="total" & race=="total"
(1 real change made)

. replace city_balance_total="balance of county" if state=="alabama" & county=
> ="madison" & city_balance_total=="total" & race=="white" & births__total==94
> 6
(1 real change made)

. replace city_balance_total="balance of county" if state=="alabama" & county=
> ="lee" & city_balance_total=="total" & race=="nonwhite" & births__total==280
(1 real change made)

.
replace city_balance_total="balance of county" if state=="alabama" & county= > ="madison" & city_balance_total=="total" & race=="nonwhite" & births__total= > =362 (1 real change made) . replace city_balance_total="balance of county" if state=="alabama" & county= > ="lee" & city_balance_total=="total" & race=="white" & births__total==142 (1 real change made) . replace city_balance_total="balance of county" if state=="arizona" & county= > ="coconino" & city_balance_total=="total" & race=="white" & births__total==4 > 00 (1 real change made) . replace city_balance_total="balance of county" if state=="arizona" & county= > ="coconino" & city_balance_total=="total" & race=="nonwhite" & births__total > ==626 (1 real change made) . replace city_balance_total="balance of county" if state=="arkansas" & county > =="mississippi" & city_balance_total=="total" & race=="nonwhite" & births__t > otal==480 (1 real change made) . replace city_balance_total="balance of county" if state=="arkansas" & county > =="mississippi" & city_balance_total=="total" & race=="white" & births__tota > l==982 (1 real change made) . replace city_balance_total="balance of county" if state=="georgia" & county= > ="clayton" & city_balance_total=="total" & race=="white" & births__total==69 > 2 (1 real change made) . replace city_balance_total="balance of county" if state=="georgia" & county= > ="clayton" & city_balance_total=="total" & race=="nonwhite" & births__total= > =80 (1 real change made) . replace city_balance_total="balance of city" if state=="new jersey" & county > =="ocean" & city_balance_total=="total" & race=="total" & births__total==271 > 8 (1 real change made) . replace city_balance_total="galena park" if state=="texas" & county=="harris > " & city_balance_total=="total" & race=="white" & births__total==200 (1 real change made) . 
replace city_balance_total="galena park" if state=="texas" & county=="harris > " & city_balance_total=="total" & race=="nonwhite" & births__total==46 (1 real change made) . replace city_balance_total="houston" if state=="texas" & county=="harris" & > city_balance_total=="total" & race=="white" & births__total==18640 (1 real change made) . replace city_balance_total="houston" if state=="texas" & county=="harris" & > city_balance_total=="total" & race=="nonwhite" & births__total==7826 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="total" & births__total==8880 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="white" & births__total==6116 (1 real change made) . replace county="norfolk (ind. city)" if county=="norfolk" & state=="virginia > " & race=="nonwhite" & births__total==2764 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4762 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2182 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2580 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==1976 (1 real change made) . replace city_balance_total="seattle" if state=="washington" & county=="king" > & city_balance_total=="total" & race=="white" & births__total==10046 (1 real change made) . replace city_balance_total="seattle" if state=="washington" & county=="king" > & city_balance_total=="total" & race=="nonwhite" & births__total==1458 (1 real change made) . 
replace county="erie" if state=="new york" & county=="new york" & (city_bala > nce_total=="tonawanda" | city_balance_total=="balance of county") (2 real changes made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="antrim" if state=="michigan" & county=="antkim" (1 real change made) . replace county="adair" if state=="iowa" & county=="aoair" (1 real change made) . replace county="caldwell" if state=="north carolina" & county=="calowell" (5 real changes made) . replace county="chautauqua" if state=="new york" & county=="chautaudua" (4 real changes made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="de kalb" if state=="illinois" & county=="de kals" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . 
replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . 
replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="union" if state=="new jersey" & county=="new jersey" (20 real changes made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex, with philadelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city" & state=="california" (3 real changes made) . replace county="santa clara" if state=="california" & county=="santa clar" (12 real changes made) . replace county="wilkinson" if state=="georgia" & county=="wilkenson" (3 real changes made) . replace county="white" if state=="georgia" & county=="wheeler" & births__tot > al==190 (1 real change made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace race = "total" if state=="georgia" & county=="white" & city_balance_ > total=="total" (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . drop if state=="new jersey" & race!="total" // new jersey did not report > by race in 1962 or 1963 (86 observations deleted) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0 . assert temp==births__total . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births__total births . label var births "births by residence" . rename births_of_residents_of_area__att births_h_p . label var births_h_p "births by residence: physician in hospital" . rename births_of_residents_of_area__at0 births_nh_ns . label var births_nh_ns "births by residence: attendant not in hospital and n > ot specified" . . *generate year variable . gen year=1962 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1 +--------------------------------------------------------------------+ 1. | page_o~_ | state | county | sub_co~y | race | births | births~p | | 2-3 | alabama | total | total | total | 78514 | 64972 | |--------------------------------------------------------------------| | births~s | year | | 13542 | 1962 | +--------------------------------------------------------------------+ . desc Contains data from natality1962.dta obs: 9,236 vars: 9 size: 1,117,556 (89.3% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page_of_pdf_ str5 %5s page of pdf state str20 %20s state county str44 %44s county sub_county str26 %26s city/balance/total race str8 %8s race births long %12.0g births by residence births_h_p long %12.0g births by residence: physician in hospital births_nh_ns int %8.0g births by residence: attendant not in hospital and not specified year float %9.0g year ------------------------------------------------------------------------------ > - Sorted by: Note: dataset has changed since last saved . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- page_of_pdf_ | 0 state | 0 county | 0 sub_county | 0 race | 0 -------------+-------------------------------------------------------- births | 9236 2139.42 12187.65 0 378880 births_h_p | 9236 2079.387 11989.26 0 376168 births_nh_ns | 9236 60.03313 452.036 0 15196 year | 9236 1962 0 1962 1962 . saveold clean_natality1962.dta,replace file clean_natality1962.dta saved . clear . . ** . *1963 data . ** . *http://nber15.nber.org/vital-stats-books/nat63_1.cv.pdf . *table 2-1 . . *hand-checked excel file for ?'s (difficult-to-read characters) . *corrected below if any . . clear . use natality1963.dta . sum Variable | Obs Mean Std. Dev. 
Min Max -------------+-------------------------------------------------------- page__of_p~_ | 0 state | 0 county | 0 city_balan~l | 0 race | 0 -------------+-------------------------------------------------------- births_by_~o | 9216 2109.52 12067.33 0 381012 births_by_~t | 9213 2054.702 11883.66 2 377944 births_by_~0 | 6756 75.50918 486.6221 2 14736 . desc Contains data from natality1963.dta obs: 9,313 vars: 8 size: 1,098,934 (89.5% of memory free) ------------------------------------------------------------------------------ > - storage display value variable name type format label variable label ------------------------------------------------------------------------------ > - page__of_pdf_ str4 %4s state str20 %20s county str45 %45s city_balance_~l str27 %27s race str8 %8s births_by_pla~o long %12.0g births_by_pla~t long %12.0g births_by_pla~0 int %8.0g ------------------------------------------------------------------------------ > - Sorted by: . replace county=lower(county) (9313 real changes made) . replace state=lower(state) (9313 real changes made) . replace city_balance_total=lower(city_balance_total) (9313 real changes made) . replace race=lower(race) (9313 real changes made) . . *missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births_by_place_of_residence__to==. 97 . replace births_by_place_of_residence__to=0 if births_by_place_of_residence > __to ==. (97 real changes made) . count if births_by_place_of_residence__at==. 100 . replace births_by_place_of_residence__at =0 if births_by_place_of_residenc > e__at ==. (100 real changes made) . count if births_by_place_of_residence__a0==. 2557 . replace births_by_place_of_residence__a0 =0 if births_by_place_of_residenc > e__a0 ==. (2557 real changes made) . . *check that all pdf pages appear to be in the data . gen temp=strpos(page__of_pdf_,"-") . 
gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace births_by_place_of_residence__at=1148 if state=="alaska" & county==" > dist. 19, fairbanks" & city_balance_total=="balance of district" & race=="to > tal" (1 real change made) . replace births_by_place_of_residence__a0=90 if state=="texas" & county=="lam > ar" & city_balance_total=="paris" & race=="total" (1 real change made) . replace births_by_place_of_residence__a0=6 if state=="texas" & county=="lama > r" & city_balance_total=="paris" & race=="white" (1 real change made) . replace births_by_place_of_residence__a0=84 if state=="texas" & county=="lam > ar" & city_balance_total=="paris" & race=="nonwhite" (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davidson" if state=="tennessee" & county=="davidson, coex. w > ith nashville city" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . 
replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . 
replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philaoelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . replace county="fairfax (ind. city)" if county=="fairfax" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==476 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==178 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==38 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==140 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==4626 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==2090 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==2536 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==1934 (1 real change made) . . *correct data entry errors found while checking that white+nonwhite=total . drop if state=="new jersey" & race!="total" // new jersey did not report > by race in 1962 or 1963 (86 observations deleted) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington
> " & state=="south dakota"|county=="yellowstone national park (part)" & state
> =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county=
> ="yellowstone national park" & state=="idaho"|county=="yellowstone national
> park (part)" & state=="montana"|county=="yellowstone national park (part)" &
>  state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom
> ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello
> wstone national park" & state=="montana"|county=="park (excl yell nat park)"
>  & state=="montana"|county=="yellowstone national park, part" & state=="mont
> ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y
> ellowstone national park, part" & state=="montana"|county=="yellowstone nati
> onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count
> y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat.
>  park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s
> tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx"
> & state=="new york"|county=="kings" & state=="new york"|county=="new york" &
>  state=="new york"|county=="queens" & state=="new york"|county=="richmond" &
>  state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city"
> & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi
> nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st
> ate=="virginia"|state=="dc"
. restore
.
. *check for mistakes/misspellings in state names
. assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s
> tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"|
> state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa
> ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state=="
> kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland
> "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis
> sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"|
> state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y
> ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl
> ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s
> outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state
> =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west
>  virginia"|state=="wisconsin"|state=="wyoming"
.
. *data checks for columns summing to county total
. gen temp=births_by_place_of_residence__at+ births_by_place_of_residence__a0
. assert temp==births_by_place_of_residence__to
. drop temp
.
. *clean and label variables
. rename page__of_pdf_ page_of_pdf_
. label var page_of_pdf_ "page of pdf"
. label var state "state"
. label var county "county"
. rename city_balance_total sub_county
. label var sub_county "city/balance/total"
. label var race "race"
. rename births_by_place_of_residence__to births
. label var births "births by residence"
. rename births_by_place_of_residence__at births_h_p
. label var births_h_p "births by residence: physician in hospital"
. rename births_by_place_of_residence__a0 births_nh_ns
. label var births_nh_ns "births by residence: attendant not in hospital and n
> ot specified"
.
. *generate year variable
. gen year=1963
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births~p |
     |      2-3 | alabama |  total |    total | total |  76116 |    63870 |
     |--------------------------------------------------------------------|
     |                         births~s |                            year |
     |                            12246 |                            1963 |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1963.dta
  obs:         9,227
 vars:             9
 size:     1,125,694 (89.3% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                in hospital
births_nh_ns    int    %8.0g                  births by residence: attendant
                                                not in hospital and not
                                                specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9227    2107.005    12060.36          0     381012
  births_h_p |      9227    2051.708    11874.89          0     377944
births_nh_ns |      9227    55.29685    417.7278          0      14736
        year |      9227        1963           0       1963       1963

. saveold clean_natality1963.dta,replace
file clean_natality1963.dta saved
. clear
.
. **
. *1964 data
. **
. *http://nber15.nber.org/vital-stats-books/nat64_1.cv.pdf
. *table 2-1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1964.dta
. sum

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births__to~l |      9299    2087.504    11877.84          2     374972
births_of_~t |      9296    2035.938    11696.74          2     371800
births_of_~0 |      6708    72.39833    458.7921          2      14066

. desc

Contains data from natality1964.dta
  obs:         9,313
 vars:             8
 size:     1,098,934 (89.5% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str20  %20s
county          str45  %45s
city_balance_~l str27  %27s
race            str8   %8s
births__total   long   %12.0g
births_of_res~t long   %12.0g
births_of_res~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9313 real changes made)
. replace state=lower(state)
(9313 real changes made)
. replace city_balance_total=lower(city_balance_total)
(9313 real changes made)
. replace race=lower(race)
(9313 real changes made)
.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births__total==.
   14
. replace births__total=0 if births__total ==.
(14 real changes made)
. count if births_of_residents_of_area__att==.
   17
. replace births_of_residents_of_area__att =0 if births_of_residents_of_area
> __att ==.
(17 real changes made)
. count if births_of_residents_of_area__at0==.
 2605
. replace births_of_residents_of_area__at0 =0 if births_of_residents_of_area
> __at0 ==.
(2605 real changes made)
.
. *check that all pdf pages appear to be in the data
. gen temp=strpos(page__of_pdf_, "-")
. gen newpagenumber=substr(page__of_pdf_,temp+1,.)
. destring newpagenumber, replace
newpagenumber has all characters numeric; replaced as byte
. drop temp
. sort newpagenumber
. gen temp=newpagenumber[_n]-newpagenumber[_n-1]
(1 missing value generated)
. assert temp==0|temp==1|temp==.
. drop temp newpagenumber
.
. *clean data entry errors
. replace births_of_residents_of_area__att=518 if state=="new hampshire" & cou
> nty=="strafford" & city_balance_total=="balance" & race=="total"
(1 real change made)
. replace births_of_residents_of_area__at0=2 if state=="new hampshire" & count
> y=="strafford" & city_balance_total=="balance" & race=="total"
(1 real change made)
. replace county="oakland" if state=="michigan" & county=="oceana" & city_bala
> nce_total=="balance" & births__total==6286
(1 real change made)
. replace county="ottawa" if state=="michigan" & county=="presque isle" & city
> _balance_total=="balance" & births__total==1816
(1 real change made)
. replace city_balance_total="coffeyville" if state=="kansas" & county=="montg
> omery" & city_balance_total=="total" & race=="white" & births__total=
> =214
(1 real change made)
. replace city_balance_total="coffeyville" if state=="kansas" & county=="montg
> omery" & city_balance_total=="total" & race=="nonwhite" & births__total==38
(1 real change made)
. replace city_balance_total="balance" if state=="montana" & county=="hill" &
> city_balance_total=="total" & race=="white" & births__total==116
(1 real change made)
. replace city_balance_total="balance" if state=="montana" & county=="hill" &
> city_balance_total=="total" & race=="nonwhite" & births__total==74
(1 real change made)
.
. *fix county name misspellings (relative to county names in the 1970 census)
. replace county=subinstr(county,"st.","st",.)
(206 real changes made)
. replace county=subinstr(county,"ste.","st",.)
(1 real change made)
. replace county="cochrane" if county=="cochran" & state=="texas"
(1 real change made)
.
replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . 
replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . replace county="floyd" if state=="texas" & county=="floye" (1 real change made) . replace county="jackson" if state=="south dakota" & county=="jackson (+81 in > cludes washabaugh)" (1 real change made) . replace state="virginia" if state=="independent cities" (78 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donal d" (1 real change made) . replace city_balance_total="bridgeport" if state=="connecticut" & county=="f > airfield" & births__total==3078 (1 real change made) . replace city_balance_total="bridgeport" if state=="connecticut" & county=="f > airfield" & births__total==752 (1 real change made) . replace city_balance_total="hartford" if state=="connecticut" & county=="har > tford" & births__total==2812 (1 real change made) . replace city_balance_total="hartford" if state=="connecticut" & county=="har > tford" & births__total==1252 (1 real change made) . replace city_balance_total="new haven" if state=="connecticut" & county=="ne > w haven" & births__total==2186 (1 real change made) . replace city_balance_total="new haven" if state=="connecticut" & county=="ne > w haven" & births__total==1076 (1 real change made) . replace city_balance_total="gainesville" if state=="florida" & county=="alac > hua" & births__total==1016 (1 real change made) . 
replace city_balance_total="gainesville" if state=="florida" & county=="alac > hua" & births__total==366 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="alachua" > & births__total==482 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="alachua" > & births__total==308 (1 real change made) . replace city_balance_total="panama city" if state=="florida" & county=="bay" > & births__total==580 (1 real change made) . replace city_balance_total="panama city" if state=="florida" & county=="bay" > & births__total==252 (1 real change made) . replace city_balance_total="cocoa" if state=="florida" & county=="brevard" & > births__total==402 (1 real change made) . replace city_balance_total="cocoa" if state=="florida" & county=="brevard" & > births__total==186 (1 real change made) . replace city_balance_total="melbourne" if state=="florida" & county=="brevar > d" & births__total==256 (1 real change made) . replace city_balance_total="melbourne" if state=="florida" & county=="brevar > d" & births__total==136 (1 real change made) . replace city_balance_total="fort lauderdale" if state=="florida" & county==" > broward" & births__total==1392 (1 real change made) . replace city_balance_total="fort lauderdale" if state=="florida" & county==" > broward" & births__total==956 (1 real change made) . replace city_balance_total="hallandale" if state=="florida" & county=="browa > rd" & births__total==102 (1 real change made) . replace city_balance_total="hallandale" if state=="florida" & county=="browa > rd" & births__total==182 (1 real change made) . replace city_balance_total="pompano beach" if state=="florida" & county=="br > oward" & births__total==342 (1 real change made) . replace city_balance_total="pompano beach" if state=="florida" & county=="br > oward" & births__total==460 (1 real change made) . 
replace city_balance_total="balance" if state=="florida" & county=="broward" > & births__total==2342 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="broward" > & births__total==696 (1 real change made) . replace city_balance_total="miami" if state=="florida" & county=="dade" & bi > rths__total==3308 (1 real change made) . replace city_balance_total="miami" if state=="florida" & county=="dade" & bi > rths__total==2426 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="dade" & > births__total==7706 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="dade" & > births__total==2590 (1 real change made) . replace city_balance_total="jacksonville" if state=="florida" & county=="duv > al" & births__total==2284 (1 real change made) . replace city_balance_total="jacksonville" if state=="florida" & county=="duv > al" & births__total==2328 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="marion" > & births__total==452 (1 real change made) . replace city_balance_total="balance" if state=="florida" & county=="marion" > & births__total==428 (1 real change made) . replace city_balance_total="wilmington" if state=="delaware" & county=="new > castle" & births__total==1564 (1 real change made) . replace city_balance_total="wilmington" if state=="delaware" & county=="new > castle" & births__total==970 (1 real change made) . replace city_balance_total="balance" if state=="delaware" & county=="new cas > tle" & births__total==4342 (1 real change made) . replace city_balance_total="balance" if state=="delaware" & county=="new cas > tle" & births__total==346 (1 real change made) . replace city_balance_total="balance" if state=="georgia" & county=="clarke" > & births__total==228 (1 real change made) . replace city_balance_total="balance" if state=="georgia" & county=="clarke" > & births__total==82 (1 real change made) . 
replace city_balance_total="balance" if state=="georgia" & county=="colquitt > " & births__total==110 (1 real change made) . replace city_balance_total="balance" if state=="georgia" & county=="glynn" & > births__total==574 (1 real change made) . replace city_balance_total="balance" if state=="georgia" & county=="glynn" & > births__total==86 (1 real change made) . replace city_balance_total="junction city" if state=="kansas" & county=="gea > ry" & births__total==612 (1 real change made) . replace city_balance_total="junction city" if state=="kansas" & county=="gea > ry" & births__total==118 (1 real change made) . replace city_balance_total="balance" if state=="georgia" & county=="colquitt > " & city_balance_total=="total" & births__total==232 (1 real change made) . replace county="montgomery" if state=="maryland" & county=="prince georges" > & births__total==326 (1 real change made) . replace county="otsego" if state=="michigan" & county=="ottawa" & births__to > tal==180 (1 real change made) . replace county="roscommon" if state=="michigan" & county=="saginaw" & births > __total==120 (1 real change made) . replace city_balance_total="ecorse" if state=="michigan" & county=="wayne" & > births__total==224 (1 real change made) . replace city_balance_total="highland park" if state=="michigan" & county=="w > ayne" & births__total==356 (1 real change made) . replace county="oscoda" if state=="michigan" & county=="otsego" & births__to > tal==70 (1 real change made) . replace county="presque isle" if state=="michigan" & county=="roscommon" & b > irths__total==258 (1 real change made) . replace county="osceola" if state=="michigan" & county=="oscoda" & births__t > otal==306 (1 real change made) . replace city_balance_total="ecorse" if state=="michigan" & county=="wayne" & > city_balance_total=="total" & race=="nonwhite" & births__total==172 (1 real change made) . 
replace city_balance_total="highland park" if state=="michigan" & county=="w > ayne" & births__total==292 (1 real change made) . replace county="ontonagon" if state=="michigan" & county=="osceola" & births > __total==206 (1 real change made) . replace county="ogemaw" if state=="michigan" & county=="ontonagon" & births_ > _total==176 (1 real change made) . replace county="oceana" if state=="michigan" & county=="ogemaw" & births__to > tal==314 (1 real change made) . replace city_balance_total="balance" if state=="missouri" & county=="st loui > s" & births__total==9206 (1 real change made) . replace city_balance_total="balance" if state=="missouri" & county=="st loui > s" & births__total==424 (1 real change made) . replace county="yellowstone national park, part" if state=="montana" & count > y=="yellowstone" & births__total==0 (1 real change made) . replace county="park" if state=="montana" & county=="park (excl yell nat par > k)" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="madison" & births > __total==10 (1 real change made) . replace city_balance_total="atlantic city" if state=="new jersey" & county== > "atlantic" & births__total==514 (1 real change made) . replace city_balance_total="atlantic city" if state=="new jersey" & county== > "atlantic" & births__total==556 (1 real change made) . replace city_balance_total="pleasantville" if state=="new jersey" & county== > "atlantic" & births__total==242 (1 real change made) . replace city_balance_total="pleasantville" if state=="new jersey" & county== > "atlantic" & births__total==86 (1 real change made) . replace city_balance_total="balance" if state=="new jersey" & county=="cumbe > rland" & births__total==452 (1 real change made) . replace city_balance_total="balance" if state=="new jersey" & county=="cumbe > rland" & births__total==226 (1 real change made) . 
replace city_balance_total="paterson" if state=="new jersey" & county=="pass > aic" & births__total==2384 (1 real change made) . replace city_balance_total="paterson" if state=="new jersey" & county=="pass > aic" & births__total==1226 (1 real change made) . replace city_balance_total="balance" if state=="north carolina" & county=="c > raven" & births__total==1136 (1 real change made) . replace city_balance_total="balance" if state=="north carolina" & county=="c > raven" & births__total==328 (1 real change made) . replace county="mahoning" if state=="ohio" & county=="madison" & births__tot > al==5096 (1 real change made) . replace city_balance_total="balance" if state=="ohio" & county=="mahoning" & > births__total==2006 (1 real change made) . replace city_balance_total="balance" if state=="ohio" & county=="mahoning" & > births__total==4 (1 real change made) . replace city_balance_total="warren" if state=="ohio" & county=="trumbull" & > births__total==1102 (1 real change made) . replace city_balance_total="warren" if state=="ohio" & county=="trumbull" & > births__total==202 (1 real change made) . replace city_balance_total="ardmore" if state=="oklahoma" & county=="carter" > & births__total==322 (1 real change made) . replace city_balance_total="ardmore" if state=="oklahoma" & county=="carter" > & births__total==102 (1 real change made) . replace city_balance_total="lawton" if state=="oklahoma" & county=="comanche > " & births__total==2144 (1 real change made) . replace city_balance_total="lawton" if state=="oklahoma" & county=="comanche > " & births__total==396 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="comanch > e" & births__total==270 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="comanch > e" & births__total==86 (1 real change made) . replace city_balance_total="sapulpa" if state=="oklahoma" & county=="creek" > & births__total==220 (1 real change made) . 
replace city_balance_total="sapulpa" if state=="oklahoma" & county=="creek" > & births__total==46 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="muskoge > e" & births__total==292 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="muskoge > e" & births__total==86 (1 real change made) . replace city_balance_total="oklahoma city, total" if state=="oklahoma" & cou > nty=="oklahoma" & births__total==7054 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="oklahom > a" & births__total==642 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="oklahom > a" & births__total==176 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="seminol > e" & births__total==194 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="seminol > e" & births__total==116 (1 real change made) . replace city_balance_total="tulsa, total" if state=="oklahoma" & county=="tu > lsa" & births__total==4720 (1 real change made) . replace city_balance_total="tulsa, total" if state=="oklahoma" & county=="tu > lsa" & births__total==910 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="tulsa" > & births__total==1350 (1 real change made) . replace city_balance_total="balance" if state=="oklahoma" & county=="tulsa" > & births__total==80 (1 real change made) . replace city_balance_total="braddock" if state=="pennsylvania" & county=="al > legheny" & births__total==98 & city_balance_total=="total" (1 real change made) . replace city_balance_total="braddock" if state=="pennsylvania" & county=="al > legheny" & births__total==88 (1 real change made) . replace city_balance_total="shelbyville" if state=="tennessee" & county=="be > dford" & births__total==232 (1 real change made) . 
replace city_balance_total="shelbyville" if state=="tennessee" & county=="be > dford" & births__total==56 (1 real change made) . replace city_balance_total="chattanooga" if state=="tennessee" & county=="ha > milton" & births__total==1604 (1 real change made) . replace city_balance_total="chattanooga" if state=="tennessee" & county=="ha > milton" & births__total==1116 (1 real change made) . replace city_balance_total="jackson" if state=="tennessee" & county=="madiso > n" & births__total==734 (1 real change made) . replace city_balance_total="jackson" if state=="tennessee" & county=="madiso > n" & births__total==414 (1 real change made) . replace city_balance_total="columbia" if state=="tennessee" & county=="maury > " & births__total==348 (1 real change made) . replace city_balance_total="columbia" if state=="tennessee" & county=="maury > " & births__total==130 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="maury" > & births__total==422 (1 real change made) . replace city_balance_total="oklahoma city, total" if state=="oklahoma" & cou > nty=="oklahoma" & births__total==1602 (1 real change made) . replace county="marion" if state=="ohio" & county=="mahoning" & births__tota > l==1308 (1 real change made) . replace county="medina" if state=="ohio" & county=="marion" & births__total= > =1420 (1 real change made) . replace race="white" if state=="pennsylvania" & county=="allegheny" & city_b > alance_total=="braddock" & race=="nonwhite" & births__total==98 (0 real changes made) . replace race="white" if state=="tennessee" & county=="madison" & city_balanc > e_total=="jackson" & race=="total" & births__total==414 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="madiso > n" & births__total==298 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="madiso > n" & births__total==304 (1 real change made) . 
replace race="white" if state=="tennessee" & county=="maury" & births__total > ==684 & births_of_residents_of_area__att==670 (1 real change made) . replace race="nonwhite" if state=="tennessee" & county=="maury" & births__to > tal==216 & births_of_residents_of_area__att==178 (1 real change made) . replace city_balance_total="columbia" if state=="tennessee" & county=="maury > " & births__total==478 & births_of_residents_of_area__att==452 (1 real change made) . replace race="total" if state=="tennessee" & county=="maury" & births__total > ==478 & births_of_residents_of_area__att==452 (1 real change made) . replace city_balance_total="columbia" if state=="tennessee" & county=="maury > " & births__total==348 & births_of_residents_of_area__att==346 (0 real changes made) . replace race="white" if state=="tennessee" & county=="maury" & births__total > ==348 & births_of_residents_of_area__att==346 (1 real change made) . replace city_balance_total="columbia" if state=="tennessee" & county=="maury > " & births__total==130 & births_of_residents_of_area__att==106 (0 real changes made) . replace race="nonwhite" if state=="tennessee" & county=="maury" & births__to > tal==130 & births_of_residents_of_area__att==106 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="maury" > & city_balance_total=="total" & race=="nonwhite" & births__total==422 & bir > ths_of_residents_of_area__att==396 (0 real changes made) . replace race="total" if state=="tennessee" & county=="maury" & city_balance_ > total=="balance" & births__total==422 & births_of_residents_of_area__att==39 > 6 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="maury" > & city_balance_total=="columbia" & race=="total" & births__total==336 & bir > ths_of_residents_of_area__att==324 (1 real change made) . 
replace race="white" if state=="tennessee" & county=="maury" & city_balance_ > total=="balance" & births__total==336 & births_of_residents_of_area__att==32 > 4 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="maury" > & city_balance_total=="columbia" & race=="white" & births__total==86 & birt > hs_of_residents_of_area__att==72 (1 real change made) . replace race="nonwhite" if state=="tennessee" & county=="maury" & city_balan > ce_total=="balance" & births__total==86 & births_of_residents_of_area__att== > 72 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="ruther > ford" & births__total==804 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="ruther > ford" & births__total==126 (1 real change made) . replace city_balance_total="waxahachie" if state=="texas" & county=="ellis" > & births__total==174 (1 real change made) . replace city_balance_total="waxahachie" if state=="texas" & county=="ellis" > & births__total==44 (1 real change made) . replace city_balance_total="balance" if state=="texas" & county=="ellis" & b > irths__total==420 (1 real change made) . replace city_balance_total="balance" if state=="texas" & county=="ellis" & b > irths__total==204 (1 real change made) . replace city_balance_total="balance" if state=="virginia" & county=="fairfax > " & births__total==308 (1 real change made) . replace city_balance_total="seattle" if state=="washington" & county=="king" > & births__total==8060 (1 real change made) . replace city_balance_total="seattle" if state=="washington" & county=="king" > & births__total==1370 (1 real change made) . replace city_balance_total="bluefield" if state=="west virginia" & county==" > mercer" & births__total==338 (1 real change made) . replace city_balance_total="bluefield" if state=="west virginia" & county==" > mercer" & births__total==80 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births__total==4834 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births__total==2276 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births__total==2558 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births__total==1794 (1 real change made) . replace race="white" if state=="tennessee" & county=="madison" & city_balanc > e_total=="balance" & race=="total" & births__total==298 (1 real change made) . replace births__total=684 if state=="tennessee" & county=="maury" & city_bal > ance_total=="total" & race=="white" (0 real changes made) . replace births_of_residents_of_area__att=670 if state=="tennessee" & county= > ="maury" & city_balance_total=="total" & race=="white" (0 real changes made) . replace births_of_residents_of_area__at0=14 if state=="tennessee" & county== > "maury" & city_balance_total=="total" & race=="white" (0 real changes made) . replace city_balance_total="balance of county" if state=="virginia" & county > =="fairfax" & births__total==6454 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births__total==164 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="white" & births__total==54 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="nonwhite" & births__total==110 (1 real change made) . replace county="meigs" if state=="ohio" & county=="medina" & births__total== > 344 (1 real change made) . replace county="mercer" if state=="ohio" & county=="meigs" & births__total== > 780 (1 real change made) . 
replace county="miami" if state=="ohio" & county=="mercer" & births__total== > 1560 (1 real change made) . replace county="monroe" if state=="ohio" & county=="miami" & births__total== > 324 (1 real change made) . replace county="montgomery" if state=="ohio" & county=="monroe" & births__to > tal==11942 (1 real change made) . drop if state=="virginia" & county=="fairfax" & births__total==370 (1 observation deleted) . replace county="morgan" if state=="ohio" & county=="montgomery" & births__to > tal==228 (1 real change made) . replace race="nonwhite" if state=="tennessee" & county=="madison" & births__ > total==304 (1 real change made) . replace city_balance_total="balance" if state=="virginia" & county=="fairfax > " & births__total==6454 (1 real change made) . replace county="morrow" if state=="ohio" & county=="morgan" & births__total= > =390 (1 real change made) . replace county="muskingum" if state=="ohio" & county=="morrow" & births__tot > al==1664 (1 real change made) . replace county="noble" if state=="ohio" & county=="muskingum" & births__tota > l==174 (1 real change made) . replace county="ottawa" if state=="ohio" & county=="noble" & births__total== > 666 (1 real change made) . replace county="paulding" if state=="ohio" & county=="ottawa" & births__tota > l==378 (1 real change made) . replace county="perry" if state=="ohio" & county=="paulding" & births__total > ==546 (1 real change made) . replace county="pickaway" if state=="ohio" & county=="perry" & births__total > ==798 (1 real change made) . replace county="pike" if state=="ohio" & county=="pickaway" & births__total= > =404 (1 real change made) . replace county="portage" if state=="ohio" & county=="pike" & births__total== > 2100 (1 real change made) . replace county="preble" if state=="ohio" & county=="portage" & births__total > ==660 (1 real change made) . replace county="putnam" if state=="ohio" & county=="preble" & births__total= > =766 (1 real change made) . 
replace county="richland" if state=="ohio" & county=="putnam" & births__tota > l==2636 (1 real change made) . replace county="ross" if state=="ohio" & county=="richland" & births__total= > =1332 (1 real change made) . replace county="sandusky" if state=="ohio" & county=="ross" & births__total= > =1240 (1 real change made) . replace county="scioto" if state=="ohio" & county=="sandusky" & births__tota > l==1618 (1 real change made) . replace county="seneca" if state=="ohio" & county=="scioto" & births__total= > =1362 (1 real change made) . replace county="shelby" if state=="ohio" & county=="seneca" & births__total= > =810 (1 real change made) . replace county="stark" if state=="ohio" & county=="shelby" & births__total== > 6844 (1 real change made) . replace county="summit" if state=="ohio" & county=="stark" & births__total== > 10890 (1 real change made) . . *correct data entry errors found while checking that county totals sum to st > ate totals . replace race="total" if state=="tennessee" & county=="mcminn" (2 real changes made) . replace city_balance_total="total" if state=="tennessee" & county=="mcminn" > & births__total==712 (1 real change made) . replace city_balance_total="athens" if state=="tennessee" & county=="mcminn" > & births__total==250 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="mcminn > " & births__total==462 (0 real changes made) . replace race="total" if state=="tennessee" & county=="mcnairy" (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="mcnairy" (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="madison" > & births__total==1336 (1 real change made) . replace race="white" if state=="tennessee" & county=="madison" & births__tot > al==712 (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="madison" > & births__total==712 (1 real change made) . 
replace race="nonwhite" if state=="tennessee" & county=="madison" & births__ > total==624 (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="madison" > & births__total==624 (0 real changes made) . replace race="nonwhite" if state=="tennessee" & county=="madison" & births__ > total==320 (1 real change made) . replace city_balance_total="jackson" if state=="tennessee" & county=="madiso > n" & births__total==320 (1 real change made) . replace race="total" if state=="tennessee" & county=="madison" & births__tot > al==602 (1 real change made) . replace city_balance_total="balance" if state=="tennessee" & county=="madiso > n" & births__total==602 (1 real change made) . replace race="white" if state=="tennessee" & county=="madison" & births__tot > al==298 (0 real changes made) . replace race="nonwhite" if state=="tennessee" & county=="madison" & births__ > total==304 (0 real changes made) . replace race="total" if state=="tennessee" & county=="marion" (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="marion" (1 real change made) . replace city_balance_total="total" if state=="tennessee" & county=="marshall > " (3 real changes made) . . . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s
> tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"|
> state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa
> ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state=="
> kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland
> "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis
> sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"|
> state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y
> ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl
> ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s
> outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state
> =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west
>  virginia"|state=="wisconsin"|state=="wyoming"
.
. *data checks for columns summing to county total
. gen temp=births_of_residents_of_area__att+ births_of_residents_of_area__at0
. assert temp==births__total
. drop temp
.
. *clean and label variables
. rename page__of_pdf_ page_of_pdf_
. label var page_of_pdf_ "page of pdf"
. label var state "state"
. label var county "county"
. rename city_balance_total sub_county
. label var sub_county "city/balance/total"
. label var race "race"
. rename births__total births
. label var births "births by residence"
. rename births_of_residents_of_area__att births_h_p
. label var births_h_p "births by residence: physician in hospital"
. rename births_of_residents_of_area__at0 births_nh_ns
. label var births_nh_ns "births by residence: attendant not in hospital and n
> ot specified"
.
. *generate year variable
. gen year=1964
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
.
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births~p |
     |      2-3 | alabama |  total |    total | total |  76316 |    65092 |
     |--------------------------------------------------------------------|
     |              births~s           |               year               |
     |                 11224           |               1964               |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1964.dta
  obs:         9,312
 vars:             9
 size:     1,136,064 (89.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: physician
                                                in hospital
births_nh_ns    int    %8.0g                  births by residence: attendant
                                                not in hospital and not
                                                specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9312     2084.55    11869.81          0     374972
  births_h_p |      9312    2032.397       11687          0     371800
births_nh_ns |      9312    52.15314    390.7409          0      14066
        year |      9312        1964           0       1964       1964

. saveold clean_natality1964.dta,replace
file clean_natality1964.dta saved

. clear
.
. **
. *1965 data
. **
. *http://nber15.nber.org/vital-stats-books/nat65_1.cv.pdf
. *table 2-1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1965.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births_by_~o |      9297    1952.324    11152.79          2     355592
births_by_~t |      9297    1901.296     10971.5          2     352168
births_by_~0 |      6653    71.30618    428.4368          2      12262

. desc

Contains data from natality1965.dta
  obs:         9,313
 vars:             8
 size:     1,098,934 (89.5% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str20  %20s
county          str45  %45s
city_balance_~l str27  %27s
race            str8   %8s
births_by_pla~o long   %12.0g
births_by_pla~t long   %12.0g
births_by_pla~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9313 real changes made)

. replace state=lower(state)
(9313 real changes made)

. replace city_balance_total=lower(city_balance_total)
(9313 real changes made)

. replace race=lower(race)
(9313 real changes made)

.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births_by_place_of_residence__to==.
   16

. replace births_by_place_of_residence__to=0 if births_by_place_of_residence__
> to ==.
(16 real changes made)

. count if births_by_place_of_residence__at==.
   16

. replace births_by_place_of_residence__at =0 if births_by_place_of_residence_
> _at ==.
(16 real changes made)

. count if births_by_place_of_residence__a0==.
 2660

. replace births_by_place_of_residence__a0 =0 if births_by_place_of_residence_
> _a0 ==.
(2660 real changes made)

.
. *check that all pdf pages appear to be in the data
. gen temp=strpos(page__of_pdf_,"-")
.
gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace county="fairfax (ind. city)" if county=="fairfax" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==376 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==148 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==66 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==82 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==4218 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==1860 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==2358 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==1680 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davidson" if county=="davidson, coex. 
with nashville city" & > state=="tennessee" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . 
replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philaoelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . 
merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s
> tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"|
> state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa
> ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state=="
> kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland
> "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis
> sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"|
> state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y
> ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl
> ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s
> outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state
> =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west
>  virginia"|state=="wisconsin"|state=="wyoming"
.
. *data checks for columns summing to county total
. gen temp=births_by_place_of_residence__at+births_by_place_of_residence__a0
. assert temp==births_by_place_of_residence__to
. drop temp
.
. *clean and label variables
. rename page__of_pdf_ page_of_pdf_
. label var page_of_pdf_ "page of pdf"
. label var state "state"
. label var county "county"
. rename city_balance_total sub_county
. label var sub_county "city/balance/total"
. label var race "race"
. rename births_by_place_of_residence__to births
. label var births "births by residence"
. rename births_by_place_of_residence__at births_h
. label var births_h "births by residence: in hospital"
. rename births_by_place_of_residence__a0 births_nh_ns
. label var births_nh_ns "births by residence: not in hospital and not specifi
> ed"
.
. *generate year variable
. gen year=1965
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
.
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births_h |
     |      2-3 | alabama |  total |    total | total |  70642 |    61054 |
     |--------------------------------------------------------------------|
     |              births~s           |               year               |
     |                  9588           |               1965               |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1965.dta
  obs:         9,313
 vars:             9
 size:     1,136,186 (89.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h        long   %12.0g                 births by residence: in hospital
births_nh_ns    int    %8.0g                  births by residence: not in
                                                hospital and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9313    1948.969     11143.5          0     355592
    births_h |      9313     1898.03    10962.35          0     352168
births_nh_ns |      9313    50.93955    363.5406          0      12262
        year |      9313        1965           0       1965       1965

. saveold clean_natality1965.dta,replace
file clean_natality1965.dta saved

. clear
.
. **
. *1966 data
. **
. *http://nber15.nber.org/vital-stats-books/nat66_1.cv.pdf
. *table 2-1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1966.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births_by_~o |      9294    1873.512    10707.46          2     338184
births_by_~t |      9293    1837.227    10587.27          2     336816
births_by_~0 |      5959    56.89982    360.2774          2      10280

. desc

Contains data from natality1966.dta
  obs:         9,313
 vars:             8
 size:     1,098,934 (89.5% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str20  %20s
county          str45  %45s
city_balance_~l str27  %27s
race            str8   %8s
births_by_pla~o long   %12.0g
births_by_pla~t long   %12.0g
births_by_pla~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9313 real changes made)

. replace state=lower(state)
(9313 real changes made)

. replace city_balance_total=lower(city_balance_total)
(9313 real changes made)

. replace race=lower(race)
(9313 real changes made)

.
. *missing observations in the original data are zeros
. *replace these so we can distinguish these from "true" missing values
. *which would be generated below due to difficult-to-read characters
. count if births_by_place_of_residence__to==.
   19

. replace births_by_place_of_residence__to=0 if births_by_place_of_residence
> __to ==.
(19 real changes made)

. count if births_by_place_of_residence__at==.
   20

. replace births_by_place_of_residence__at =0 if births_by_place_of_residenc
> e__at ==.
(20 real changes made)

. count if births_by_place_of_residence__a0==.
 3354

. replace births_by_place_of_residence__a0 =0 if births_by_place_of_residenc
> e__a0 ==.
(3354 real changes made)

.
. *check that all pdf pages appear to be in the data
. gen temp=strpos(page__of_pdf_,"-")
.
gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace births_by_place_of_residence__at=2 if state=="montana" & county=="y > ellowstone national park, part" (1 real change made) . replace county="fairfax (ind. city)" if county=="fairfax" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==464 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==124 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==32 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==92 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==4070 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==1818 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==2252 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==1682 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . 
replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davidson" if county=="davidson, coex. with nashville city" & > state=="tennessee" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . 
replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philaoelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . 
replace city_balance_total="balance of county" if state=="wisconsin" & count > y=="jefferson" & births_by_place_of_residence__to==780 (1 real change made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. 
park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_by_place_of_residence__at+births_by_place_of_residence__a0 . assert temp==births_by_place_of_residence__to . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . 
label var sub_county "city/balance/total"
. label var race "race"
. rename births_by_place_of_residence__to births
. label var births "births by residence"
. rename births_by_place_of_residence__at births_h_p
. label var births_h_p "births by residence: in hospital"
. rename births_by_place_of_residence__a0 births_nh_ns
. label var births_nh_ns "births by residence: not in hospital and not specifi
> ed"
.
. *generate year variable
. gen year=1966
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births~p |
     |      2-3 | alabama |  total |    total | total |  66498 |    58566 |
     |--------------------------------------------------------------------|
     | births~s | year                                                    |
     |     7932 | 1966                                                    |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1966.dta
  obs:         9,313
 vars:             9
 size:     1,136,186 (89.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h_p      long   %12.0g                 births by residence: in hospital
births_nh_ns    int    %8.0g                  births by residence: not in hospital and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9313    1869.689    10696.87          0     338184
  births_h_p |      9313    1833.281    10576.24          0     336816
births_nh_ns |      9313    36.40782     289.473          0      10280
        year |      9313        1966           0       1966       1966

. saveold clean_natality1966.dta,replace
file clean_natality1966.dta saved
. clear
.
. **
. *1967 data
. **
. *http://nber15.nber.org/vital-stats-books/nat67_1.cv.pdf
. *table 2-1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1967.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births_by_~o |      9295    1831.259     10457.9          2     337159
births_by_~t |      9293    1800.516    10344.41          2     335090
births_by_~0 |      5209    55.66558    337.1709          2       9578

. desc

Contains data from natality1967.dta
  obs:         9,313
 vars:             8
 size:     1,098,934 (89.5% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str20  %20s
county          str45  %45s
city_balance_~l str27  %27s
race            str8   %8s
births_by_pla~o long   %12.0g
births_by_pla~t long   %12.0g
births_by_pla~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9313 real changes made)
. replace state=lower(state)
(9313 real changes made)
. replace city_balance_total=lower(city_balance_total)
(9313 real changes made)
. replace race=lower(race)
(9313 real changes made)
.
. *missing observations in the original data are zeros
.
*replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births_by_place_of_residence__to==. 18 . replace births_by_place_of_residence__to=0 if births_by_place_of_residence__ > to ==. (18 real changes made) . count if births_by_place_of_residence__at==. 20 . replace births_by_place_of_residence__at =0 if births_by_place_of_residence_ > _at ==. (20 real changes made) . count if births_by_place_of_residence__a0==. 4104 . replace births_by_place_of_residence__a0 =0 if births_by_place_of_residence_ > _a0 ==. (4104 real changes made) . . *check that all pdf pages appear to be in the data . gen temp=strpos(page__of_pdf_,"-") . gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace county="fairfax (ind. city)" if county=="fairfax" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==462 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==139 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==71 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==68 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==3923 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==1761 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==2162 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==1630 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davidson" if county=="davidson, coex. with nashville city" & > state=="tennessee" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . 
replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . 
replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philaoelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s
> tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"|
> state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa
> ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state=="
> kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland
> "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis
> sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"|
> state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y
> ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl
> ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s
> outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state
> =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west
>  virginia"|state=="wisconsin"|state=="wyoming"
.
. *data checks for columns summing to county total
. gen temp=births_by_place_of_residence__at+births_by_place_of_residence__a0
. list if temp!=births_by_place_of_residence__to

      +----------------------------------------------------------------+
3634. | page__~_ |     state | county | city_b~l |     race | births~o |
      |     2-25 | louisiana | tensas |    total | nonwhite |      163 |
      |----------------------+-----------------------------------------|
      | births~t |  births~0 | temp                                    |
      |      102 |       661 |  763                                    |
      +----------------------------------------------------------------+

. *checked .pdf, this is a data error not a data entry error
. *looks like it should be births_by_place_of_residence__a0=61 instead
> of births_by_place_of_residence__a0=661; changing this
. replace births_by_place_of_residence__a0=61 if births_by_place_of_residence_
> _a0==661 & state=="louisiana" & county=="tensas" & city_balance_total=="tota
> l" & race=="nonwhite"
(1 real change made)
. drop temp
.
. *clean and label variables
.
rename page__of_pdf_ page_of_pdf_
. label var page_of_pdf_ "page of pdf"
. label var state "state"
. label var county "county"
. rename city_balance_total sub_county
. label var sub_county "city/balance/total"
. label var race "race"
. rename births_by_place_of_residence__to births
. label var births "births by residence"
. rename births_by_place_of_residence__at births_h
. label var births_h "births by residence: in hospital"
. rename births_by_place_of_residence__a0 births_nh_ns
. label var births_nh_ns "births by residence: not in hospital and not specifi
> ed"
.
. *generate year variable
. gen year=1967
. label var year "year"
.
. *check that observations are unique
. egen tag=tag(state county sub_county race)
. assert tag==1
. drop tag
.
. list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births_h |
     |      2-3 | alabama |  total |    total | total |  64659 |    57495 |
     |--------------------------------------------------------------------|
     | births~s | year                                                    |
     |     7164 | 1967                                                    |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1967.dta
  obs:         9,313
 vars:             9
 size:     1,136,186 (89.2% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str8   %8s                    race
births          long   %12.0g                 births by residence
births_h        long   %12.0g                 births by residence: in hospital
births_nh_ns    int    %8.0g                  births by residence: not in hospital and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9313     1827.72     10448.1          0     337159
    births_h |      9313    1796.649    10333.63          0     335090
births_nh_ns |      9313    31.07076    253.5793          0       9578
        year |      9313        1967           0       1967       1967

. saveold clean_natality1967.dta,replace
file clean_natality1967.dta saved
. clear
.
. **
. *1968 data
. **
. *http://nber15.nber.org/vital-stats-books/vsus_1968_1.cv.pdf
. *table 2-1
.
. *hand-checked excel file for ?'s (difficult-to-read characters)
. *corrected below if any
.
. clear
. use natality1968.dta
. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page__of_p~_ |         0
       state |         0
      county |         0
city_balan~l |         0
        race |         0
-------------+--------------------------------------------------------
births_by_~o |      9295    1820.649    10394.16          2     339760
births_by_~t |      9294    1794.257    10290.09          2     337368
births_by_~0 |      5543    44.52859    276.3973          2       8344

. desc

Contains data from natality1968.dta
  obs:         9,313
 vars:             8
 size:     1,108,247 (89.4% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page__of_pdf_   str4   %4s
state           str20  %20s
county          str45  %45s
city_balance_~l str27  %27s
race            str9   %9s
births_by_pla~o long   %12.0g
births_by_pla~t long   %12.0g
births_by_pla~0 int    %8.0g
------------------------------------------------------------------------------
> -
Sorted by:

. replace county=lower(county)
(9313 real changes made)
. replace state=lower(state)
(9313 real changes made)
. replace city_balance_total=lower(city_balance_total)
(9313 real changes made)
. replace race=lower(race)
(9313 real changes made)
. replace race="nonwhite" if race=="all other"
(1709 real changes made)
.
.
*missing observations in the original data are zeros . *replace these so we can distinguish these from "true" missing values . *which would be generated below due to difficult-to-read characters . count if births_by_place_of_residence__to==. 18 . replace births_by_place_of_residence__to=0 if births_by_place_of_residence__ > to==. (18 real changes made) . count if births_by_place_of_residence__at==. 19 . replace births_by_place_of_residence__at=0 if births_by_place_of_residence__ > at==. (19 real changes made) . count if births_by_place_of_residence__a0==. 3770 . replace births_by_place_of_residence__a0=0 if births_by_place_of_residence__ > a0==. (3770 real changes made) . . *check that all pdf pages appear to be in the data . gen temp=strpos(page__of_pdf_,"-") . gen newpagenumber=substr(page__of_pdf_,temp+1,.) . destring newpagenumber, replace newpagenumber has all characters numeric; replaced as byte . drop temp . sort newpagenumber . gen temp=newpagenumber[_n]-newpagenumber[_n-1] (1 missing value generated) . assert temp==0|temp==1|temp==. . drop temp newpagenumber . . *clean data entry errors . replace births_by_place_of_residence__at=286 if state=="kansas" & county==" > leavenworth" & city_balance_total=="balance of county" & race=="total" (1 real change made) . replace county="fairfax (ind. city)" if county=="fairfax" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==438 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==144 (1 real change made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==68 (2 real changes made) . replace county="franklin (ind. city)" if county=="franklin" & state=="virgin > ia" & births_by_place_of_residence__to==76 (1 real change made) . replace county="richmond (ind. 
city)" if county=="richmond" & state=="virgin > ia" & race=="total" & births_by_place_of_residence__to==3688 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="white" & births_by_place_of_residence__to==1472 (1 real change made) . replace county="richmond (ind. city)" if county=="richmond" & state=="virgin > ia" & race=="nonwhite" & births_by_place_of_residence__to==2216 (1 real change made) . replace county="roanoke (ind. city)" if county=="roanoke" & state=="virginia > " & race=="total" & births_by_place_of_residence__to==1668 (1 real change made) . replace county="franklin" if state=="virginia" & county=="franklin (ind. cit > y)" & city_balance_total=="total" & race=="nonwhite" & births_by_place_of_re > sidence__to==68 (1 real change made) . . *fix county name misspellings (relative to county names in the 1970 census) . replace county=subinstr(county,"st.","st",.) (206 real changes made) . replace county=subinstr(county,"ste.","st",.) (1 real change made) . replace county="cochrane" if county=="cochran" & state=="texas" (1 real change made) . replace county="davidson" if county=="davidson, coex. with nashville city" & > state=="tennessee" (3 real changes made) . replace county="desoto" if county=="de soto" & state=="florida" (3 real changes made) . replace county="denver" if county=="denver, coex. with denver city" & state= > ="colorado" (3 real changes made) . replace county="mcclain" if state=="oklahoma" & county=="mc clain" (1 real change made) . replace county="mccone" if state=="montana" & county=="mc cone" (1 real change made) . replace county="mccook" if state=="south dakota" & county=="mc cook" (1 real change made) . replace county="mccormick" if state=="south carolina" & county=="mc cormick" (3 real changes made) . replace county="mccracken" if state=="kentucky" & county=="mc cracken" (7 real changes made) . 
replace county="mccreary" if state=="kentucky" & county=="mc creary" (1 real change made) . replace county="mcculloch" if state=="texas" & county=="mc culloch" (1 real change made) . replace county="mccurtain" if state=="oklahoma" & county=="mc curtain" (3 real changes made) . replace county="mcdonald" if state=="missouri" & county=="mc donald" (1 real change made) . replace county="mcdonough" if state=="illinois" & county=="mc donough" (3 real changes made) . replace county="mcdowell" if state=="north carolina" & county=="mc dowell" (1 real change made) . replace county="mcdowell" if state=="west virginia" & county=="mc dowell" (3 real changes made) . replace county="mcduffie" if state=="georgia" & county=="mc duffie" (3 real changes made) . replace county="mchenry" if state=="illinois" & county=="mc henry" (1 real change made) . replace county="mchenry" if state=="north dakota" & county=="mc henry" (1 real change made) . replace county="mcintosh" if state=="georgia" & county=="mc intosh" (3 real changes made) . replace county="mcintosh" if state=="north dakota" & county=="mc intosh" (1 real change made) . replace county="mcintosh" if state=="oklahoma" & county=="mc intosh" (3 real changes made) . replace county="mckean" if state=="pennsylvania" & county=="mc kean" (3 real changes made) . replace county="mckenzie" if state=="north dakota" & county=="mc kenzie" (1 real change made) . replace county="mckinley" if state=="new mexico" & county=="mc kinley" (9 real changes made) . replace county="mclean" if state=="illinois" & county=="mc lean" (4 real changes made) . replace county="mclean" if state=="kentucky" & county=="mc lean" (1 real change made) . replace county="mclean" if state=="north dakota" & county=="mc lean" (1 real change made) . replace county="mclennan" if state=="texas" & county=="mc lennan" (9 real changes made) . replace county="mcleod" if state=="minnesota" & county=="mc leod" (1 real change made) . 
replace county="mcminn" if state=="tennessee" & county=="mc minn" (3 real changes made) . replace county="mcmullen" if state=="texas" & county=="mc mullen" (1 real change made) . replace county="mcnairy" if state=="tennessee" & county=="mc nairy" (1 real change made) . replace county="mcpherson" if state=="kansas" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="nebraska" & county=="mc pherson" (1 real change made) . replace county="mcpherson" if state=="south dakota" & county=="mc pherson" (1 real change made) . replace county="o'brien" if state=="iowa" & county=="o brien" (1 real change made) . replace county="orleans" if county=="orleans, coex. with new orleans city" & > state=="louisiana" (3 real changes made) . replace county="philadelphia" if county=="philadelphia, coex. with philaoelp > hia city" & state=="pennsylvania" (3 real changes made) . replace county="san francisco" if county=="san francisco, coex. with san fra > ncisco city." & state=="california" (3 real changes made) . . *check that county names are consistent with 1970 census, except for known d > eviations . preserve . egen countyr_state=concat(county state), punct("_") . sort countyr_state . merge countyr_state using original_census1970_counties variable countyr_state does not uniquely identify observations in the master data variable countyr_state does not uniquely identify observations in original_census1970_counties.dta . 
assert _m==3|county=="armstrong" & state=="south dakota"|county=="washington > " & state=="south dakota"|county=="yellowstone national park (part)" & state > =="idaho"|county=="yellowstone national park, part" & state=="idaho"|county= > ="yellowstone national park" & state=="idaho"|county=="yellowstone national > park (part)" & state=="montana"|county=="yellowstone national park (part)" & > state=="wyoming"|county=="yellowstone national park (total)" & state=="wyom > ing"|county=="yellowstone national park," & state=="wyoming" |county=="yello > wstone national park" & state=="montana"|county=="park (excl yell nat park)" > & state=="montana"|county=="yellowstone national park, part" & state=="mont > ana"|county=="yellowstone national park, part" & state=="wyoming"|county=="y > ellowstone national park, part" & state=="montana"|county=="yellowstone nati > onal park, total" & state=="wyoming"|county=="park" & state=="montana"|count > y=="yellowstone nat. park (part)" & state=="idaho"|county=="yellowstone nat. > park (part)" & state=="wyoming"|county=="yellowstone nat. park (total)" & s > tate=="wyoming"|county=="new york city" & state=="new york"|county=="bronx" > & state=="new york"|county=="kings" & state=="new york"|county=="new york" & > state=="new york"|county=="queens" & state=="new york"|county=="richmond" & > state=="new york"|county=="ormsby" & state=="nevada"|county=="carson city" > & state=="nevada"|county=="los alamos" & state=="new mexico"|county=="menomi > nee" & state=="wisconsin"|county=="total"|state=="alaska"|state=="hawaii"|st > ate=="virginia"|state=="dc" . restore . . *check for mistakes/misspellings in state names . 
assert state=="alabama"|state=="alaska"|state=="arizona"|state=="arkansas"|s > tate=="california"|state=="colorado"|state=="connecticut"|state=="delaware"| > state=="district of columbia"|state=="florida"|state=="georgia"|state=="hawa > ii"|state=="idaho"|state=="illinois"|state=="indiana"|state=="iowa"|state==" > kansas"|state=="kentucky"|state=="louisiana"|state=="maine"|state=="maryland > "|state=="massachusetts"|state=="michigan"|state=="minnesota"|state=="missis > sippi"|state=="missouri"|state=="montana"|state=="nebraska"|state=="nevada"| > state=="new hampshire"|state=="new jersey"|state=="new mexico"|state=="new y > ork"|state=="north carolina"|state=="north dakota"|state=="ohio"|state=="okl > ahoma"|state=="oregon"|state=="pennsylvania"|state=="rhode island"|state=="s > outh carolina"|state=="south dakota"|state=="tennessee"|state=="texas"|state > =="utah"|state=="vermont"|state=="virginia"|state=="washington"|state=="west > virginia"|state=="wisconsin"|state=="wyoming" . . *data checks for columns summing to county total . gen temp=births_by_place_of_residence__at+births_by_place_of_residence__a0 . assert temp==births_by_place_of_residence__to . drop temp . . *clean and label variables . rename page__of_pdf_ page_of_pdf_ . label var page_of_pdf_ "page of pdf" . label var state "state" . label var county "county" . rename city_balance_total sub_county . label var sub_county "city/balance/total" . label var race "race" . rename births_by_place_of_residence__to births . label var births "births by residence" . rename births_by_place_of_residence__at births_h . label var births_h "births by residence: in hospital" . rename births_by_place_of_residence__a0 births_nh_ns . label var births_nh_ns "births by residence: not in hospital and not specifi > ed" . . *generate year variable . gen year=1968 . label var year "year" . . *check that observations are unique . egen tag=tag(state county sub_county race) . assert tag==1 . drop tag . . 
list in 1/1

     +--------------------------------------------------------------------+
  1. | page_o~_ |   state | county | sub_co~y |  race | births | births_h |
     |      2-3 | alabama |  total |    total | total |  63602 |    57478 |
     |--------------------------------------------------------------------|
     | births~s | year                                                    |
     |     6124 | 1968                                                    |
     +--------------------------------------------------------------------+

. desc

Contains data from natality1968.dta
  obs:         9,313
 vars:             9
 size:     1,145,499 (89.1% of memory free)
------------------------------------------------------------------------------
> -
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
> -
page_of_pdf_    str4   %4s                    page of pdf
state           str20  %20s                   state
county          str45  %45s                   county
sub_county      str27  %27s                   city/balance/total
race            str9   %9s                    race
births          long   %12.0g                 births by residence
births_h        long   %12.0g                 births by residence: in hospital
births_nh_ns    int    %8.0g                  births by residence: not in hospital and not specified
year            float  %9.0g                  year
------------------------------------------------------------------------------
> -
Sorted by:
     Note:  dataset has changed since last saved

. sum

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
page_of_pdf_ |         0
       state |         0
      county |         0
  sub_county |         0
        race |         0
-------------+--------------------------------------------------------
      births |      9313     1817.13    10384.42          0     339760
    births_h |      9313    1790.627     10279.9          0     337368
births_nh_ns |      9313    26.50295    214.3461          0       8344
        year |      9313        1968           0       1968       1968

. saveold clean_natality1968.dta,replace
file clean_natality1968.dta saved
. clear
.
. log close
       log:  /bbkinghome/molitor/afink/baby boom/births_data_nber/4_births_dat
> a-uncleaned_stata/clean_natality_1940_1968.log
  log type:  text
 closed on:  12 Jun 2008, 12:16:28
------------------------------------------------------------------------------