SOURCE AND ACCURACY STATEMENT FOR THE 1990 SIPP PUBLIC USE FILES

SOURCE OF DATA

The data were collected in the 1990 panel of the Survey of Income and Program Participation (SIPP). The SIPP universe is the noninstitutionalized resident population living in the United States. The population includes persons living in group quarters, such as dormitories, rooming houses, and religious group dwellings. Crew members of merchant vessels, Armed Forces personnel living in military barracks, and institutionalized persons, such as correctional facility inmates and nursing home residents, were not eligible to be in the survey. Also, United States citizens residing abroad were not eligible to be in the survey. Foreign visitors who work or attend school in this country and their families were eligible; all others were not eligible to be in the survey. With the exceptions noted above, persons who were at least 15 years of age at the time of the interview were eligible to be in the survey.

The 1990 panel of the SIPP sample is located in 230 Primary Sampling Units (PSUs), each consisting of a county or a group of contiguous counties. Within these PSUs, expected clusters of 2 living quarters (LQs) were systematically selected from lists of addresses prepared for the 1980 decennial census to form the bulk of the sample. To account for LQs built within each of the sample areas after the 1980 census, a sample was drawn of permits issued for construction of residential LQs up until shortly before the beginning of the panel. In jurisdictions that do not issue building permits, small land areas were sampled and the LQs within were listed by field personnel and then clusters of 4 LQs were subsampled. In addition, sample LQs were selected from supplemental frames that included LQs identified as missed in the 1980 Census and persons residing in group quarters at the time of the Census.

The 1990 panel differs from the other panels as a result of oversampling for low income.
The oversample was constructed by taking a small subsample from the 1989 panel and combining it with the 1990 panel. Variables such as race, ethnicity, and sex were used for the oversampling since low income data for 1989 panel households were unavailable. The 1989 panel subsample contains all Black Headed Households, all Hispanic Headed Households, all Households with Heads having no spouse present, living with relatives, and a random sample of all the other Household types. The latter random sample was done in an attempt to avoid bias in the sample.

Approximately 28,300 living quarters were designated for the 1990 panel. For Wave 1 of the 1990 panel, interviews were obtained from the occupants of about 21,900 of the 28,300 designated living quarters. Most of the remaining 6,400 living quarters in the 1990 panel were found to be vacant, demolished, converted to nonresidential use, or otherwise ineligible for the survey. However, approximately 1,700 of the 6,400 living quarters in the 1990 panel were not interviewed because the occupants refused to be interviewed, could not be found at home, were temporarily absent, or were otherwise unavailable. Thus, occupants of about 93 percent of all eligible living quarters participated in Wave 1 of the Survey for the 1990 panel. Sample loss at Wave 1 of the 1990 Panel was about 7.1% and is expected to increase to roughly 22.0% at the end of Wave 8.

For Waves 2-8, only original sample persons (those in Wave 1 sample households and interviewed in Wave 1) and persons living with them were eligible to be interviewed. With certain restrictions, original sample persons were to be followed if they moved to a new address. When original sample persons moved without leaving a forwarding address or moved to extremely remote parts of the country and no telephone number was available, additional noninterviews resulted.

Sample households within a given panel are divided into four subsamples of nearly equal size.
These subsamples are called rotation groups 1, 2, 3, or 4 and one rotation group is interviewed each month. Each household in the sample was scheduled to be interviewed at 4-month intervals over a period of roughly 2 years beginning in February 1990. The reference period for the questions is the 4-month period preceding the interview month. In general, one cycle of four interviews covering the entire sample, using the same questionnaire, is called a wave.

A unique feature of the SIPP design is overlapping panels. The overlapping design allows panels to be combined and essentially doubles the sample sizes. However, the 1990 panel is designed so that the first three waves do not overlap with other panels. (The 1988 and 1989 panels were prematurely terminated to provide the funding needed to enlarge the 1990 panel and allow oversampling to take place.) After the third wave, the 1990 panel overlaps with the 1991 panel. Selected interviews for the 1990 panel can be combined with interviews from the 1991 panel. Information necessary to do this is included later in this statement.

The public use files include core and supplemental (topical module) data. Core questions are repeated at each interview over the life of the panel. Topical modules include questions which are asked only in certain waves. The 1990 and 1991 panel topical modules are given in tables 1 and 2 respectively.

Tables 3 and 4 indicate the reference months and interview months for the collection of data from each rotation group for the 1990 and 1991 panels respectively. For example, Wave 1 rotation group 2 of the 1990 panel was interviewed in February 1990 and data for the reference months October 1989 through January 1990 were collected.

Estimation. The estimation procedure used to derive SIPP person weights involved several stages of weight adjustments. Each person received a base weight equal to the inverse of his/her probability of selection.
A noninterview adjustment factor was applied to the weight of every occupant of interviewed households to account for households which were eligible for the sample but were not interviewed. (Individual nonresponse within partially interviewed households was treated with imputation. No special adjustment was made for noninterviews in group quarters.) A factor was applied to each interviewed person's weight to account for the SIPP sample areas not having the same population distribution as the strata from which they were selected.

An additional stage of adjustment to persons' weights was performed to reduce the mean square error of the survey estimates by ratio adjusting SIPP sample estimates to monthly Current Population Survey (CPS) estimates of the civilian (and some military) noninstitutional population of the United States by age, race, Spanish origin, sex, type of householder (married, single with relatives, single without relatives), and relationship to householder (spouse or other). The CPS estimates were themselves brought into agreement with estimates from the 1980 decennial census which were adjusted to reflect births, deaths, immigration, emigration, and changes in the Armed Forces since 1980. Also, an adjustment was made so that a husband and wife within the same household were assigned equal weights.

Use of Weights. Users should be forewarned to apply the appropriate weights given on this file before attempting to calculate estimates. The weights vary between units due to the oversampling that took place. If analysis is done for the general population without applying the appropriate weights, the results will be erroneous.

Each household and each person within each household on each wave tape has five weights. Four of these weights are reference month specific and therefore can be used only to form reference month estimates. Reference month estimates can be averaged to form estimates of monthly averages over some period of time.
For example, using the proper weights, one can estimate the monthly average number of households in a specified income range over November and December 1990. To estimate monthly averages of a given measure (e.g., total, mean) over a number of consecutive months, sum the monthly estimates and divide by the number of months.

The remaining weight is interview month specific. This weight can be used to form estimates that specifically refer to the interview month (e.g., total persons currently looking for work), as well as estimates referring to the time period including the interview month and all previous months (e.g., total persons who have ever served in the military).

To form an estimate for a particular month, use the reference month weight for the month of interest, summing over all persons or households with the characteristic of interest whose reference period includes the month of interest. Multiply the sum by a factor to account for the number of rotations contributing data for the month. This factor equals four divided by the number of rotations contributing data for the month. For example, December 1989 data are available only from rotations 2, 3, and 4 for Wave 1 of the 1990 panel (see table 3), so a factor of 4/3 (see table 7) must be applied. To form an estimate for an interview month, use the procedure discussed above using the interview month weight provided on the file.

When estimates for months without four rotations' worth of data are constructed from a wave file, factors greater than 1 must be applied. However, when core data from consecutive waves are used together, data from all four rotations may be available, in which case the factors are equal to 1.

These tapes contain no weight for characteristics that involve a person's or household's status over two or more months (e.g., number of households with a 50 percent increase in income between November and December 1990).

Producing Estimates for Census Regions and States.
The total estimate for a region is the sum of the state estimates in that region. Using this sample, estimates for individual states are subject to very high variance and are not recommended. The state codes on the file are primarily of use for linking respondent characteristics with appropriate contextual variables (e.g., state-specific welfare criteria) and for tabulating data by user-defined groupings of states.

Producing Estimates for the Metropolitan Population. For Washington, DC and 11 states, metropolitan or non-metropolitan residence is identified (variable H*-METRO). In 34 additional states, where the non-metropolitan population in the sample was small enough to present a disclosure risk, a fraction of the metropolitan sample was recoded to be indistinguishable from non-metropolitan cases (H*-METRO=2). In these states, therefore, the cases coded as metropolitan (H*-METRO=1) represent only a subsample of that population.

In producing state estimates for a metropolitan characteristic, multiply the individual, family, or household weights by the metropolitan inflation factor for that state, presented in table 5. (This inflation factor compensates for the subsampling of the metropolitan population and is 1.0 for the states with complete identification of the metropolitan population.)

The same procedure applies when creating estimates for particular identified MSA's or CMSA's: apply the factor appropriate to the state. For multi-state MSA's, use the factor appropriate to each state part. For example, to tabulate data for the Washington, DC-MD-VA MSA, apply the Virginia factor of 1.0521 to weights for residents of the Virginia part of the MSA; Maryland and DC residents require no modification to the weights (i.e., their factors equal 1.0).
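
The multi-state MSA rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the record layout, and the sample weights are hypothetical, and the factor dictionary holds only the three table 5 values quoted in the text (Virginia 1.0521, Maryland and DC 1.0).

```python
# Hypothetical excerpt of table 5 (state-level metropolitan inflation factors).
# Only the values quoted in the text are included here.
STATE_METRO_FACTOR = {"VA": 1.0521, "MD": 1.0, "DC": 1.0}


def metro_estimate(records, factors=STATE_METRO_FACTOR):
    """Sum factor-adjusted weights over records coded metropolitan (code 1).

    Each record is a (state, metro_code, weight) tuple; this layout is an
    assumption made for the sketch, not the layout of the public use file.
    """
    total = 0.0
    for state, metro_code, weight in records:
        if metro_code == 1:  # metropolitan cases only
            total += weight * factors[state]
    return total


# Made-up records for residents of the Washington, DC-MD-VA MSA.
sample = [
    ("VA", 1, 1000.0),
    ("MD", 1, 2000.0),
    ("DC", 1, 500.0),
    ("VA", 2, 800.0),  # not identified as metropolitan, so excluded
]
washington_msa_total = metro_estimate(sample)
```

Only the Virginia weights are inflated; the Maryland and DC weights pass through unchanged because their factors equal 1.0.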
In producing regional or national estimates of the metropolitan population, it is also necessary to compensate for the fact that no metropolitan subsample is identified within two states (Mississippi and West Virginia) and one state-group (North Dakota - South Dakota - Iowa). Thus, factors in the right-hand column of table 5 should be used for regional and national estimates. The results of regional and national tabulations of the metropolitan population will be biased slightly. However, less than one-half of one percent of the metropolitan population is not represented.

Producing Estimates for the Non-Metropolitan Population. State, regional, and national estimates of the non-metropolitan population cannot be computed directly, except for Washington, DC and the 11 states where the factor for state tabulations in table 5 is 1.0. In all other states, the cases identified as not in the metropolitan subsample (METRO=2) are a mixture of non-metropolitan and metropolitan households. Only an indirect method of estimation is available: first compute an estimate for the total population, then subtract the estimate for the metropolitan population. The results of these tabulations will be slightly biased.

Combined Panel Estimates. Both the 1990 and 1991 panels provide data for October 1990-August 1992. Thus, estimates for these time periods may be obtained by combining the corresponding panels. However, since the Wave 1 questionnaire differs from the subsequent waves' questionnaire, we recommend that estimates not be obtained by combining Wave 1 data of the 1991 panel (collected February - May of 1991) with data of the 1990 panel. In this case, use the estimate obtained from either panel. Additionally, even for other waves, care should be taken when combining data from two panels since questionnaires for the two panels differ somewhat and since the length of time in sample for interviews from the two panels differs.
Combined panel estimates may be obtained either (1) by combining estimates derived separately for the two panels or (2) by first combining data from the two files and then producing an estimate.

1. Combining Separate Estimates

Corresponding estimates from two consecutive year panels can be combined to create joint estimates by using the formula

$$\hat{J} = W \hat{J}_1 + (1 - W) \hat{J}_2 \tag{A}$$

where

$\hat{J}$ = joint estimate (total, mean, proportion, etc.);
$\hat{J}_1$ = estimate from the earlier panel;
$\hat{J}_2$ = estimate from the later panel;
$W$ = weighting factor of the earlier panel.

To combine the 1990 and 1991 panels use a W value of 0.608 unless one of the panels contributes no information to the estimate. In that case, the panel contributing information receives a factor of 1. The other receives a factor of zero.

2. Combining Data from Separate Files

Start by first creating a file containing the data from the two panel files. Apply the weighting factor, W, to the weight of each person from the earlier panel and apply (1-W) to the weight of each person from the later panel. Estimates can then be produced using the same methodology as used to obtain estimates from a single panel.

Illustration for computing combined panel estimate. Suppose SIPP estimates for Wave 5 of the 1990 panel show that there were 441,000 households with monthly December income above $6000. Also, suppose SIPP estimates for Wave 2 of the 1991 panel show that there were 435,000 households with monthly December income above $6000. Using formula (A), the joint level estimate is

$$\hat{J} = (0.608)(441{,}000) + (0.392)(435{,}000) = 438{,}648$$

ACCURACY OF THE ESTIMATES

SIPP estimates obtained from public use files are based on a sample; they may differ somewhat from the figures that would have been obtained if a complete census had been taken using the same questionnaire, instructions, and enumerators. There are two types of errors possible in an estimate based on a sample survey: nonsampling and sampling.
The magnitude of SIPP sampling error can be estimated, but this is not true of nonsampling error. Found below are descriptions of sources of SIPP nonsampling error, followed by discussions of sampling error, its estimation, and its use in data analysis. More detailed discussions of the existence and control of nonsampling errors in the SIPP can be found in the Quality Profile for the Survey of Income and Program Participation, May 1990, by Jabine, assisted by King and Petroni.

Nonsampling Variability. Nonsampling errors can be attributed to many sources, e.g., inability to obtain information about all cases in the sample, definitional difficulties, differences in the interpretation of questions, inability or unwillingness on the part of the respondents to provide correct information, inability to recall information, errors made in collection such as in recording or coding the data, errors made in processing the data, errors made in estimating values for missing data, biases resulting from the differing recall periods caused by the rotation pattern used, and failure to represent all units within the universe (undercoverage). Quality control and edit procedures were used to reduce errors made by respondents, coders and interviewers.

Undercoverage in SIPP results from missed living quarters and missed persons within sample households. It is known that undercoverage varies with age, race, and sex. Generally, undercoverage is larger for males than for females and larger for blacks than for nonblacks. Ratio estimation to independent age-race-sex population controls partially corrects for the bias due to survey undercoverage. However, biases exist in the estimates to the extent that persons in missed households or missed persons in interviewed households have different characteristics than the interviewed persons in the same age-race-Spanish origin-sex group. Further, the independent population controls used have not been adjusted for undercoverage.
Some respondents do not respond to some of the questions. Therefore, the overall nonresponse rate for some items such as income and other money related items is higher than the nonresponse rates presented earlier in this statement. The Bureau uses complex techniques to adjust the weights for nonresponse, but the success of these techniques in avoiding bias is unknown.

Comparability With Other Statistics. Caution should be exercised when comparing data from these files with data from other SIPP products or with data from other surveys. The comparability problems are caused by sources such as the seasonal patterns for many characteristics, definitional differences, and different nonsampling errors.

Sampling Variability. Standard errors indicate the magnitude of the sampling variability. They also partially measure the effect of some nonsampling errors in response and enumeration, but do not measure any systematic biases in the data. The standard errors for the most part measure the variations that occurred by chance because a sample rather than the entire population was surveyed.

Confidence Intervals. The sample estimate and its standard error enable one to construct confidence intervals, ranges that would include the average result of all possible samples with a known probability. For example, if all possible samples were selected, each of these being surveyed under essentially the same conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then:

1. Approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average result of all possible samples.

2. Approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average result of all possible samples.

3.
Approximately 95 percent of the intervals from two standard errors below the estimate to two standard errors above the estimate would include the average result of all possible samples.

The average estimate derived from all possible samples is or is not contained in any particular computed interval. However, for a particular sample, one can say with a specified confidence that the average estimate derived from all possible samples is included in the confidence interval.

Hypothesis Testing. Standard errors may also be used for hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common types of hypotheses tested are 1) the population parameters are identical versus 2) they are different. Tests may be performed at various levels of significance, where a level of significance is the probability of concluding that the parameters are different when, in fact, they are identical.

To perform the most common hypothesis test, compute the difference XA - XB, where XA and XB are sample estimates of the parameters of interest. A later section explains how to derive an estimate of the standard error of the difference XA - XB. Let that standard error be Sdiff. If XA - XB is between -1.6 times Sdiff and +1.6 times Sdiff, no conclusion about the parameters is justified at the 10 percent significance level. If, on the other hand, XA - XB is smaller than -1.6 times Sdiff or larger than +1.6 times Sdiff, the observed difference is significant at the 10 percent level. In this event, it is commonly accepted practice to say that the parameters are different. Of course, sometimes this conclusion will be wrong. When the parameters are, in fact, the same, there is a 10 percent chance of concluding that they are different.

Note when using small estimates. Because of the large standard errors involved, there is little chance that summary measures would reveal useful information when computed on a smaller base than 200,000.
Also, care must be taken in the interpretation of small differences. For instance, in case of a borderline difference, even a small amount of nonsampling error can lead to a wrong decision about the hypotheses, thus distorting a seemingly valid hypothesis test.

Standard Error Parameters and Tables and Their Use. Most SIPP estimates have greater standard errors than those obtained through a simple random sample because clusters of living quarters are sampled. To derive standard errors that would be applicable to a wide variety of estimates and could be prepared at a moderate cost, a number of approximations were required. Estimates with similar standard error behavior were grouped together and two parameters (denoted "a" and "b") were developed to approximate the standard error behavior of each group of estimates. These "a" and "b" parameters are used in estimating standard errors and vary by type of estimate and by subgroup to which the estimate applies. Table 6 provides base "a" and "b" parameters to be used for estimates obtained from core data and for some estimates from topical module data. These parameters are considered preliminary. Revised parameters are soon to follow.

The factors provided in table 7, when multiplied by the base parameters of table 6 for a given subgroup and type of estimate, give the "a" and "b" parameters for that subgroup and estimate type for the specified reference period. For example, the base "a" and "b" parameters for total number of households are -0.0000664 and 6,043, respectively. For Wave 1 the factor for October 1989 is 4.0000 since only 1 rotation month of data is available. So, the "a" and "b" parameters for total number of households in October 1989 based on Wave 1 are -0.0002656 and 24,172, respectively.
Also for Wave 1, the factor for the first quarter of 1990 is 1.2222 since 9 rotation months of data are available (rotations 1 and 4 provide 3 rotation months each, while rotations 2 and 3 provide 1 and 2 rotation months, respectively). So, the "a" and "b" parameters for total number of households in the first quarter of 1990 are -0.0000812 and 7,386, respectively, for Wave 1.

The "a" and "b" parameters may be used to calculate the standard error for estimated numbers and percentages. Because the actual standard error behavior was not identical for all estimates within a group, the standard errors computed from these parameters provide an indication of the order of magnitude of the standard error for any specific estimate.

For those users who wish further simplification, we have also provided preliminary general standard errors in tables 8 through 11 for making estimates with the use of data from all four rotations. Note that these standard errors must be adjusted by a factor (f) from table 6. The standard errors resulting from this simplified approach are less accurate. Methods for using these parameters and tables for computation of standard errors are given in the following sections. Standard errors provided in tables 8 through 11 will change when revised parameters are available.

For the 1990, 1991 combined panel parameters, multiply the parameters in table 6 by the appropriate factor from table 15, which is forthcoming. The factors to be provided later in table 16 adjust parameters for the number of rotation months available for a given estimate. These factors, when multiplied by the combined panel parameters derived from table 6 for a given subgroup and type of estimate, give the "a" and "b" parameters for that subgroup and estimate type for the specified combined reference period.
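
The scaling of base parameters by a reference-period factor can be sketched in Python as follows. The function name is hypothetical; the base household parameters and the two Wave 1 factors (4.0000 for October 1989 and 1.2222 for the first quarter of 1990) are the ones quoted in the text, and the sketch simply multiplies both parameters by the factor.

```python
def scaled_parameters(base_a, base_b, factor):
    """Multiply base "a" and "b" parameters by a reference-period factor."""
    return base_a * factor, base_b * factor


# Base parameters for total number of households, as quoted from table 6.
BASE_A, BASE_B = -0.0000664, 6043

# Wave 1, October 1989: one rotation month of data, factor 4.0000.
a_oct, b_oct = scaled_parameters(BASE_A, BASE_B, 4.0000)

# Wave 1, first quarter of 1990: nine rotation months, factor 1.2222.
a_q1, b_q1 = scaled_parameters(BASE_A, BASE_B, 1.2222)

print(round(a_oct, 7), round(b_oct))
print(round(a_q1, 7), round(b_q1))
```

After rounding, the results agree with the parameters stated in the text for both reference periods.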
For calculating 1990 topical module variances, table 12 is designated to later provide base "a" and "b" parameters. Table 13 also in the near future will provide base "a" and "b" parameters for computing the 1990, 1991 combined panel topical module variances. These parameters will also be provided when revised generalized variance parameters are available.

Procedures for calculating standard errors for the types of estimates most commonly used are described below. Note specifically that these procedures apply only to reference month estimates or averages of reference month estimates. Refer to the section "Use of Weights" for a more detailed discussion of the construction of estimates. Stratum codes and half sample codes are included on the tapes to enable the user to compute the variances directly by methods such as balanced repeated replications (BRR). William G. Cochran provides a list of references discussing the application of this technique. (See Sampling Techniques, 3rd Ed., New York: John Wiley and Sons, 1977, p. 321.)

Standard Errors of Estimated Numbers. The approximate standard error, $s_x$, of an estimated number of persons, households, families, unrelated individuals and so forth, can be obtained in two ways. Both apply when data from all four rotations are used to make the estimate. However, only the second method should be used when fewer than four rotations of data are available for the estimate. Note that neither method should be applied to dollar values.

It may be obtained by the use of the formula

$$s_x = f s \tag{1}$$

where f is the appropriate "f" factor from table 6, and s is the standard error on the estimate obtained by interpolation from table 8 or 9. Alternatively, $s_x$ may be approximated by the formula

$$s_x = \sqrt{a x^2 + b x} \tag{2}$$

from which the standard errors in tables 8 and 9 were calculated. Here x is the size of the estimate and "a" and "b" are the parameters associated with the particular type of characteristic being estimated.
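
Formula (2) and the 90-percent confidence interval described under "Confidence Intervals" can be sketched in Python as follows. The function names are hypothetical, the estimate of 472,000 households is an illustrative input, and the "a" and "b" values are the base household parameters quoted in the text; the multiplier 1.6 is the one this statement uses for 90-percent intervals.

```python
import math


def se_estimated_number(x, a, b):
    """Formula (2): approximate standard error of an estimated number x."""
    return math.sqrt(a * x * x + b * x)


def confidence_interval(x, se, z=1.6):
    """Interval from z standard errors below x to z standard errors above x."""
    return x - z * se, x + z * se


estimate = 472_000  # illustrative estimated number of households
se = se_estimated_number(estimate, a=-0.0000664, b=6043)
low, high = confidence_interval(estimate, se)

# Rounded the way the statement reports such figures.
print(round(se, -2), round(low, -3), round(high, -3))
```

With these inputs the standard error rounds to 53,300 and the interval to roughly 387,000 through 557,000. The negative "a" term only slightly reduces the value under the square root at this estimate size.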
Use of formula 2 will provide more accurate results than the use of formula 1.

Illustration. Suppose SIPP estimates for Wave 1 of the 1990 panel show that there were 472,000 households with monthly household income above $6,000. The appropriate parameters and factor from table 6 and the appropriate general standard error from table 8 are

a = -0.0000664
b = 6,043
f = 1.00
s = 53,300

Using formula 1, the approximate standard error is

$$s_x = 53{,}300$$

Using formula 2, the approximate standard error is

$$s_x = \sqrt{(-0.0000664)(472{,}000)^2 + (6{,}043)(472{,}000)} = 53{,}300$$

Using the standard error based on formula 2, the approximate 90-percent confidence interval as shown by the data is from 387,000 to 557,000. Therefore, a conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90% of all samples.

Illustration for computing standard errors for combined panel estimates. Will be provided when combining factors are available.

Standard Error of a Mean. A mean is defined here to be the average quantity of some item (other than persons, families, or households) per person, family or household. For example, it could be the average monthly household income of females age 25 to 34. The standard error of a mean can be approximated by formula 3 below. Because of the approximations used in developing formula 3, an estimate of the standard error of the mean obtained from this formula will generally underestimate the true standard error. The formula used to estimate the standard error of a mean $\bar{x}$ is

$$s_{\bar{x}} = \sqrt{\frac{b}{y} S^2} \tag{3}$$

where y is the size of the base, $S^2$ is the estimated population variance of the item and b is the parameter associated with the particular type of item.

The population variance $S^2$ may be estimated by one of two methods. In both methods we assume $x_i$ is the value of the item for unit i. (Unit may be person, family, or household.)
To use the first method, the range of values for the item is divided into c intervals. The lower and upper boundaries of interval j are $Z_{j-1}$ and $Z_j$, respectively. Each unit is placed into one of c groups such that $Z_{j-1} < x_i \le Z_j$.

The estimated population variance, $S^2$, is given by the formula:

$$S^2 = \sum_{j=1}^{c} P_j M_j^2 - \bar{x}^2 \tag{4}$$

where $P_j$ is the estimated proportion of units in group j, and $M_j = (Z_{j-1} + Z_j)/2$. The most representative value of the item in group j is assumed to be $M_j$. If group c is open-ended, i.e., no upper interval boundary exists, then an approximate value for $M_c$ is

$$M_c = \frac{3}{2} Z_{c-1}$$

The mean, $\bar{x}$, can be obtained using the following formula:

$$\bar{x} = \sum_{j=1}^{c} P_j M_j$$

In the second method, the estimated population variance is given by

$$S^2 = \frac{\sum_{i=1}^{n} W_i x_i^2}{\sum_{i=1}^{n} W_i} - \bar{x}^2 \tag{5}$$

where there are n units with the item of interest and $W_i$ is the final weight for unit i. The mean, $\bar{x}$, can be obtained from the formula

$$\bar{x} = \frac{\sum_{i=1}^{n} W_i x_i}{\sum_{i=1}^{n} W_i}$$

When forming combined estimates using formula (A), $S^2$, given by formula (4), should be calculated by forming a distribution for each panel. The range of values for the item will be divided into intervals. Combined estimates for each interval can be obtained using formula (A). Formula (4) can be applied to the combined distribution. To calculate $\bar{x}$ and $S^2$ given by formula (5), replace $x_i$ by $W x_i$ for $x_i$ from the earlier panel and $(1-W) x_i$ for $x_i$ from the later panel.

Illustration. Suppose that based on Wave 1 data, the distribution of monthly cash income for persons age 25 to 34 during the month of January 1988 is given in table 14. Using formula 4 and the mean monthly cash income of $2,530, the approximate population variance, $S^2$, is

$$S^2 = \frac{1{,}371}{39{,}851}(150)^2 + \frac{1{,}651}{39{,}851}(450)^2 + \cdots + \frac{1{,}493}{39{,}851}(9{,}000)^2 - (2{,}530)^2 = 3{,}159{,}887$$
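
The second method, formula (5), together with formula (3), can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the item values, weights, "b" parameter, and base used in the example are made up for the sketch rather than taken from the files.

```python
import math


def weighted_mean_and_variance(values, weights):
    """Weighted mean and formula (5) variance of an item.

    values  -- item value x_i for each unit
    weights -- final weight W_i for each unit
    """
    total_weight = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_weight
    mean_of_squares = sum(w * x * x for x, w in zip(values, weights)) / total_weight
    return mean, mean_of_squares - mean ** 2


def se_of_mean(b, base, variance):
    """Formula (3): approximate standard error of a mean."""
    return math.sqrt((b / base) * variance)


# Made-up monthly incomes and final weights for three units.
incomes = [1000.0, 2000.0, 3000.0]
weights = [1.0, 2.0, 1.0]
mean, variance = weighted_mean_and_variance(incomes, weights)

# Hypothetical "b" parameter and base size, for illustration only.
example_se = se_of_mean(b=5000, base=2_000_000, variance=variance)
```

Computing the mean of squares and subtracting the squared mean mirrors the form of formula (5) directly; for very large values a numerically more careful two-pass variance could be substituted.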
Using formula 3, the appropriate base "b" parameter and factor from table 6, the estimated standard error of a mean $\bar{x}$ is

$$s_{\bar{x}} = \sqrt{\frac{4{,}890}{39{,}851{,}000}(3{,}159{,}887)} = \$20$$

Standard Error of an Aggregate. An aggregate is defined to be the total quantity of an item summed over all the units in a group. The standard error of an aggregate can be approximated using formula 6. As with the estimate of the standard error of a mean, the estimate of the standard error of an aggregate will generally underestimate the true standard error. Let y be the size of the base, $S^2$ be the estimated population variance of the item obtained using formula (4) or (5) and b be the parameter associated with the particular type of item. The standard error of an aggregate is:

$$s_x = \sqrt{(b)(y) S^2} \tag{6}$$

Standard Errors of Estimated Percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends upon both the size of the percentage and the size of the total upon which the percentage is based. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentages, particularly if the percentages are 50 percent or more, e.g., the percent of people employed is more reliable than the estimated number of people employed. When the numerator and denominator of the percentage have different parameters, use the parameter (and appropriate factor) of the numerator. If proportions are presented instead of percentages, note that the standard error of a proportion is equal to the standard error of the corresponding percentage divided by 100.

There are two types of percentages commonly estimated. The first is the percentage of persons, families or households sharing a particular characteristic such as the percent of persons owning their own home.
The second type is the percentage of money or some similar concept held by a particular group of persons or held in a particular form. Examples are the percent of total wealth held by persons with high income and the percent of total income received by persons on welfare.

For the percentage of persons, families, or households, the approximate standard error, \(s_{(x,p)}\), of the estimated percentage p can be obtained by the formula

\[ s_{(x,p)} = f s \qquad (7) \]

when data from all four rotations are used to estimate p. In this formula, f is the appropriate "f" factor from table 6 and s is the standard error of the estimate from table 10 or 11. Alternatively, it may be approximated by the formula

\[ s_{(x,p)} = \sqrt{\frac{b}{x}\, p\,(100 - p)} \qquad (8) \]

from which the standard errors in tables 10 and 11 were calculated. Here x is the size of the subclass of social units which is the base of the percentage, p is the percentage (0 < p < 100) [...]

Table 10.

Base of Estimated
Percentage                              Estimated Percentages
(Thousands)  ≤ 1 or ≥ 99      2 or 98      5 or 95     10 or 90     25 or 75           50
        200         1.73         2.43         3.79         5.20         7.50         8.70
        300         1.41         1.99         3.09         4.26         6.20         7.10
        500         1.09         1.54         2.40         3.30         4.76         5.50
        750         0.89         1.26         1.96         2.69         3.89         4.49
      1,000         0.77         1.09         1.69         2.33         3.37         3.89
      2,000         0.55         0.77         1.20         1.65         2.38         2.75
      3,000         0.45         0.63         0.98         1.35         1.94         2.24
      5,000         0.35         0.49         0.76         1.04         1.51         1.74
      7,500         0.28         0.40         0.62         0.85         1.23         1.42
     10,000         0.24         0.34         0.54         0.74         1.06         1.23
     15,000         0.20         0.28         0.44         0.60         0.87         1.00
     25,000         0.15         0.22         0.34         0.47         0.67         0.78
     30,000         0.14         0.20         0.31         0.43         0.61         0.71
     40,000         0.12         0.17         0.27         0.37         0.53         0.61
     50,000         0.11         0.15         0.24         0.33         0.48         0.55
     60,000         0.10         0.14         0.22         0.30         0.43         0.50
     80,000         0.09         0.12         0.19         0.26         0.38         0.43
     90,000         0.08         0.11         0.18         0.25         0.35         0.41

(1) To account for sample attrition, multiply the standard error of the estimate by 1.04 for estimates which include data from Wave 5 and beyond.

Table 11.
Standard Errors of Estimated Percentages of Persons

Base of Estimated
Percentage                              Estimated Percentages
(Thousands)  ≤ 1 or ≥ 99      2 or 98      5 or 95     10 or 90     25 or 75           50
        200         2.97         4.17         6.50         9.00        12.90        14.90
        300         2.42         3.41         5.31         7.30        10.50        12.20
        600         1.71         2.41         3.75         5.20         7.50         8.60
      1,000         1.33         1.87         2.91         4.00         5.80         6.70
      2,000         0.94         1.32         2.06         2.83         4.08         4.71
      5,000         0.59         0.83         1.30         1.79         2.58         2.98
      8,000         0.47         0.66         1.03         1.41         2.04         2.36
     11,000         0.40         0.56         0.88         1.21         1.74         2.01
     13,000         0.37         0.52         0.81         1.11         1.60         1.85
     17,000         0.32         0.45         0.70         0.97         1.40         1.62
     22,000         0.28         0.40         0.62         0.85         1.23         1.42
     26,000         0.26         0.37         0.57         0.78         1.13         1.31
     30,000         0.24         0.34         0.53         0.73         1.05         1.22
     50,000         0.19         0.26         0.41         0.57         0.82         0.94
     80,000         0.15         0.21         0.32         0.45         0.65         0.75
    100,000         0.13         0.19         0.29         0.40         0.58         0.67
    130,000         0.12         0.16         0.25         0.35         0.51         0.58
    220,000         0.09         0.13         0.20         0.27         0.39         0.45
    230,000         0.09         0.12         0.19         0.26         0.38         0.44

(1) To account for sample attrition, multiply the standard error of the estimate by 1.04 for estimates which include data from Wave 5 and beyond.

Table 12.
1990 Topical Module Generalized Variance Parameters

                                                       a           b
Fertility
  # Females (16+)
    Total                                     -0.0000403       3,982
    White                                     -0.0000526       4,414
    Black                                     -0.0002431       2,878
    Hispanic                                  -0.0006864       4,851
  Births (16+ females)
    Total                                     -0.0000735       7,261
    White                                     -0.0000960       8,048
    Black                                     -0.0004432       5,248
    Hispanic                                  -0.0012518       8,847

Educational Attainment (16+)
  Wave 2
    Total                                     -0.0000286       5,424
    White                                     -0.0000372       6,012
    Black                                     -0.0001810       3,921
    Hispanic                                  -0.0002797       3,921
  Wave 5
    Total                                     -0.0000312       5,913
    White                                     -0.0000405       6,553
    Black                                     -0.0001972       4,273
    Hispanic                                  -0.0003048       4,273

Marital Status and Person's Family Characteristics
  Some HH members (16+)
    Total                                     -0.0000433       8,209
    White                                     -0.0000563       9,098
    Black                                     -0.0002738       5,933
    Hispanic                                  -0.0004232       5,933
  All HH members (0+)
    Total                                     -0.0000405       9,975
    White                                     -0.0000534      11,055
    Black                                     -0.0002374       7,209
    Hispanic                                  -0.0003478       7,209

Child Support (16+ females)
  Wave 3
    Total                                     -0.0000612       6,043
    White                                     -0.0000799       6,698
    Black                                     -0.0003698       4,368
    Hispanic                                  -0.0006180       4,368
  Wave 6
    Total                                     -0.0000667       6,587
    White                                     -0.0000871       7,301
    Black                                     -0.0004021       4,761
    Hispanic                                  -0.0006736       4,761

Support for non-household members (16+)
  Wave 3
    Total                                     -0.0000319       6,043
    White                                     -0.0000414       6,698
    Black                                     -0.0002016       4,368
    Hispanic                                  -0.0003116       4,368
  Wave 6
    Total                                     -0.0000347       6,587
    White                                     -0.0000452       7,301
    Black                                     -0.0002198       4,761
    Hispanic                                  -0.0003396       4,761

Health and Disability (0+)
    Total                                     -0.0000318       7,818
    White                                     -0.0000419       8,666
    Black                                     -0.0001861       5,651
    Hispanic                                  -0.0002727       5,651

0-15 Child Care
  Wave 3
    Total                                     -0.0000867       4,890
    White                                     -0.0001195       5,420
    Black                                     -0.0004064       3,535
    Hispanic                                  -0.0008883       5,956
  Wave 6
    Total                                     -0.0000945       5,331
    White                                     -0.0001303       5,908
    Black                                     -0.0004430       3,853
    Hispanic                                  -0.0009682       6,492

Welfare History and AFDC
  Both Sexes 18+
    Total                                     -0.0000783      14,344
    White                                     -0.0001016      15,898
    Black                                     -0.0005025      10,367
    Hispanic                                  -0.0007784      10,367
  Males 18+
    Total                                     -0.0001638      14,344
    White                                     -0.0002112      15,898
    Black                                     -0.0011083      10,367
    Hispanic                                  -0.0015697      10,367
  Females 18+
    Total                                     -0.0001501      14,344
    White                                     -0.0001959      15,898
    Black                                     -0.0009194      10,367
    Hispanic                                  -0.0015441      10,367

Table 13.
Distribution of Monthly Cash Income Among Persons 25 to 34 years old

                                           Percent with at least
                         Thousands in      as much as lower bound
Interval                     Interval      of Interval
Total                          39,851        ---
under $300                      1,371      100.0
$300 to $599                    1,651       96.6
$600 to $899                    2,259       92.4
$900 to $1,199                  2,734       86.7
$1,200 to $1,499                3,452       79.9
$1,500 to $1,999                6,278       71.2
$2,000 to $2,499                5,799       55.5
$2,500 to $2,999                4,730       40.9
$3,000 to $3,499                3,723       29.1
$3,500 to $3,999                2,519       19.7
$4,000 to $4,999                2,619       13.4
$5,000 to $5,999                1,223        6.8
$6,000 and over                 1,493        3.7

Table 14. Factors to be Applied to Base Parameters to Obtain Combined Panel Parameters for Estimates(1) from Various Reference Periods.

# of available rotation months
for 2 panels combined(2)            factor
------------------------------      ------
Monthly Estimate
     2                              4.0000
     3                              3.0000
     4                              2.0000
     5                              1.6667
     6                              1.3333
     7                              1.1667
     8                              1.0000
Quarterly Estimates
    12                              1.8519
    15                              1.5631
    18                              1.2222
    19                              1.1470
    24                              1.0000
Annual Estimates
    96                              1.0000

(1) Estimates are based on monthly averages.
(2) The number of available rotation months for a given estimate is the sum of the number of rotations available for each month of the estimate for the two panels. There must be at least one rotation month available for each month from each panel for monthly and quarterly estimates.