Error Components in Grouped Data: Why It's Never Worth Weighting

William T. Dickens

doi:10.3386/t0043

Technical Working Paper 0043

Issue Date February 1985

When estimating linear models using grouped data, researchers typically weight each observation by the group size. Under the assumption that the regression errors for the underlying micro data have expected values of zero, are independent, and are homoscedastic, this procedure produces best linear unbiased estimates. This note argues that for most applications in economics the assumption that errors are independent within groups is inappropriate. Since grouping is commonly done on the basis of common observed characteristics, it is inappropriate to assume that there are no unobserved characteristics in common. If group members have unobserved characteristics in common, individual errors will be correlated. If errors are correlated within groups and group sizes are large, then heteroscedasticity may be relatively unimportant, and weighting by group size may exacerbate heteroscedasticity rather than eliminate it. Two examples presented here suggest that this may be the effect of weighting in most non-experimental applications. In many situations unweighted ordinary least squares may be a preferred alternative. For those cases where it is not, a maximum likelihood estimator and an asymptotically efficient two-step generalized least squares estimator are proposed. An extension of the two-step estimator for grouped binary data is also presented.
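
The argument in the abstract can be illustrated with a small numerical sketch. It assumes a simple error-components form that the abstract does not spell out: each individual error is the sum of a component shared by the whole group (variance sigma_u2) and an independent individual component (variance sigma_e2). Under that assumption the error of a group mean has variance sigma_u2 + sigma_e2 / n, and weighting by group size multiplies that variance by n. The function names and the variance values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def group_mean_error_variance(n, sigma_u2, sigma_e2):
    """Variance of the mean error in a group of size n.

    Assumed model (not from the paper): a shared group component with
    variance sigma_u2 plus independent individual components with
    variance sigma_e2, which average down to sigma_e2 / n.
    """
    return sigma_u2 + sigma_e2 / n


def weighted_error_variance(n, sigma_u2, sigma_e2):
    """Error variance after scaling the group-mean equation by sqrt(n),
    which is what weighting each group by its size amounts to."""
    return n * group_mean_error_variance(n, sigma_u2, sigma_e2)


def spread(variances):
    """Ratio of largest to smallest variance: 1.0 means homoscedastic."""
    return max(variances) / min(variances)


group_sizes = [50, 200, 1000]      # hypothetical group sizes
sigma_u2, sigma_e2 = 0.1, 1.0      # hypothetical variance components

unweighted = [group_mean_error_variance(n, sigma_u2, sigma_e2) for n in group_sizes]
weighted = [weighted_error_variance(n, sigma_u2, sigma_e2) for n in group_sizes]

# With a shared component, the unweighted variances are nearly equal,
# while the weighted variances grow roughly in proportion to n.
print("unweighted spread:", round(spread(unweighted), 2))
print("weighted spread:  ", round(spread(weighted), 2))

# With no shared component (independent errors), weighting restores
# homoscedasticity exactly, which is the textbook case.
independent_weighted = [weighted_error_variance(n, 0.0, sigma_e2) for n in group_sizes]
print("weighted spread, independent errors:", round(spread(independent_weighted), 2))
```

In this sketch, weighting equalizes the variances only when the shared component is absent. Once a shared component is present and groups are large, the unweighted group means are already close to homoscedastic and weighting makes the variances diverge, which is the pattern the abstract describes.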
