Filling the Gaps with MICE: Addressing Missing Data in Real Estate Price Indices
Missing data are a common feature of the micro-level transaction data used to construct hedonic real estate price indices. Because these data are typically collected for legal and tax purposes rather than for statistical analysis, missingness tends to arise in property characteristics rather than in transaction prices. Since these characteristics are central to hedonic quality adjustment, standard approaches such as complete-case analysis can distort measured price dynamics through sample-selection and composition effects. This paper proposes multiple imputation as a way to handle missing characteristic values in index construction.
The primary goal of this imputation process is not to recover individual missing values at the micro level, but to restore incomplete observations to the estimation sample and thereby stabilize quality adjustment. We use multiple imputation by chained equations (MICE) as a flexible imputation framework for this task.
Because the standard aggregation rules for multiple imputation (Rubin’s rules) are not consistent with the multiplicative chaining structure of price indices, we develop an alternative aggregation procedure based on pooled growth rates.
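To illustrate the idea of pooling growth rates rather than index levels, the following is a minimal sketch and not the chapter's actual procedure. It assumes that one chained index series has already been computed for each imputed dataset, and it pools period-on-period growth rates across imputations with a geometric mean (an arithmetic mean of log growth rates) before re-chaining. The pooling rule and the function name `pooled_chained_index` are assumptions made for this illustration. The sketch is motivated by the fact that an arithmetic average of index levels does not in general commute with chaining, since the mean of a product of growth factors is not the product of their means.

```python
import math


def pooled_chained_index(indices):
    """Pool M index series (one per imputed dataset) via growth rates.

    indices: list of M lists, each holding positive index levels for the
    same T periods. Returns a pooled index normalised to 1.0 in period 0.

    Assumption for this sketch: growth rates are pooled with a geometric
    mean, i.e. an arithmetic mean of log growth rates.
    """
    if not indices:
        raise ValueError("need at least one index series")
    n_periods = len(indices[0])
    if any(len(series) != n_periods for series in indices):
        raise ValueError("all index series must cover the same periods")

    pooled = [1.0]
    for t in range(1, n_periods):
        # Period-on-period log growth rate within each imputed dataset.
        log_growth = [math.log(series[t] / series[t - 1]) for series in indices]
        # Pool across imputations, then chain onto the previous pooled level.
        mean_log_growth = sum(log_growth) / len(log_growth)
        pooled.append(pooled[-1] * math.exp(mean_log_growth))
    return pooled


# Hypothetical index levels from two imputed datasets (not real data).
example = [
    [100.0, 110.0, 121.0],
    [100.0, 120.0, 108.0],
]
pooled = pooled_chained_index(example)
```

Because the pooled series is built by multiplying pooled growth factors, it remains a chained index by construction: the pooled change over any span equals the product of the pooled period-on-period changes within it.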
Empirically, we examine two applications: a large dataset of Vienna apartment transactions and the much smaller and more heterogeneous market for Austrian office units.
Our findings indicate that in a large and relatively uniform market, hedonic price indices tend to be robust to missing data in most scenarios. As a result, the differences between complete-case estimation and alternative imputation methods are minimal, although MICE generally performs best. In contrast, in smaller, more heterogeneous markets, imputation can significantly affect index dynamics, particularly when there are time-varying composition effects and substantial missing data in key descriptive variables. Across imputation methods, index results are similar whenever flexible multiple imputation methods with rich predictor sets are used, whereas restricted linear regression and ad hoc imputation rules perform less well.
Overall, the paper argues that missing data should be addressed more explicitly in the hedonic price index literature.
Miriam Steurer and Sabrina Spiegel, Measurement of Housing and the Housing Sector (University of Chicago Press, 2026), chap. 9, https://www.nber.org/books-and-chapters/measurement-housing-and-housing-sector/filling-gaps-mice-addressing-missing-data-real-estate-price-indices.