Big Data in the U.S. Consumer Price Index: Experiences and Plans
The Bureau of Labor Statistics (BLS) has generally relied on its own sample surveys to collect the price and expenditure information necessary to produce the Consumer Price Index (CPI). The burgeoning availability of big data offers information that could lead to methodological improvements and cost savings in the CPI. The BLS has undertaken several pilot projects in an attempt to supplement or replace its traditional field collection of price data with alternative sources. In addition to cost reductions, these projects have demonstrated the potential to expand sample size, reduce respondent burden, obtain transaction prices more consistently, and improve price index estimation by incorporating real-time expenditure information, a foundational component of price index theory that has not been practical until now. The CPI uses the term alternative data to refer to any data not collected through traditional field collection procedures by CPI staff, including third-party datasets, corporate data, and data collected through web scraping or retailer APIs. This paper reviews how the CPI program is adapting to work with alternative data, then discusses the three main sources of alternative data under consideration by the CPI, describing the research and other steps taken to date for each source.
Crystal G. Konny is the former Chief of the Branch of Consumer Prices at the Bureau of Labor Statistics (BLS). Brendan K. Williams is a Senior Economist in the Branch of Consumer Prices. Prior to his retirement in February 2020, David M. Friedman served as the BLS Associate Commissioner for Prices and Living Conditions. Any opinions and conclusions expressed herein are those of the authors and do not necessarily represent the view of the U.S. Bureau of Labor Statistics. We thank Matthew Shapiro, Katherine Abraham, Kate Sosnowski, Kelley Khatchadourian, Jason Ford, Lyuba Rozental, Mark Bowman, Craig Brown, Nicole Shepler, Malinda Harrell, John Bieler, Dan Wang, Brian Parker, Sarah Niedergall, Jenny FitzGerald, Paul Liegey, Phillip Board, Rob Cage, Ursula Oliver, Mindy McAllister, Bob Eddy, Karen Ransom, and Steve Paben for their contributions. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.