Scanner Data and Price Indexes
NBER Reporter: Fall 2000
The topic of the most recent NBER Conference on Research in Income and Wealth, which took place in Arlington, Virginia, on September 15-16, was "Scanner Data and Price Indexes." Robert C. Feenstra, NBER and University of California, Davis, and Matthew D. Shapiro, NBER and University of Michigan, organized this program:
The Consumer Price Index (CPI) is the nation's primary measure of the price change in consumer goods and services. To produce the CPI, Bureau of Labor Statistics (BLS) staff observe prices in stores and other retail outlets, thus tracking the prices of samples of items in the various categories of consumer spending. In some of those categories, all or virtually all of the items have a manufacturer-supplied identifier, known as a Universal Product Code (UPC). Manufacturers print UPCs on their products in bar code format so that computer scanners can read them easily. Retailers assign prices by UPC for scanning at the checkout and to manage their inventories. Consequently, the retailers have computerized records of the prices and number of units they sell; these records are commonly called scanner data. Richardson explores the ways that the BLS can use scanner data in place of the price data its staff collect in stores to calculate the CPI. Specifically, his paper is a progress report on a major CPI program initiative to construct scanner-based test indexes for breakfast cereal in the New York metropolitan area.
The monthly Retail Price Index is the main domestic measure of consumer inflation in the United Kingdom and is one of the highest profile outputs produced by the Office for National Statistics. The quality of the index depends significantly on how representative the sample of retail outlets used to monitor prices is, and on the choice of items for which prices are collected. Fenwick and Ball look at the scope for enhancing the quality of the index by using scanner data as a benchmark for checking how representative the achieved sample is. They then consider, in the context of traditional data collection methods, possible actions to improve the index by exploiting information available from scanner data.
Lowe and Ruscher compare two different approaches -- the current practice at Statistics Canada and variants using scanner data -- for calculating price indexes for televisions. The authors examine the process of evaluating quality change for televisions over the past nine years and identify the main issues for improvement. They also look at the results based on using scanner data from 1997-9. None of the ways of handling the data seems entirely satisfactory to the authors without additional detailed examination of the micro data. Thus, they conclude that a practical use of scanner data will involve using only part of it.
Chevalier, Kashyap, and Rossi examine the retail and wholesale prices of a large supermarket chain in Chicago over a seven-and-a-half-year period. They show that prices tend to fall during the seasonal peak in demand for a product and that changes in retail margins explain some of those price changes. Thus, their results add to the growing body of evidence that markups are countercyclical. Manufacturer behavior plays a more limited role in the countercyclical nature of prices.
Shapiro and Feenstra examine high frequency (weekly) data on canned tuna to determine whether such data can be used to construct meaningful price indexes. The authors construct two different types of weekly price indexes. The first -- a fixed-base index -- compares each week in 1993 to the modal price in 1992, using as weights the average 1992 sales at the modal price. This fixed-base Laspeyres index corresponds to that traditionally used in the CPI. The second type of index, a chained formula, updates the weights continuously and cumulates period-by-period changes in the price indexes to get long-term changes. Shapiro and Feenstra find that the difference between these two indexes is large: the chained Törnqvist index has a pronounced upward bias for some regions of the United States. One explanation for this pattern is that consumers are purchasing goods for inventory accumulation. The authors therefore investigate the extent to which the weekly purchases of tuna are consistent with inventory behavior and find some statistical support for this hypothesis.
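The contrast between the two index types can be illustrated with a minimal Python sketch. It is not the authors' code or data: the two goods, four weeks, prices, and quantities are hypothetical, and the fixed-base index here simply uses the first week as its base rather than the 1992 modal price and average 1992 sales described above. The function names are likewise assumptions made for illustration.

```python
import math

def fixed_base_laspeyres(prices, quantities):
    """Cost of the period-0 basket at each period's prices, relative to period 0."""
    p0, q0 = prices[0], quantities[0]
    base_cost = sum(p * q for p, q in zip(p0, q0))
    return [sum(p * q for p, q in zip(pt, q0)) / base_cost for pt in prices]

def expenditure_shares(p, q):
    """Share of total spending going to each good in one period."""
    total = sum(pi * qi for pi, qi in zip(p, q))
    return [pi * qi / total for pi, qi in zip(p, q)]

def chained_tornqvist(prices, quantities):
    """Cumulate period-to-period Tornqvist links, updating weights every period."""
    index = [1.0]
    for t in range(1, len(prices)):
        s_prev = expenditure_shares(prices[t - 1], quantities[t - 1])
        s_curr = expenditure_shares(prices[t], quantities[t])
        # Log price relatives weighted by the average of adjacent-period shares.
        log_link = sum(
            0.5 * (a + b) * math.log(pc / pp)
            for a, b, pc, pp in zip(s_prev, s_curr, prices[t], prices[t - 1])
        )
        index.append(index[-1] * math.exp(log_link))
    return index

# Hypothetical weekly data for two goods: good A goes on sale in weeks 1-2,
# sells most heavily in the last sale week, then returns to its regular price.
prices = [[1.0, 1.0], [0.5, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [20, 10], [60, 10], [10, 10]]

laspeyres = fixed_base_laspeyres(prices, quantities)
tornqvist = chained_tornqvist(prices, quantities)
print(laspeyres[-1], tornqvist[-1])
```

In this made-up example, prices end where they started, so the fixed-base index returns to 1.0, while the chained index ends at about 1.09. The price increase at the end of the sale receives more weight than the earlier price cut because purchases are concentrated in the last sale week.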
Barsky, Bergen, Dutta, and Levy investigate the size of markups for nationally branded products sold in the U.S. retail grocery industry. They treat the price of the comparable private label product as an upper bound for the marginal costs faced by the branded manufacturer. Using scanner data from a large midwestern grocery chain, they estimate the markup ratios for over 200 products in 19 categories. These data include not only the prices and quantities sold by UPC, but also the retailers' margins on each product, allowing the authors to measure the markup ratios for national brands based on wholesale prices. The authors find that markup ratios measured this way range from 2.5 for crackers and 2.3 for analgesics to 1.2 for canned tuna; the majority of markups range from 1.4 to 1.7. The authors also find that retailers' markups are generally lower for nationally branded products than for private labels. The net effect is that markup ratios measured using only retail price data will understate the markups for nationally branded products.
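The arithmetic behind that last point can be sketched in a few lines of Python. This is an illustration only: the prices and margins are hypothetical, and the sketch assumes the retail margin is expressed as a fraction of the retail price, which the summary above does not specify.

```python
def wholesale_price(retail_price, retail_margin):
    """Back out the wholesale price, assuming margin is a share of the retail price."""
    return retail_price * (1.0 - retail_margin)

def markup_ratio(branded_price, private_label_price):
    """Branded price over private-label price (the private label proxies marginal cost)."""
    return branded_price / private_label_price

# Hypothetical product pair: the retailer earns a lower margin on the national brand.
brand_retail, brand_margin = 3.00, 0.20
label_retail, label_margin = 2.00, 0.40

retail_ratio = markup_ratio(brand_retail, label_retail)
wholesale_ratio = markup_ratio(
    wholesale_price(brand_retail, brand_margin),
    wholesale_price(label_retail, label_margin),
)
print(retail_ratio, wholesale_ratio)
```

With these made-up numbers the retail-price ratio is 1.5 while the wholesale-price ratio is about 2.0. Whenever the retailer's margin on the national brand is the lower of the two, the ratio based on retail prices alone will be the smaller one.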
Petrin and Goolsbee examine the welfare gains attributable to the introduction of Direct Broadcast Satellite, the alternative to cable television. Using micro data on the television viewing habits of 35,000 people and the prices and characteristics of cable companies throughout the nation, they explore the efficiency gains from using consumer-level relative to market-level data as well as the gains from observing data on utilization. Their results suggest that the introduction of home satellites created a significant amount of consumer surplus for those who adopted the technology. At current prices, the own-price elasticity of satellites is well above one, as is the cross-price elasticity of satellites with respect to the price of cable. Interestingly, the estimated price elasticity for cable is significantly less than one, suggesting that the threat of regulation may have kept prices below what cable companies would charge in an unregulated market.
Within the next few years, a number of the best-selling U.S. prescription pharmaceuticals -- Prilosec, Prozac, Pepcid, and Claritin, for example -- are likely to face patent expiration. Will switches from prescription-only to nonprescription, over-the-counter status (Rx to OTC) occur? If so, what will be their effects on prices and utilization? And does the Rx to OTC switch significantly mitigate the effects of Rx patent expiration? To answer such questions, Berndt, Ling, Kyle, and Finkelstein focus on three main sets of policy instruments: 1) pricing and marketing strategies by branded pioneer drug manufacturers on their Rx drugs pre- and post-patent expiration, including both traditional physician-oriented marketing and direct-to-consumer marketing; 2) the impact of generic entry on the price, utilization, and revenues of the molecule post-patent expiration; and 3) the effects of Rx to OTC switches on cannibalization of same-brand Rx sales, and on total brand sales.
When constructing a price index that truly measures the cost of living, one needs to account for the welfare effects of the changing product mix over time. Many past studies ignored this issue and computed indexes using a unit-value approach. In this paper, Bradley establishes a method for generating a price index that accounts for the welfare effects of product exit and entry. He applies this method to cereal purchases in New York and compares it to the unit value method. As applied to cereal, Bradley's index is most often below the index generated with unit values.
The CPI Commission found that the current CPI overstates the cost of living by about 1.1 percentage points, and its report pointed to improper treatment of new products and quality changes as major causes of this bias. Nevo considers the construction of an alternative price index based on an estimated demand system. In principle, this method could produce a price index that accounts for the introduction of new products and for quality changes in existing products. Using estimates of a brand-level demand system for ready-to-eat cereal, Nevo calculates that, depending on the interpretation of the demand estimates, his price index can range from a 35 percent increase over the five years examined to a 0.6 percent decrease.
Silver and Heravi consider three approaches to estimating quality-adjusted price changes based on scanner data: the dummy variable approach from a hedonic regression; a superlative or exact hedonic index (SEHI) approach; and a matching technique. The dummy variable approach has been used to provide independent estimates of quality changes. However, the availability of scanner data provides an opportunity to use data on the prices (unit values), volumes, and quality characteristics of a much wider range of transactions, and to consider less restrictive methods than the dummy variable approach. The authors also consider the practical compilation of CPIs when quality adjustment is necessary to ensure that quality differences do not mar the price comparison of the new variety with the old variety of products.
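For readers unfamiliar with the dummy variable approach, a minimal Python sketch follows. It is not Silver and Heravi's specification: it assumes a log-linear hedonic regression with a single quality characteristic (a hypothetical "size" variable) and one period dummy, and it uses synthetic data generated from known coefficients so that the estimate can be checked. The helper names are assumptions made for illustration.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic data (hypothetical): log price = 4.0 + 0.05 * size - 0.10 * period.
# The product mix shifts toward larger sizes in period 1.
models = [(20, 0), (25, 0), (32, 0), (25, 1), (32, 1), (36, 1)]
log_prices = [4.0 + 0.05 * size - 0.10 * period for size, period in models]

# Regress log price on a constant, the characteristic, and the period dummy.
rows = [[1.0, float(size), float(period)] for size, period in models]
intercept, size_coef, delta = ols(rows, log_prices)
quality_adjusted_change = math.exp(delta) - 1.0

def mean_price(period):
    """Simple average price of the varieties sold in one period."""
    prices = [math.exp(lp) for lp, (_, t) in zip(log_prices, models) if t == period]
    return sum(prices) / len(prices)

unadjusted_ratio = mean_price(1) / mean_price(0)
print(quality_adjusted_change, unadjusted_ratio)
```

In this constructed example the coefficient on the period dummy recovers the built-in value of -0.10, a quality-adjusted price decline of roughly 9.5 percent, even though the simple average price rises by roughly 17 percent because the varieties sold in the second period are larger.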
Hawkes and Piotrowski claim that the shift to the more accurate scanner data in marketing research offers opportunities for significant quality improvements in official price index construction -- for comparing prices across both time and countries. In addition to improving the quality of measurement, these data provide an opportunity for the use of hedonic analysis of detailed product characteristics. The authors discuss these improvements and present year-to-year price index trends for an entire CPI "item stratum" using scanner data. They conclude with a discussion of data aggregation issues in CPI construction.
These papers and their discussions will be published by the University of Chicago Press in an NBER Conference Volume.