This conference is supported by Grant #2010-10-17 from the Alfred P. Sloan Foundation
Varian considers the problem of short-term time series forecasting (nowcasting) when there are more possible predictors than observations. His approach combines three Bayesian techniques: Kalman filtering, spike-and-slab regression, and model averaging. He illustrates this approach using search engine query data as predictors for consumer sentiment and gun sales.
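One of the three ingredients, Bayesian model averaging over subsets of predictors, can be illustrated with a small self-contained sketch. This is not Varian's implementation: the data are synthetic, brute-force enumeration of all subsets is feasible only for a handful of predictors (a spike-and-slab sampler is what scales to the many-predictors case), and BIC weights are used as a standard approximation to marginal likelihoods.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 candidate predictors, only columns 0 and 3 truly matter.
n, p = 30, 8
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

def bic(subset):
    """BIC of an OLS fit using only the columns in `subset`."""
    if subset:
        Xs = X[:, list(subset)]
        beta, *_ = np.linalg.lstsq(Xs, ytr := y, rcond=None)
        resid = ytr - Xs @ beta
    else:
        resid = y
    rss = float(resid @ resid)
    return n * np.log(rss / n) + len(subset) * np.log(n)

# Enumerate all 2^p models and weight each by exp(-BIC/2),
# a standard approximation to its marginal likelihood.
models = [s for r in range(p + 1) for s in itertools.combinations(range(p), r)]
bics = np.array([bic(s) for s in models])
w = np.exp(-(bics - bics.min()) / 2)
w /= w.sum()

# Posterior inclusion probability of each predictor: the total
# weight of the models that contain it.
incl = np.zeros(p)
for weight, s in zip(w, models):
    for j in s:
        incl[j] += weight

print(np.round(incl, 3))
```

With a strong signal, the inclusion probabilities of the two true predictors approach one while the spurious predictors are penalized by the complexity term, which is the same selection logic the spike-and-slab prior delivers at scale.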
Baye, De Los Santos, and Wildenbeest provide a data-driven overview of different platforms where consumers can search for books and booksellers, and show how the use of these platforms has shifted over time. They highlight a number of challenges and open agenda items related to observed data on consumer search, as well as prices of digital and physical books.
Most data sources used in economics, whether from the government or businesses, are typically available only after a substantial lag, at a high level of aggregation, and for variables that were specified and collected in advance. This hampers the effectiveness of real-time predictions. Wu and Brynjolfsson demonstrate how data from search engines like Google provide an accurate but simple way to predict future business activities. Applying their methodology to predict housing market trends, they find that a housing search index is strongly predictive of future housing market sales and prices. The use of search data produces out-of-sample predictions with a smaller mean absolute error than the baseline model, which uses conventional data but does not include any search data. The improvement in predictions using search terms is 7.1 percent over the baseline for future home sales and 4.6 percent for future housing prices. Furthermore, they find that their simple model using search frequencies beats the predictions made by experts from the National Association of Realtors by 23.6 percent for future U.S. home sales. They also demonstrate how these data can be used in other markets, such as laptop sales. In the near future, this type of "nanoeconomic" data can transform prediction in numerous markets, and thus business and consumer decision making.
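The kind of out-of-sample comparison reported above can be sketched in a few lines. This is an illustrative toy, not the authors' model: the data are synthetic, with a "search index" constructed to lead "sales" by one period, and both forecasts are simple rolling OLS regressions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic monthly data: a search index that leads sales by one period.
T = 120
search = rng.normal(size=T)
sales = np.empty(T)
sales[0] = 0.0
for t in range(1, T):
    sales[t] = 0.5 * sales[t - 1] + search[t - 1] + rng.normal(scale=0.3)

def one_step_mae(features):
    """Rolling out-of-sample one-step-ahead MAE of an OLS forecast."""
    errors = []
    for t in range(60, T):          # forecast period t from data through t-1
        Xtr = features[:t - 1]      # predictors dated t-1 and earlier ...
        ytr = sales[1:t]            # ... paired with the next period's sales
        beta, *_ = np.linalg.lstsq(Xtr, ytr, rcond=None)
        errors.append(abs(sales[t] - features[t - 1] @ beta))
    return float(np.mean(errors))

const = np.ones(T)
baseline = np.column_stack([const, sales])            # AR(1) only
augmented = np.column_stack([const, sales, search])   # AR(1) + search index

mae_base = one_step_mae(baseline)
mae_aug = one_step_mae(augmented)
improvement = 100 * (mae_base - mae_aug) / mae_base
print(f"MAE improvement from adding search data: {improvement:.1f}%")
```

Because the search series here genuinely leads sales, the augmented model's mean absolute error falls well below the baseline's; in real data the gain is smaller (the 7.1 and 4.6 percent figures above), but the accounting is the same.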
Online contract labor globalizes traditionally local labor markets, with platforms that enable employers, most of whom are in high-income countries, to more easily outsource tasks to contractors, primarily located in low-income countries. This market is growing rapidly; Agrawal, Horton, Lacetera, and Lyons provide descriptive statistics from one of the leading platforms, where the number of hours worked increased 55% from 2011 to 2012, with the 2012 total wage bill just over $360 million. The researchers outline three lines of inquiry in this market setting that are central to the broader digitization research agenda: 1) How will the digitization of this market influence the distribution of economic activity (the geographic distribution of work, the income distribution, and the distribution of work across firm boundaries)? 2) What are the magnitude and nature of information frictions in these digital market settings, as reflected by user responses to market design features (the allocation of visibility, investments in human capital acquisition, and machine-aided recommendations)? 3) How will the digitization of this market affect social welfare (for example, through increased efficiency in matching and production)? Agrawal, Horton, Lacetera, and Lyons draw upon economic theory as well as evidence from empirical research on online contract labor markets and other related settings to motivate and contextualize this research agenda.
Although revenue for recorded music has collapsed since the explosion of file sharing, results elsewhere suggest that the quality of new music has not suffered. One possible explanation is that digitization has allowed a wider range of firms to bring far more music to market using lower-cost methods of production, distribution, and promotion. Record labels have traditionally found it difficult to predict which albums will find commercial success, so many released albums fail while many nascent but unpromoted albums might have been successful. Forces raising the number of products released may allow consumers to discover more appealing choices if they can sift through the offerings. Digitization has promoted both Internet radio and a growing cadre of online music reviewers, providing alternatives to radio airplay as means for new product discovery. To explore this, Waldfogel assembles data on new works of recorded music released between 1980 and 2010, along with data on particular albums' sales, airplay on both traditional and Internet radio, and album reviews at Metacritic since 2000. First, he documents that despite a substantial drop in major-label album releases, the total quantity of new albums released annually has increased sharply since 2000, driven by independent labels and purely digital products. Second, increased product availability has been accompanied by a reduction in the concentration of sales in the top albums. Third, new information channels - Internet radio and online criticism - change the number and kinds of products about which consumers have information. Fourth, in the past dozen years, increasing numbers of albums find commercial success without substantial traditional airplay. Finally, albums from independent labels - which previously might not have made it to market - account for a growing share of commercially successful albums.
The security of sensitive individual data is a subject of indisputable importance. One of the major threats to sensitive data arises when one can link sensitive information and publicly available data. Komarova, Nekipelov, and Yakovlev demonstrate that even if the sensitive data are never publicly released, the point estimates from the empirical model estimated from the combined public and sensitive data may lead to a disclosure of individual information. Their theory builds on the work in a 2012 paper in which they analyze the individual disclosure that arises from releases of marginal empirical distributions of individual data. The disclosure threat in that case is posed by the possibility of a linkage between the released marginal distributions. Here, they analyze a different type of disclosure: they use the notion of the risk of statistical partial disclosure to measure the threat from inference on sensitive individual attributes from a released empirical model that uses data combined from public and private sources. As their main example, they consider a treatment effect model in which the treatment status of an individual constitutes sensitive information.
Online advertising offers unprecedented opportunities for measurement. A host of new metrics, clicks being the leading example, have become widespread in advertising science. New data and experimentation platforms open the door for firms and researchers to measure true causal effects of advertising on a variety of consumer behaviors, such as purchases. Lewis, Rao, and Reiley dissect the new metrics and methods currently used by industry researchers, attacking the question, "How hard is it to reliably measure advertising effectiveness?" They outline the questions that they think can be answered by current data and methods, those that they believe will be in play within five years, and those that they believe could not be answered with arbitrarily large and detailed data. They pay close attention to the advances in computational advertising that are not only increasing the impact of advertising, but also usefully shifting the focus from "who to hit" to "what do I get."
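The measurement challenge described above can be made concrete with a simulated randomized ad experiment. This is an illustrative sketch, not the authors' data or method: exposure is assigned at random, so the difference in purchase rates is an unbiased estimate of the causal effect, and the assumed rates (a 2% base purchase rate, a 10% relative lift) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Randomly assign 100,000 users to see (or not see) the ad campaign.
n = 100_000
exposed = rng.integers(0, 2, size=n).astype(bool)
base_rate = 0.020                      # purchase rate without ads (assumed)
lift = 0.002                           # true causal effect: 10% relative
purchase = rng.random(n) < np.where(exposed, base_rate + lift, base_rate)

# Difference in purchase rates and its standard error
# (two-sample difference of proportions).
p1, p0 = purchase[exposed].mean(), purchase[~exposed].mean()
effect = p1 - p0
se = np.sqrt(p1 * (1 - p1) / exposed.sum() + p0 * (1 - p0) / (~exposed).sum())
print(f"estimated lift: {effect:.4f} (s.e. {se:.4f})")
```

Even with 100,000 randomized users, the standard error here is of roughly the same order as the true effect, so an economically meaningful lift is barely distinguishable from noise. This statistical power problem is central to the question Lewis, Rao, and Reiley pose about how hard advertising effectiveness is to measure reliably.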
As businesses and consumers search, communicate, and transact online, firms gather more and more personal and financial information. On the one hand, all this information can enhance market efficiency and consumer surplus, as firms tailor products to buyers. On the other hand, there is increased risk of information loss, either by accident or through theft. What issues should be on the digital agenda with regard to information loss, and what data are available to underpin both business response and any policy approach? Mann reviews the situation and points out where we need more thought and more data. She looks at: 1) Various frameworks for analysis, such as "How should we model the information marketplace, particularly with regard to the benefits and costs of information collection, retention, and aggregation?" 2) Quantification and data: What is the evidence on the prevalence and nature of information loss, what are the costs of information loss, and how valuable is this information in the marketplace? 3) Market and policy response: What do we know about the efficacy of market versus other approaches to disciplining market participants, either to avoid loss or to remediate after information loss? Throughout, of particular interest is the international dimension of information loss. What issues arise when countries differ in their attitudes and policies toward information acquisition, aggregation, and retention and, importantly, toward disclosure of information lost?
In addition to the conference paper, the research was distributed as NBER Working Paper w19526, which may be a more recent version.
In a 2011 paper, Gentzkow and Shapiro use individual and aggregate data to evaluate the extent of ideological segregation in the consumption of online news. Using standard metrics of segregation, they find that ideological segregation of online news consumption is low in absolute terms, higher than the segregation of most offline news consumption, and significantly lower than the segregation of face-to-face interactions with neighbors, co-workers, or family members. Here they consider the structure of supply and demand that might give rise to the observed patterns. They present preliminary evidence on the structure of consumer preferences and supplier incentives from some simple structural models of news demand.
In addition to the conference paper, the research was distributed as NBER Working Paper w19675, which may be a more recent version.
Ideology and Online News
Understanding Media Markets in the Digital Age: Economics and Methodology
What Are We Not Doing When We're Online
Bayesian Variable Selection for Nowcasting Economic Time Series
Information Lost (Apologies to Milton)
Digitization and the Contract Labor Market: A Research Agenda
Copyright and the Profitability of Authorship: Evidence from Payments to Writers in the Romantic Period
Measuring the Effects of Advertising: The Digital Frontier
Searching for Physical and Digital Media: The Evolution of Platforms for Finding Books