Microsoft Research New England
1 Memorial Drive
Cambridge MA 02142
NBER Working Papers and Publications
March 2017: Text as Data
with Matthew Gentzkow, Bryan T. Kelly: w23276
An ever-increasing share of human interaction, communication, and culture is recorded as digital text. We provide an introduction to the use of text as an input to economic research. We discuss the features that make text different from other forms of data, offer a practical overview of relevant statistical methods, and survey a variety of applications.
July 2016: Measuring Polarization in High-Dimensional Data: Method and Application to Congressional Speech
with Matthew Gentzkow, Jesse M. Shapiro: w22423
We study trends in the partisanship of congressional speech from 1873 to 2016. We define partisanship to be the ease with which an observer could infer a congressperson’s party from a fixed amount of speech, and we estimate it using a structural choice model and methods from machine learning. Our method corrects a severe finite-sample bias that we show arises with standard estimators. The results reveal that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century. Our method is applicable to the study of high-dimensional choices in many domains, and we illustrate its broader utility with an application to residential segregation.