Privacy and Data-Based Research
What can we, as users of microdata, formally guarantee to the individuals (or firms) in our dataset, regarding their privacy? We retell a few stories, well-known in data-privacy circles, of failed anonymization attempts in publicly released datasets. We then provide a mostly informal introduction to several ideas from the literature on differential privacy, an active literature in computer science that studies formal approaches to preserving the privacy of individuals in statistical databases. We apply some of its insights to situations routinely faced by applied economists, emphasizing big-data contexts.
For useful comments on an early draft, we thank Dan Benjamin, Avrim Blum, Hank Greely, Aleksandra Korolova, Frank McSherry, Kobbi Nissim, Ted O'Donoghue, Grant Schoenebeck, Moses Shayo, Latanya Sweeney, Kunal Talwar, and Jonathan Ullman. Ligett's work was supported in part by an NSF CAREER award (CNS-1254169), the US-Israel Binational Science Foundation (grant 2012348), the Charles Lee Powell Foundation, a Google Faculty Research Award, and a Microsoft Faculty Fellowship. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
Ori Heffetz & Katrina Ligett, 2014. "Privacy and Data-Based Research," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 75-98, Spring.