When a user shares multi-dimensional data about themselves with a firm, the firm learns about the correlations of different dimensions of user data. We incorporate this type of learning into a model of a data market in which a firm acquires data from users with privacy concerns. User data is multi-dimensional, and each user can share no data, only non-sensitive data, or their full data with the firm. As the firm collects more data and becomes better at drawing inferences about a user’s privacy-sensitive data from their non-sensitive data, the share of new users who share no data (“digital hermits”) grows. At the same time, the share of new users who share their full data also grows. The model therefore predicts a polarization of users’ data sharing choices away from non-sensitive data sharing to no sharing and full sharing.
We thank seminar and conference participants at the 2022 MaCCI Summer Institute in Competition Policy and USTC-UIUC. We also thank Zheng Gong for excellent research assistance. All mistakes are our own. Research support was provided by the Social Sciences and Humanities Research Council of Canada. Potential conflicts of interest are listed under "additional disclosures". The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
Please see the authors' disclosure statements on the NBER website and on their home pages. Avi Goldfarb's disclosure notes that he has served as an expert on matters of competition and privacy in the ad tech space.
Catherine Tucker
Please see this website for my list of current disclosures.