Non-Randomly Sampled Networks: Biases and Corrections

Chih-Sheng Hsieh; Stanley I. M. Ko; Jaromír Kovářík; Trevon Logan

doi:10.3386/w25270

Working Paper 25270

Issue Date November 2018

Revision Date July 2022

This paper analyzes statistical issues arising from networks based on non-representative samples of the population. We first characterize, analytically and numerically, the biases in both network statistics and estimates of network effects under non-random sampling. Sampled network data yield systematically biased estimates of the properties of population networks and suffer from non-classical measurement error when used as regressors. Apart from the sampling rate and the elicitation procedure, these biases depend in a nontrivial way on which subpopulations are missing with higher probability. We propose a methodology, adapting post-stratification weighting approaches to networked contexts, that enables researchers to recover several network-level statistics and reduce the biases in the estimated network effects. The advantages of the proposed methodology are that it can be applied to network data collected via both designed and non-designed sampling procedures, does not require one to assume any network formation model, and is straightforward to implement. We apply our approach to two widely used network data sets and show that accounting for the non-representativeness of the sample dramatically changes the results of regression analysis.

We are grateful to Isaiah Andrews, Aureo de Paula, Marco van der Leij, and participants at numerous seminars for comments and suggestions. Jaromír Kovářík acknowledges financial support from Ministerio de Economía y Competitividad and Fondo Europeo de Desarrollo Regional (PID2019-108718GB-I00, PID2019-106146GB-I00), the Basque Government (IT1461-22), and the Grant Agency of the Czech Republic (21-22796S). The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.

Chih-Sheng Hsieh, Stanley I. M. Ko, Jaromír Kovářík, and Trevon Logan, "Non-Randomly Sampled Networks: Biases and Corrections," NBER Working Paper 25270 (2018), https://doi.org/10.3386/w25270.

- November 14, 2018
- June 26, 2019
