RCTs to Scale: Comprehensive Evidence from Two Nudge Units
Nudge interventions have quickly expanded from academic studies to broader implementation by so-called Nudge Units in governments. This provides an opportunity to compare the impact of nudges in research studies versus at scale. We assemble a unique data set of 126 RCTs covering over 23 million individuals, including all trials run by two of the largest Nudge Units in the United States. We compare these trials to a sample of nudge trials published in academic journals from two recent meta-analyses. In papers published in academic journals, the average impact of a nudge is very large: an 8.7 percentage point take-up effect, a 33.5% increase over the average control. In the Nudge Unit trials, the average impact is still sizable and highly statistically significant, but smaller at 1.4 percentage points, an 8.1% increase. We consider five potential channels for this gap: statistical power, selective publication, academic involvement, and differences in trial features and in nudge features. Publication bias in the academic journals, exacerbated by low statistical power, can account for the full difference in effect sizes. Academic involvement does not account for the difference. Different features of the nudges, such as in-person versus letter-based communication, likely reflecting institutional constraints, can partially explain the different effect sizes. We conjecture that larger sample sizes and institutional constraints, which play an important role in our setting, are relevant in other at-scale implementations. Finally, we compare these results to the predictions of academics and practitioners. Most forecasters overestimate the impact of the Nudge Unit interventions, though nudge practitioners are almost perfectly calibrated.
We are very grateful to the Office of Evaluation Sciences and Behavioral Insights Team North America for supporting this project and for countless suggestions and feedback. We thank Johannes Abeler, Isaiah Andrews, Oriana Bandiera, Shlomo Benartzi, John Beshears, David Card, Benjamin Enke, Etan Green, Eric Johnson, Maximilian Kasy, David Laibson, George Loewenstein, Rachael Meager, Katherine Milkman, Gautam Rao, Adam Sacarny, Cass Sunstein, Richard Thaler, Eva Vivalt, Richard Zeckhauser and participants in seminars at BEAM 2020, ideas42, Harvard University, the LSE, the University of Chicago, the University of Pittsburgh, University of California, Berkeley, and the University of Zurich for helpful comments. We are grateful to Margaret Chen and Woojin Kim and a team of undergraduate research assistants at UC Berkeley for exceptional research assistance. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
Elizabeth Linos was employed by one of the two organizations that provided data for this project, three years prior to the launch of the project.