Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments

Karthik Muralidharan; Mauricio Romero; Kaspar Wüthrich

doi:10.3386/w26562

Working Paper 26562

Issue Date December 2019

Revision Date September 2020

Factorial designs are widely used for studying multiple treatments in one experiment. While t-tests based on the “long” model (including main and interaction effects) provide valid inferences against “business-as-usual” counterfactuals, “short” model t-tests (that ignore interactions) yield higher power if the interactions are zero, but incorrect inferences otherwise. Out of 27 factorial experiments published in top-5 journals in 2007–2017, 19 use the short model. We reanalyze these experiments, and show that over half of their published results lose significance when interactions are included. We show that testing the interactions using the long model and presenting the short model if the interactions are not significantly different from zero leads to incorrect inference due to the implied data-dependent model selection. Based on recent econometric advances, we show that local power improvements over the long model are possible. However, if the main effects are of primary interest, leaving the interaction cells empty yields valid inferences and global power improvements. In addition, the sample size needed to detect interactions is substantially larger than that required to detect main effects, resulting in most experiments being under-powered to detect interactions. Thus, using factorial designs to explore whether interactions are meaningful can be problematic because interaction estimates are likely to considerably overestimate the magnitude of the true effect conditional on being significant.
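
The contrast between the long and short models can be illustrated with a small Monte Carlo sketch. This is not the authors' code. It assumes a balanced 2x2 factorial design, standard normal errors, homoskedastic OLS standard errors, and a true main effect of zero for the first treatment. The function names (`ols_t`, `rejection_rates`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def ols_t(X, y, k):
    """Homoskedastic OLS t-statistic for the k-th coefficient."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta[k] / np.sqrt(sigma2 * xtx_inv[k, k])


def rejection_rates(b12, n=400, reps=2000, seed=0):
    """Share of simulations rejecting H0: (main effect of T1) = 0 at the 5% level.

    The true main effect of T1 relative to the pure control group is zero;
    b12 is the assumed interaction effect. Returns (long_rate, short_rate).
    """
    rng = np.random.default_rng(seed)
    crit = 1.96  # two-sided 5% normal critical value
    rej_long = 0
    rej_short = 0
    for _ in range(reps):
        t1 = rng.integers(0, 2, n)  # first treatment indicator
        t2 = rng.integers(0, 2, n)  # second treatment indicator
        y = b12 * t1 * t2 + rng.standard_normal(n)
        ones = np.ones(n)
        x_long = np.column_stack([ones, t1, t2, t1 * t2])  # with interaction
        x_short = np.column_stack([ones, t1, t2])          # interaction omitted
        rej_long += abs(ols_t(x_long, y, 1)) > crit
        rej_short += abs(ols_t(x_short, y, 1)) > crit
    return rej_long / reps, rej_short / reps


if __name__ == "__main__":
    for b12 in (0.0, 0.5):
        long_rate, short_rate = rejection_rates(b12)
        print(f"b12={b12}: long={long_rate:.3f}, short={short_rate:.3f}")
```

Under these assumptions, the short-model coefficient on the first treatment absorbs roughly half of the interaction effect, so with a nonzero `b12` its rejection rate should sit well above the nominal 5% level, while the long-model rate should stay close to it. When `b12` is zero, both rates should be near 5%.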

We are grateful to Isaiah Andrews, Tim Armstrong, Prashant Bharadwaj, Arun Chandrasekhar, Clement de Chaisemartin, Gordon Dahl, Stefano DellaVigna, Esther Duflo, Graham Elliott, Markus Goldstein, Macartan Humphreys, Hiroaki Kaido, Lawrence Katz, Michal Kolesar, Soonwoo Kwon, Adam McCloskey, Craig McIntosh, Rachael Meager, Paul Niehaus, Ben Olken, Gautam Rao, Andres Santos, Jesse Shapiro, Diego Vera-Cossio, and several seminar participants for comments and suggestions. We are also grateful to the authors of the papers we reanalyze for answering our questions and fact-checking that their papers are characterized correctly. Sameem Siddiqui provided excellent research assistance. All errors are our own. Financial support from the Asociación Mexicana de Cultura, A.C. is gratefully acknowledged by Romero. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.
