Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments
Factorial designs are widely used for studying multiple treatments in one experiment. While t-tests based on the “long” model (which includes all interactions) provide valid inferences, t-tests based on the “short” model (which ignores interactions) yield higher power if interactions are zero, but incorrect inferences otherwise. Of 27 factorial experiments published in top-5 journals (2007–2017), 19 use the short model. After including all interactions, over half of their results lose significance. Modest local power improvements over the long model are possible, but at the cost of lower power for most values of the interaction. If interactions are not of interest, leaving the interaction cells empty yields valid inferences and global power improvements.
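The trade-off between the short and long models can be illustrated with a small Monte Carlo sketch. This is not the paper's code. It assumes a balanced 2x2 design, standard normal errors, homoskedastic standard errors, and a 1.96 critical value. It takes the effect of the first treatment alone (with the second treatment off) as the estimand. The names `ols_t`, `rejection_rates`, `b1`, and `b12` are hypothetical and chosen for illustration only.

```python
import numpy as np

def ols_t(X, y):
    """Return OLS coefficients and homoskedastic t-statistics."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, beta / se

def rejection_rates(b1, b12, n=400, reps=2000, crit=1.96, seed=0):
    """Share of simulations in which each model rejects H0: effect of T1 = 0.

    b1 is the effect of T1 when T2 is off; b12 is the interaction.
    n must be divisible by 4 (balanced 2x2 design, assumed here).
    """
    rng = np.random.default_rng(seed)
    t1 = np.tile([0.0, 0.0, 1.0, 1.0], n // 4)
    t2 = np.tile([0.0, 1.0, 0.0, 1.0], n // 4)
    ones = np.ones(n)
    X_short = np.column_stack([ones, t1, t2])          # ignores interaction
    X_long = np.column_stack([ones, t1, t2, t1 * t2])  # includes interaction
    rej_short = 0
    rej_long = 0
    for _ in range(reps):
        y = b1 * t1 + b12 * t1 * t2 + rng.standard_normal(n)
        _, t_s = ols_t(X_short, y)
        _, t_l = ols_t(X_long, y)
        rej_short += int(abs(t_s[1]) > crit)
        rej_long += int(abs(t_l[1]) > crit)
    return rej_short / reps, rej_long / reps

if __name__ == "__main__":
    # No interaction, true effect present: compare power.
    print("power (short, long):", rejection_rates(b1=0.2, b12=0.0))
    # Interaction present, true effect zero: compare false rejection rates.
    print("size  (short, long):", rejection_rates(b1=0.0, b12=0.5))
```

Under these assumptions, the first call should show the short model rejecting more often than the long model when the interaction is zero. The second should show the short model rejecting a true null far more often than the nominal 5 percent, because its coefficient on the first treatment absorbs part of the interaction.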
Digital Object Identifier (DOI): 10.3386/w26562