Emptying the Tank: Getting the Most Out of Limited Data
All empirical researchers know that having more sources of variation in a dataset is valuable. What is not known is how valuable, and whether the marginal value of adding another source of variation diminishes or increases. This note provides explicit answers to these questions. It measures the value of a dataset by the number of independent questions the data can potentially answer, and provides a surprisingly simple and useful rule that tells researchers not only when they have "emptied the tank" of their data's valuable implications, but also what the marginal value of further data collection is. An illustration using home heating costs is provided.
Thanks to A. Magesan, A. Jacobsen, J. W. Laliberte, K. Head, F. Mayer, and A. Whalley for helpful comments, and J. Taylor-McGregor for help with the title. The views expressed herein are those of the author and do not necessarily reflect the views of the National Bureau of Economic Research.