Deep Research on a Loop: Using AI Agents to Construct Economic Datasets
Constructing datasets from primary sources is one of the costliest tasks in empirical economics. We propose Deep Research on a Loop (DRIL), a methodology that uses AI agents to assemble datasets from publicly available sources. DRIL applies a fixed research instrument across a mapped unit space (e.g., countries by years), with a two-stage architecture that separates design from implementation. The instrument specifies variables and coding rules, an evidence policy governs sources and citations, and data quality mechanisms track gaps and uncertainty explicitly. We exercise DRIL on a 2025 update of the Global Tax Expenditures Database for eight Latin American and Caribbean countries. The run produces 129 sources and 136 evidence records, covering 22 qualitative fields fully and 6 quantitative estimate types with documented gaps. Its cost is that of a standard LLM subscription, comparable to a few hours of research-assistant work. We argue that even partial automation of dataset construction can shift the production function of empirical economics.
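The loop described in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: every name in it (`Instrument`, `EvidenceRecord`, `run_loop`, `stub_research`, the variable names, and the example URL) is hypothetical, and the agent call is replaced by a stub. It only illustrates the idea of applying one fixed instrument to every cell of a country-by-year unit space, while an evidence policy (here assumed to be "no source, no value") records gaps explicitly instead of dropping them.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Optional


@dataclass(frozen=True)
class Instrument:
    """Fixed research instrument: the variables to code for every unit (hypothetical)."""
    variables: tuple[str, ...]


@dataclass
class EvidenceRecord:
    """One coded value for one unit, with the source that supports it."""
    unit: tuple[str, int]          # (country, year)
    variable: str
    value: Optional[str]
    source_url: Optional[str]

    @property
    def is_gap(self) -> bool:
        # Assumed evidence policy: a value without a cited source counts as a gap.
        return self.value is None or self.source_url is None


# A research function takes (unit, variable) and returns (value, source_url).
ResearchFn = Callable[[tuple[str, int], str], tuple[Optional[str], Optional[str]]]


def run_loop(instrument: Instrument, countries: list[str], years: list[int],
             research: ResearchFn) -> list[EvidenceRecord]:
    """Apply the same instrument to every cell of the mapped unit space."""
    records = []
    for unit in product(countries, years):
        for variable in instrument.variables:
            value, source_url = research(unit, variable)
            records.append(EvidenceRecord(unit, variable, value, source_url))
    return records


def stub_research(unit: tuple[str, int], variable: str):
    """Stand-in for an AI agent call; returns no evidence for one variable."""
    if variable == "revenue_forgone":
        return None, None
    return "yes", "https://example.org/source"


instrument = Instrument(variables=("has_te_report", "revenue_forgone"))
records = run_loop(instrument, ["AR", "BR"], [2024, 2025], stub_research)
gaps = [r for r in records if r.is_gap]
print(f"{len(records)} records, {len(gaps)} documented gaps")
```

Keeping gaps as first-class records, rather than omitting them, is what would let a run report coverage per field in the way the abstract does.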
Santiago Afonso, Sebastian Galiani, Ramiro H. Gálvez, and Raul A. Sosa, "Deep Research on a Loop: Using AI Agents to Construct Economic Datasets," NBER Working Paper 35188 (2026), https://doi.org/10.3386/w35188.