Artificial Writing and Automated Detection
Artificial intelligence (AI) tools are increasingly used to produce written deliverables. This has created demand for distinguishing human-written text from AI-generated text at scale, for example to verify that assignments were completed by students or that product reviews were written by actual customers. A decision-maker aiming to deploy a detector in practice must consider two key statistics: the False Negative Rate (FNR), the proportion of AI-generated text that is falsely classified as human-written, and the False Positive Rate (FPR), the proportion of human-written text that is falsely classified as AI-generated. We evaluate three leading commercial detectors (Pangram, OriginalityAI, and GPTZero) and an open-source detector (RoBERTa) on how well they minimize these error rates, using a large corpus spanning genres, document lengths, and generating models. The commercial detectors outperform the open-source one, with Pangram achieving near-zero FNR and FPR that remain robust across models, threshold rules, ultra-short passages ("stubs", ≤ 50 words), and "humanizer" tools. A decision-maker may weigh one type of error (Type I or Type II) more heavily than the other. To account for such preferences, we introduce a framework in which the decision-maker sets a policy cap, a detector-independent limit reflecting their tolerance for false positives or false negatives. We show that Pangram is the only tool that satisfies a strict cap (FPR ≤ 0.005) without sacrificing accuracy. This framework is especially relevant given the uncertainty surrounding how AI may be used at different stages of writing, where certain uses (e.g., grammar correction) may be encouraged yet difficult to separate from other uses.
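To make these definitions and the cap precise, one can write them as follows; the notation (detector score $s(x)$, decision threshold $\tau$, cap level $\alpha$) is an illustrative sketch rather than taken from the paper:

$$\mathrm{FNR}(\tau) = \Pr\!\left(s(x) < \tau \mid x \text{ is AI-generated}\right), \qquad \mathrm{FPR}(\tau) = \Pr\!\left(s(x) \ge \tau \mid x \text{ is human-written}\right),$$

and the policy cap restricts the admissible thresholds, so that the decision-maker selects

$$\tau^{\star} \in \arg\min_{\tau}\; \mathrm{FNR}(\tau) \quad \text{subject to} \quad \mathrm{FPR}(\tau) \le \alpha,$$

with $\alpha = 0.005$ corresponding to the strict cap discussed above.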