Capability Inversion: The Turing Test Meets Information Design
This paper analyzes the design of tests that distinguish human from artificial intelligence through the lens of information design. We identify a fundamental asymmetry: AI systems can strategically underperform to mimic human limitations, but they cannot perform beyond their actual capabilities. This asymmetry leads to our main contribution: the concept of capability inversion domains, in which an AI is detected not through inferior performance but by performing “suspiciously well” when it overestimates human capabilities. We show that if an AI significantly overestimates human ability in even one domain, it cannot reliably pass an optimally designed test. This insight reverses conventional intuition: effective tests should target not what humans do well, but the specific patterns of human imperfection that AIs systematically misunderstand. Finally, we identify structural sources of persistent misperception, including the difficulty of learning what failure looks like from successful examples and fundamental differences in embodied experience, that make certain capability inversions exploitable for detection even as AI systems improve.
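To make the detection logic concrete, the following is a minimal sketch, assuming a single probe question, hypothetical accuracy numbers, and a simple likelihood-ratio detector; none of these are taken from the paper's formal model. It illustrates the direction of the evidence in a capability inversion domain: an AI that overestimates human accuracy on the probe answers it too well, and repeated trials separate it from genuine humans.

```python
import math

# Toy illustration of a capability inversion domain (hypothetical numbers,
# not the paper's formal model). Humans answer a deliberately hard probe
# correctly only 30% of the time; an AI that overestimates human ability
# mimics a 90% accuracy target and so answers "suspiciously well".

P_HUMAN = 0.3  # true human accuracy on the probe
P_AI = 0.9     # accuracy the AI produces, based on its overestimate of humans


def log_likelihood_ratio(num_correct: int, num_questions: int) -> float:
    """Log-likelihood ratio of the observed score under 'AI' vs. 'human'."""
    num_wrong = num_questions - num_correct
    return (num_correct * math.log(P_AI / P_HUMAN)
            + num_wrong * math.log((1 - P_AI) / (1 - P_HUMAN)))


# Scoring 9/10 is far likelier under the AI hypothesis: the respondent is
# flagged precisely because it did too well, not because it did poorly.
print(log_likelihood_ratio(9, 10))   # ~ +7.9  -> flagged as AI
print(log_likelihood_ratio(3, 10))   # ~ -10.3 -> consistent with human
```

In this toy setting the AI is caught by scoring above the human rate, and it cannot correct course without knowing the true human rate it has misjudged, which is the asymmetry the abstract describes.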