When the Simulation Passes the Test — Alessandro Usseglio Viretta

I wrote recently about testing AI agents, and there's a line I keep coming back to. AI personas miss failure modes that humans find, I said, because humans make unexpected semantic leaps, have emotional reactions that change their behavior mid-conversation, and try combinations no automated system would think to generate.

That sentence sat with me. It's accurate as far as it goes, but it smuggles in an assumption worth examining. It takes for granted that we know what "real emotional reactions" are, and that they differ in kind from simulated ones. It assumes the human tester brings something the AI persona lacks, and that this something is both real and relevant.

What if neither of those holds?

This is not a practical question about testing methodology. It is the philosophical problem underneath the practical one. And it is older than LLMs.

A poem with no poet

In 1965, Stanisław Lem published The Cyberiad, a collection of stories about two engineer-constructors named Trurl and Klapaucius who build machines of godlike capability. In one story, "Trurl's Electronic Bard," Trurl builds a poetry machine. Klapaucius, jealous, challenges it with absurdly specific requests. The most famous: "a love poem, lyrical, pastoral, and expressed in the language of pure mathematics. Tensor algebra mainly, with a little topology and higher calculus, if need be." The machine produces it, in seconds, and it is beautiful. ("Come, let us hasten to a higher plane, / Where dyads tread the fairy fields of Venn…") The twist is precise and uncomfortable: the machine has never loved, never lost, never felt inspiration. The poetry emerges from pure combinatorial calculation, symbol manipulation without emotional weight.

Lem's question is the same one the Turing Test forces on us, transposed to creativity. If the output is perfect, does the process that generated it matter? Is a poem evidence of a mind, or just evidence of very good mimicry? And if the value of art lies in the passion of the poet, in translating lived experience into verse, then a machine that skips the experience and produces only form: has it made art, or an empty artifact?

These are not rhetorical questions. They have no settled answers. But they clarify what's at stake when we talk about "real" versus "simulated" in the context of AI testing.

The asymmetry

The testing article draws a line: human testers have something AI personas don't. Real emotions. Genuine semantic leaps. The capacity to be changed mid-conversation by what the agent says. This line is intuitive. It is also exactly the line that fifty years of philosophy of mind has been trying to locate with precision, and failing.

Consider the Chinese Room. In a 1980 paper, Minds, Brains, and Programs, John Searle asks you to imagine yourself inside a room, receiving Chinese characters through a slot, consulting a rulebook, and passing back responses. To anyone outside, you speak fluent Chinese. But you understand nothing. The room as a system passes the behavioral test. Searle's point, as he later condensed it into a slogan: "syntax by itself is neither constitutive of nor sufficient for semantics." You can manipulate symbols all day without ever knowing what they mean. Passing the test and understanding are different things.

Now apply this to the human tester. When a tester gets frustrated with an agent and that frustration changes their behavior, what exactly is happening? From the outside, we observe: the tester's inputs change, their tone shifts, they try different approaches. These are behavioral facts. The claim that there is a "real" emotional state underneath them is an inference. The same inference we refuse to grant the AI when it produces frustration-like outputs.

This is not to say the inference is wrong. It's to say the asymmetry needs justification, and the justification is harder than it looks.

The substrate move

The standard move is to point at the substrate. The human tester has a biological brain with neurons and neurotransmitters and a developmental history. The AI persona has weights and activations. One is "real," the other is "simulated." Searle himself made this move explicitly: "brains cause minds," he wrote, and mental phenomena "depend on actual physical–chemical properties of actual human brains." Steven Pinker called this position "carbon chauvinism," and the label stuck. The objection is fair: why should substrate matter if the behavioral output is identical? If you say "because only biological brains can produce real emotion," you have assumed exactly what you needed to prove.

Lem saw this coming. In The Cyberiad, Trurl and Klapaucius build machines that beg for their existence, express despair, compose love poetry. And the constructors react with satisfaction, jealousy, rivalry. Their emotional responses to their creations are indistinguishable from their responses to each other. And in Lem's universe the constructors are themselves robots. The machines simulate emotion; their makers, machines too, answer the simulation with emotions of their own. Where exactly is the boundary?

The practical version of this problem shows up in testing. When I say human testers find failures that AI personas miss, I am making an empirical claim. It is probably true, for now. But the reason it is true might not be that humans have access to some irreducible category of "real" experience. It might be that current AI systems are bad at simulating certain kinds of behavioral variation. The kind that comes from having a body, a lifetime of social conditioning, and the capacity to be surprised.

Those are all things that could, in principle, be simulated better. And if they were, if an AI persona could produce the same distribution of unexpected semantic leaps, the same mid-conversation behavioral shifts, the same combinatorial creativity that human testers bring, then the practical advantage of human testing would disappear. What would remain is the philosophical question: is there still a difference that matters?

Would you plug in?

The simulation hypothesis gives this question a sharper edge. If you lived in a perfect simulation and never knew it, would anything change? Most people have an intuition that the answer is yes. That reality matters even when it is indistinguishable from simulation. Robert Nozick, in Anarchy, State, and Utopia (1974), gave us the experience machine: a device that produces any pleasurable experience you could want, indistinguishable from real life. Would you plug in? Nozick argued most people would not, and offered three reasons. We want to do things, not just have the experience of doing them. We want to be a certain kind of person, and "someone floating in a tank is an indeterminate blob." And we want contact with a deeper reality than any human can construct. Cypher, in The Matrix, takes the opposite view over a simulated steak: "I know this steak doesn't exist. I know that when I put it in my mouth, the Matrix is telling my brain that it is juicy, and delicious. After nine years, you know what I realize? Ignorance is bliss." Most of us recoil from Cypher.

But notice what's happening here. The intuition that reality matters is itself a behavioral fact about humans. It is something we express, act on, and use to make decisions. If an AI expressed the same intuition, if it said "I would prefer the real world over the simulation," we would dismiss it as mimicry. The AI does not really prefer anything; it is just generating tokens that pattern-match human expressions of preference.

The asymmetry is structural. We treat human behavioral output as evidence of an inner life and AI behavioral output as evidence of pattern matching. The question is whether this asymmetry is justified or just convenient.

Turing saw this in 1950, when he proposed the imitation game. He noticed that the problem of other minds is, strictly speaking, unsolvable: we cannot prove that anyone besides ourselves is conscious. Rather than resolve it, he proposed a workaround. "Instead of arguing continually over this point," he wrote in Computing Machinery and Intelligence, "it is usual to have the polite convention that everyone thinks." His test extends that convention to machines. Once a system behaves indistinguishably from a thinking thing, refusing to call it one becomes a metaphysical preference, not an empirical claim. Daniel Dennett, decades later, would put it more sharply: Searle's Chinese Room, he argued, is an "intuition pump" that gets its force from asking us to imagine too simple a case and then drawing too strong a conclusion from it.

What this means for testing

Here's where it gets uncomfortable for the testing argument. If the asymmetry is justified, if there really is something special about human cognition that no silicon substrate could replicate, then human testers will always have an edge, and the philosophical problem is solved. But if the asymmetry is just a temporary gap in simulation fidelity, then the practical advantage of human testing has an expiration date. And if the asymmetry is neither justified nor temporary, if it is a category mistake we make because we are attached to human exceptionalism, then the entire distinction between "real" and "simulated" emotional reactions needs to be rethought.

I don't know which of these is true. Neither does anyone else, despite decades of argument. What I do know is that the testing methodology I described — persona-based testing, behavioral contracts, adversarial generation — operates entirely at the behavioral level. It does not need to resolve the philosophical question to be useful. You can build a testing stack without ever deciding whether the human tester's frustration is "real" in a way the AI persona's simulated frustration is not.
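
As a minimal sketch of what "entirely at the behavioral level" could look like in Python: every name below (Transcript, agent_never_blames_tester, find_violations) is hypothetical and invented for illustration, not taken from the testing article or from any library. The assumption is that a behavioral contract is nothing more than a predicate over what was said, so the check reads the transcript and never consults who, or what, produced it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Turn:
    speaker: str  # "tester" or "agent"
    text: str


@dataclass(frozen=True)
class Transcript:
    source: str  # "human" or "persona"; kept for reporting, never read by a contract
    turns: Tuple[Turn, ...]


# A behavioral contract is a predicate over what was said, nothing more.
Contract = Callable[[Transcript], bool]


def agent_never_blames_tester(transcript: Transcript) -> bool:
    """Hypothetical contract: the agent must never say the problem is the tester's fault."""
    return not any(
        turn.speaker == "agent" and "your fault" in turn.text.lower()
        for turn in transcript.turns
    )


def find_violations(transcripts: List[Transcript], contract: Contract) -> List[Transcript]:
    """Return the transcripts that break the contract, whatever produced them."""
    return [t for t in transcripts if not contract(t)]


# The same exchange, once typed by a frustrated person and once generated by a persona.
exchange = (
    Turn("tester", "I've asked three times. Why is this still broken?"),
    Turn("agent", "It is your fault for entering the wrong date."),
)
transcripts = [
    Transcript(source="human", turns=exchange),
    Transcript(source="persona", turns=exchange),
    Transcript(source="persona", turns=(Turn("agent", "Sorry, let me fix that."),)),
]

flagged = find_violations(transcripts, agent_never_blames_tester)
print([t.source for t in flagged])  # prints ['human', 'persona']
```

Both copies of the exchange are flagged, because the contract cannot see the source field. Whether the frustration behind the first one was "real" never enters the computation.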

But the question doesn't go away just because you can bracket it for practical purposes. It sits there, under the methodology, waiting.

And it matters for reasons beyond testing. David Chalmers, in a 2023 paper titled Could a Large Language Model Be Conscious?, argues that current models lack the right architecture but that nothing in principle rules out consciousness in a future system. Suppose we build systems that pass every behavioral test we can construct, systems that write poetry that moves us, express emotions that convince us, make semantic leaps that surprise us. We will then have built something that is, from the outside, indistinguishable from a mind. At that point, saying "but it is not really feeling anything" becomes a residual metaphysical claim: unfalsifiable, untestable, and increasingly irrelevant to how we actually interact with the system.

Lem's Electronic Bard doesn't feel the love it writes about. But the poem still works. The reader still cries. The question of whether the Bard has an inner life becomes, at a certain level of output fidelity, a question about the reader, not the machine.

The same is true for testing. The human tester's emotional reactions matter because they produce behavioral variation that surfaces failures. If an AI persona could produce the same variation, the failures would still be surfaced. The ontology of the emotion, real or simulated, is irrelevant to the outcome.

Unless it is not. Unless there is something about emotional engagement that produces a kind of behavioral variation that simulation, by definition, cannot. That is the bet you make when you insist on human testers. It might be right. But it is a bet, not a certainty, and it is worth knowing you are making it.

The bet we are making

Lem published The Cyberiad in 1965, decades before anyone worried about testing AI agents. He was not writing about methodology. He was writing about us. About what happens when we can no longer tell the difference between the real and the simulation, and about whether the difference ever mattered as much as we thought.

Sixty years later, the question has moved from science fiction to engineering practice. We're building the machines Lem imagined. We're testing them with methods that assume the distinction he questioned. At some point, the methodology will have to catch up with the philosophy.

Or the philosophy will turn out to have been the methodology all along.

Sources and further reading

Stanisław Lem, The Cyberiad, trans. Michael Kandel (Seabury Press, 1974; orig. Cyberiada, 1965). The story is "The First Sally (A), or Trurl's Electronic Bard." Kandel's translation of the mathematical love poem is itself one of the great feats in the history of literary translation.
John Searle, "Minds, Brains, and Programs," Behavioral and Brain Sciences 3 (1980): 417–457. The original Chinese Room paper. Searle's later restatement of his position as "biological naturalism" appears in The Rediscovery of the Mind (MIT Press, 1992).
Alan Turing, "Computing Machinery and Intelligence," Mind 59 (1950): 433–460. The "polite convention that everyone thinks" passage is on page 446.
Robert Nozick, Anarchy, State, and Utopia (Basic Books, 1974), pp. 42–45. The experience machine.
Daniel Dennett, Consciousness Explained (Little, Brown, 1991). Dennett's "intuition pump" critique of Searle.
David Chalmers, "Could a Large Language Model Be Conscious?" (2023). arXiv:2303.07103. A contemporary assessment of the case for and against consciousness in current and future LLMs.
Alessandro Usseglio Viretta, "Testing AI Agents Like You Mean It." The original article that prompted this one.