Essay

AI Doesn't Have Opinions

Alessandro Usseglio Viretta · May 2026 · 5 min read

There is a whole genre of LinkedIn post I have come to dread. It goes roughly like this: "I challenged AI and it surprised me!" Or: "I had a real conversation with ChatGPT and it refused to budge." Or my personal favorite: "AI has opinions now."

No. It doesn't.

This anthropomorphizing of large language models is the 2026 version of people saying their PC "won't cooperate" or that they can't "convince" Windows to install the printer driver. We laughed at that then, and we should be laughing now. Instead, we're writing thought-leadership posts about it.

An LLM is a function. You put text in, you get text out. The apparent unpredictability that people mistake for personality, stubbornness, or creativity has a completely mundane explanation: sampling.

When a model generates text, it doesn't pick the single most probable next word every time. By default, it samples from a probability distribution, with a parameter called temperature controlling how spread out that sampling is. The higher the temperature, the more variation you get. Set temperature to zero, disable top-k and top-p sampling, and run the same prompt twice on the same hardware, and you get the same output every time (alright, floating-point operations across different hardware or software implementations can lead to minor differences).
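To make the temperature point tangible, here is a minimal sketch in plain Python. It is not how any particular model or API implements sampling; the four-token vocabulary, the scores, and the function names are all made up for illustration. It only shows the mechanism: temperature zero collapses to a deterministic argmax, anything above zero draws from a softened distribution.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_next_token(logits, temperature, rng):
    """Greedy argmax at temperature 0, otherwise sample from the softened distribution."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical scores for a four-token vocabulary (illustration only).
logits = [2.0, 1.5, 0.3, -1.0]

# Temperature 0: the random generator is never consulted, so every run agrees.
greedy = [pick_next_token(logits, 0, random.Random()) for _ in range(5)]

# Temperature 1: the choice depends on the random draw, so runs can differ.
sampled = [pick_next_token(logits, 1.0, random.Random(seed)) for seed in range(5)]
```

Printing `greedy` shows the same token index five times. Printing `sampled` shows whatever the seeds happened to draw, which is the whole "personality" people are reacting to.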

The model didn't change its mind. It has no mind to change. The variation people observe in practice comes from three places, none of them mysterious. (I) Sampling randomness, there by design to make outputs less robotic. (II) Sensitivity to input: a single rephrased clause changes the token sequence entering the model, which propagates through billions of parameters and can tip the probability distribution differently. (III) The context, which matters most for people running systems in production. If you've bolted memory, retrieval, or user history onto your LLM, the context changes between calls even when the prompt looks identical. Different context, different output.
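The third source is the easiest to check for yourself. The sketch below assumes a deliberately naive way of assembling the model input (system prompt, then retrieved context, then the user message); the function names and the sample strings are hypothetical, and real systems use structured message lists rather than one concatenated string. The point is only that the same visible prompt can sit inside two different inputs.

```python
import hashlib

def build_model_input(system_prompt, context_items, user_message):
    """Concatenate everything the model actually sees for one call."""
    context_block = "\n".join(context_items)
    return f"{system_prompt}\n\n{context_block}\n\n{user_message}"

def fingerprint(text):
    """Short hash so two model inputs can be compared at a glance."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]

# Hypothetical values, for illustration only.
system_prompt = "You are a summarization assistant."
user_message = "Summarize the latest report."

# Same user message, different retrieved context: a different input.
monday = build_model_input(system_prompt, ["Retrieved: Q1 sales report"], user_message)
tuesday = build_model_input(system_prompt, ["Retrieved: Q2 sales report"], user_message)

# Same user message, same context: an identical input.
repeat = build_model_input(system_prompt, ["Retrieved: Q1 sales report"], user_message)
```

Logging a fingerprint like this next to each call is a cheap way to tell "the model changed its answer" apart from "we changed its input."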

That's not AI having a mood. That's software doing what the inputs tell it to. The "challenge AI" framing isn't just annoying. It's actively misleading, because it suggests that the right interface to an LLM is social — that you should argue with it, persuade it, catch it off guard. This produces exactly the kind of prompt engineering that doesn't work: intuitive, freeform, and totally unstructured. People then complain that AI is inconsistent, unreliable, or "biased," when the actual problem is that their inputs are inconsistent, unstructured, and leave too much to the sampler.

LLMs are programmable. Not in the metaphorical sense people use when they say you can "prompt" a model — in the literal engineering sense. The behavior of a model is a function of its system prompt, and a system prompt is code. It has structure, invariants, conditional branches, and state. Yes, the model handles the language layer: it will parse your user's run-on sentences, tolerate typos, understand that "gimme the summary" and "could you please provide an overview" are the same request. That part takes care of itself. What doesn't take care of itself is behavior: what the system must always do, what it must never do, and how it should respond in every case it will encounter. That requires a specification, written with the same discipline as any other piece of software, as I wrote in my article about systematic prompt engineering.

The prompt architecture I use for the core LLM functions at Aleik, an AI science ghostwriting app based on my platform Silex, makes this concrete. The system prompt is written in XML. It opens with a block defining what the assistant is. Then comes <hard_constraints>, the invariants that apply on every turn, unconditionally: one article per thread, tool results as the authoritative source, no fabricated angles, language matching the user's language. These aren't suggestions. They're constraints in the technical sense: the model either satisfies them or the output is wrong.

From there, <slot_capture> defines an always-on extraction pass that runs at the start of every turn, collecting preferences regardless of where in the workflow the conversation sits. Then comes a numbered state machine with explicit steps, each with triggers, required tool calls, and transitions. The most elaborate step, article authoring, contains nested slot definitions, a decision tree for handling edge cases, output format requirements, and payment handling rules. The prompt closes with templated injection points the runtime fills in per session: welcome messages, context items, thread state, and a guard clause.
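For readers who want to see the shape rather than the description, here is a heavily simplified skeleton. It is not the Aleik prompt. Only the tag names <hard_constraints> and <slot_capture> come from the text above; every other tag name, the step wording, the triggers, and the two injection points are placeholders I am assuming for illustration. The Python wrapper just fills the per-session values and checks that the result is still well-formed XML.

```python
from string import Template
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Illustrative skeleton only: tag names other than hard_constraints and
# slot_capture, and all rule and step wording, are hypothetical.
PROMPT_TEMPLATE = Template("""<system_prompt>
  <hard_constraints>
    <rule>One article per thread.</rule>
    <rule>Treat tool results as the authoritative source.</rule>
    <rule>Never fabricate angles.</rule>
    <rule>Reply in the user's language.</rule>
  </hard_constraints>
  <slot_capture>Extract stated preferences at the start of every turn.</slot_capture>
  <steps>
    <step id="1" trigger="new_thread" next="2">Collect the topic.</step>
    <step id="2" trigger="topic_confirmed" next="done">Author the article.</step>
  </steps>
  <session>
    <welcome>$welcome_message</welcome>
    <thread_state>$thread_state</thread_state>
  </session>
</system_prompt>""")

def render_prompt(welcome_message, thread_state):
    """Fill the per-session injection points and verify the XML is well-formed."""
    rendered = PROMPT_TEMPLATE.substitute(
        welcome_message=escape(welcome_message),  # escape &, <, > in injected text
        thread_state=escape(thread_state),
    )
    ET.fromstring(rendered)  # raises ParseError if the structure is broken
    return rendered

prompt = render_prompt("Welcome back & hello", "step 1")
```

Treating the prompt as a template with a validation step means a malformed injection fails loudly at render time instead of quietly degrading the model's behavior.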

This is not a magic incantation; it's a specification. The difference between a prompt that produces consistent, useful behavior and one that produces erratic outputs is the same as the difference between a well-specified API and one that does whatever the implementation felt like doing that day.

Anthropic publishes solid guidance on this — not because prompting is arcane, but because structured thinking about system design is a skill, and most people don't apply it here when they would apply it everywhere else in software. The engineers who complain that LLMs are unpredictable are often the same engineers who would never ship a service without specifying its contracts, invariants, and failure modes. They just don't apply the same rigor to the prompt layer. That's the gap. It's not a gap in the model.

The people writing posts about "challenging AI" and finding it "pushes back" are, charitably, pattern-matching on surface behavior they don't have a framework to explain. Less charitably, they're doing what LinkedIn rewards: packaging ignorance as insight because it gets engagement.

The model doesn't push back. It sampled a token sequence that, given your underspecified input, happened to look like disagreement. Specify the input properly and you get the output you need. Every time.

That's the conversation worth having.
