Hallucination

Ask ChatGPT for a list of novels about grief and you might get a perfectly curated reading list. You might also get a title that doesn’t exist, attributed to an author who never wrote it, complete with a confident one-sentence summary of a plot nobody imagined. The book sounds real. The author sounds right. Everything about the recommendation feels trustworthy, except that it’s entirely made up.

In AI, this is called a hallucination. And the name itself has a story worth telling.

What Hallucination Actually Means

When an AI model hallucinates, it generates information that sounds plausible but is factually wrong, fabricated, or nonsensical. The model isn’t lying (that would require knowing the truth and choosing to say otherwise). It has no concept of truth at all. It’s producing the next most statistically likely sequence of words based on patterns it absorbed during training, and sometimes those patterns lead somewhere convincing but false.

Think of it like a method actor who’s so committed to staying in character that they start improvising backstory details on the fly. Some of those details might accidentally align with the real script. Others are pure invention. The actor doesn’t know the difference, and neither does the model.

Where the Word Came From

“Hallucination” traces back to the Latin hallucinari, meaning “to wander in the mind.” It became a clinical term in the 1800s when French psychiatrist Jean-Étienne Dominique Esquirol used it to describe false sensory perceptions: seeing or hearing things with no external cause.

The word first appeared in computing earlier than most people realize. In 1982, British researcher John Irving Tait used it in a technical report on text summarization, describing a system that “sees in the incoming text a text which fits its expectations” rather than what was actually there. In 2000, computer vision researchers Simon Baker and Takeo Kanade published a paper titled “Hallucinating Faces” about systems that could fill in missing detail in degraded images by inventing plausible pixels. At that point, the word was actually a compliment. The model’s ability to generate convincing details from incomplete information was a feature, not a bug.

The meaning flipped in 2018. Google researchers published “Hallucinations in Neural Machine Translation,” using the term to describe a failure mode: translation systems that produced fluent, confident output with no connection to the original text. As large language models went mainstream after ChatGPT’s launch in late 2022, the word followed them into everyday conversation. In 2023, the Cambridge Dictionary named “hallucinate” its Word of the Year.

There’s an ongoing academic debate about whether “hallucination” is even the right word. Some researchers argue “confabulation” is more accurate: a clinical term for patients with memory disorders who unintentionally fill gaps with fabricated narratives. The parallel is striking. The confabulating patient isn’t lying; they’re generating plausible-sounding information to bridge what they don’t actually know, which is precisely what an LLM does. Others point out that “hallucination” implies sensory experience and consciousness, neither of which an AI possesses. But “hallucination” stuck because it’s vivid and immediate, and language tends to favor the word people actually remember.

Why It Happens

The mechanics are simpler than they might seem. A large language model generates text by predicting the next token (roughly three-quarters of a word) based on everything that came before it. It’s not looking up facts in a database. It’s calculating probabilities: given this sequence of words, what word most likely comes next?
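The prediction step can be sketched in a few lines. This is a toy illustration, not a real language model: the contexts, candidate words, and probabilities below are invented for demonstration, and a real model computes these distributions over tens of thousands of tokens with a neural network rather than a lookup table.

```python
import random

# Invented "learned" distributions: for each two-word context,
# the probability of each candidate next word.
next_word_probs = {
    ("the", "novel"): {"was": 0.5, "explores": 0.3, "won": 0.2},
    ("novel", "was"): {"published": 0.6, "praised": 0.4},
}

def predict_next(context):
    """Sample the next word from the distribution for this context."""
    probs = next_word_probs[context]
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights)[0]

word = predict_next(("the", "novel"))
```

Notice that nothing in this process consults a fact. The model only asks which word is statistically likely to come next, which is why fluent output and true output are two different things.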

Most of the time, those probabilities produce accurate, useful text. But the model has no internal fact-checker, no way to verify whether the sequence it’s building corresponds to reality. If the statistical patterns in its training data nudge it toward a plausible-sounding falsehood, it follows that path with complete confidence. Ask about a real person and the model might blend details from several people because their contexts overlapped in the training data. Ask for a source and it might construct a citation that follows the exact format of a real academic paper but points to a work that was never written.

The problem compounds as the model generates. Each new word becomes part of the input for the next prediction. One wrong detail early on can steer the entire response deeper into coherent fiction, because the model is optimizing for local consistency (does this sentence follow logically from the last one?) rather than global truth (is any of this actually real?).
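The feedback loop above can be sketched directly. In this toy example (the transition table is entirely invented), each predicted word is appended to the output and becomes the input for the next step, so one locally plausible but false link produces a whole sentence of coherent fiction.

```python
# Invented one-word-of-context transitions. Each link is locally
# plausible, but "wrote" -> "Dracula" is false, and everything after
# it simply extends the fiction consistently.
transitions = {
    "Jane": "Austen",
    "Austen": "wrote",
    "wrote": "Dracula",   # the wrong turn
    "Dracula": "in",
    "in": "1811.",
}

def generate(start, steps):
    """Autoregressive loop: each output word feeds the next prediction."""
    words = [start]
    for _ in range(steps):
        words.append(transitions[words[-1]])  # last output becomes next input
    return " ".join(words)

sentence = generate("Jane", 5)
# Every step follows smoothly from the last; the sentence is fiction.
```

Each individual transition looks reasonable in isolation, which is exactly why hallucinated passages read so confidently.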

The training process itself makes things worse. During training, models see only examples of fluent language, never examples labeled “this is false.” Human reviewers tend to prefer long, detailed, confident-sounding answers, so the model learns that hedging and uncertainty are penalized while boldness is rewarded. The result is a system that would rather invent a plausible answer than admit it doesn’t know.

Why This Matters for Your Writing Life

Hallucinations show up in author workflows more often than you’d expect, and recognizing them makes you a sharper user of every AI tool you touch.

Fact-checking is non-negotiable. If you use ChatGPT or Claude to research historical details for a novel, verify everything independently. AI can summarize, suggest, and point you in the right direction, but it can also invent a historical event with the same confident tone it uses to describe a real one. The cautionary tale that keeps on giving: in 2023, attorney Steven Schwartz submitted legal briefs citing six entirely fabricated court cases generated by ChatGPT, complete with invented quotes and fake judicial reasoning. A federal judge fined his firm $5,000. The model hadn’t flagged any uncertainty because it had no mechanism to do so.

Your settings affect the risk. Temperature, the setting that controls how creative or unpredictable a model’s output is, directly influences hallucination rates. Higher temperature means the model is more willing to choose less probable words, which produces more surprising text but also more fabrication. When you’re brainstorming character names or plot twists, that tradeoff might be worth it. When you need factual accuracy for your author bio or a historical timeline, keep temperature low.
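Under the hood, temperature works by rescaling the model’s raw scores before they become probabilities. A minimal sketch, with invented scores for three candidate words:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw model scores (logits) into probabilities.
    Dividing by a higher temperature flattens the distribution,
    giving less likely words a better chance of being chosen."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]  # invented scores for three candidate words

low = softmax_with_temperature(logits, 0.5)   # sharper: top word dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: more surprise, more risk
```

At low temperature the most probable word wins almost every time; at high temperature the long tail of unlikely words gets sampled more often, which is where both creative surprises and fabrications live.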

Better prompts mean fewer fabrications. Vague, open-ended prompts give the model more room to wander into invention. Specific, well-structured prompts with clear constraints anchor the output closer to useful territory. “Tell me about Victorian funeral customs” might return a mix of real and imagined details. “List three documented Victorian mourning customs practiced between 1860 and 1890” gives the model much less room to improvise.

The most important thing to understand is that AI writing tools are not oracles. They are extraordinarily capable pattern-matching systems that produce text based on statistical likelihood, not factual verification. That doesn’t make them less useful. It just means the author in the room, you, is still the one responsible for knowing what’s true.