3.3

The Technical Roots of Hallucination

35 min

Stage 1 treated hallucinations as a practical risk and introduced a behavioural rule: do not automatically trust fluent outputs. Stage 3 goes deeper. It explains why hallucinations occur in mechanical terms, so that teams can design workflows that prevent them, detect them, and manage their impact.

A hallucination is not a rare anomaly. It is a predictable failure mode that follows from how large language models generate text. When a model is asked to respond and the necessary evidence is missing or unclear, it will often generate an answer that is linguistically plausible and stylistically convincing. Without explicit grounding, the model can produce content that looks correct but is not supported by the organisation’s sources.

This section introduces the core mechanisms behind that behaviour and then translates those mechanisms into design principles for enterprise workflows.

A. How Language Models Generate Text

A.1 The Next-Token Prediction Mechanism

Large language models generate text by predicting the next token given the tokens that came before it. This is the fundamental operation of modern language models. A token is a unit of text, often a word, part of a word, or a symbol.

Informally, the model answers the question:

“Given the text so far, what is the most likely next piece of text?”

The simplified probability form is:

P(w_n | w_1, w_2, …, w_{n-1})

This means the model estimates the probability of the next token w_n given the preceding sequence w_1, w_2, …, w_{n-1}.

The model repeats this process token by token until the response is complete.
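The loop described above can be sketched with a toy bigram table standing in for the learned distribution. Everything here is invented for illustration: real models condition on the full preceding sequence using billions of parameters, not a lookup table, but the token-by-token shape of generation is the same.

```python
# Toy next-token model: maps the previous token to a probability
# distribution over possible next tokens. Illustrative only.
BIGRAMS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"policy": 0.5, "model": 0.5},
    "policy": {"states": 0.7, "<end>": 0.3},
    "model": {"predicts": 1.0},
    "states": {"<end>": 1.0},
    "predicts": {"<end>": 1.0},
}

def generate(max_tokens=10):
    tokens = ["<start>"]
    for _ in range(max_tokens):
        dist = BIGRAMS[tokens[-1]]
        # Greedy decoding: always pick the highest-probability token.
        next_token = max(dist, key=dist.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]

print(generate())  # → ['the', 'policy', 'states']
```

Note that nothing in this loop checks whether the emitted sequence is true; it only checks which continuation is most probable. That gap is the subject of the rest of this section.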

A.2 What the Model Learns During Training

During training, the model learns patterns that connect:

  • words to words

  • phrases to phrases

  • structures to structures

  • instructions to typical responses

  • domain language to typical document formats

  • cause and effect descriptions to plausible conclusions

These patterns are learned from very large volumes of text. The result is a system that is very good at producing coherent language and at matching common structures.

For example, the model learns what a policy document usually looks like, what a compliance summary sounds like, and what legal language patterns are common. This capability is valuable because it allows fast drafting, summarisation, and synthesis.

However, pattern mastery is not the same as truth verification.

A.3 Why This Is Not Fact-Checking

A key limitation is that the model does not maintain a dedicated internal module that verifies factual correctness in the way a human fact-checker would. It does not inherently consult an external database. It does not inherently validate references. It does not inherently check whether an internal company policy exists unless the system explicitly provides that policy or retrieves it.

The model produces text that is statistically consistent with:

  • the provided context in the current request

  • the instruction given by the user

  • the patterns learned during training

If the context includes evidence, the model can use it. If the context does not include evidence, the model may still generate a response, because the objective of next-token prediction rewards continuation, coherence, and alignment with prompt intent.
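The difference between a grounded and an ungrounded request comes down to what is placed in the context. A minimal prompt-assembly sketch (the function name and instruction wording are illustrative, not any specific product's API):

```python
def build_prompt(question, evidence=None):
    """Assemble a model request, with or without grounding evidence.

    Illustrative sketch: real systems would also handle formatting,
    token limits, and multiple retrieved passages.
    """
    if evidence:
        # Grounded request: the evidence is in the context, and the
        # instruction constrains the model to it.
        return (
            "Answer using ONLY the sources below. If they do not "
            "contain the answer, say so.\n\n"
            "SOURCES:\n" + evidence + "\n\n"
            "QUESTION: " + question
        )
    # Ungrounded request: the model has only its training patterns
    # to draw on, which is where hallucination risk concentrates.
    return "QUESTION: " + question
```

In the grounded case the next-token distribution is conditioned on the supplied evidence; in the ungrounded case it is conditioned only on the question and on learned patterns.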

A.4 The Professional Consequence

From an enterprise perspective, this means:

  • fluency is not proof

  • confidence is not evidence

  • a well-written answer can still be unsupported

This is why Stage 3 emphasises grounding and traceability.

B. Why Hallucinations Happen

Hallucinations occur when the model produces content that appears plausible but is not supported by the available evidence in the current context. This happens through a combination of prompt pressure, missing information, and learned patterns.

B.1 Prompt Pressure: Questions That Imply an Answer Exists

Many user prompts are phrased in a way that assumes an answer exists. For example:

  • “What does our policy say about X?”

  • “Summarise the clause that defines Y.”

  • “List the steps in our procedure for Z.”

These prompts imply that the relevant policy, clause, or procedure exists and is accessible. If the system cannot retrieve the relevant source, the model is placed in a difficult position. It has been asked to produce a structured answer in a style that it has learned well.

In practice, this increases the risk that it will generate a plausible answer even when the evidence is missing.

B.2 Missing Grounding: Insufficient Evidence in Context

Hallucination risk increases sharply when the model does not have grounded information in the context window.

Common causes include:

  • the relevant document was never uploaded

  • the document exists but was not retrieved for this request

  • the wrong version of a policy was retrieved

  • the retrieved excerpt does not contain the needed clause

  • the task is too broad and the system retrieved insufficient context

  • the conversation has exceeded context limits and earlier evidence was dropped

In each case, the model does not have the material needed to answer responsibly. Yet it has been asked to produce an answer.
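A workflow can check for this condition before the model is ever asked to answer. A sketch of such a guard, where `retrieve` and `generate` are hypothetical stand-ins for a retrieval system and a model call:

```python
def answer_with_guard(question, retrieve, generate):
    """Refuse to answer when retrieval returns no evidence.

    `retrieve` and `generate` are illustrative placeholders, not a
    real library's API.
    """
    passages = retrieve(question)
    if not passages:
        # Stop here rather than let the model pattern-complete
        # an answer without grounding.
        return {
            "status": "no_evidence",
            "message": "No relevant source was retrieved for this request.",
        }
    return {
        "status": "answered",
        "answer": generate(question, passages),
        "sources": [p["id"] for p in passages],
    }
```

The guard does not make the model more truthful; it removes the situation in which the model is asked to answer without material to answer from.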

B.3 Pattern Completion Under Uncertainty

When the model lacks evidence, it may complete the pattern anyway.

Example scenario:

A user asks, “What is our company’s policy on retention of customer records?”

If the organisation has not provided a retention policy in the Encyclopedia and no retrieval occurs, the model may still generate a retention policy. It might include typical phrases such as:

  • “Data will be retained only as long as necessary”

  • “Access is restricted based on role”

  • “Records are deleted after a defined retention period”

  • “Compliance with applicable regulations is required”

These statements sound professional because they match common policy language patterns. The output is coherent and plausible. Yet it may not reflect the organisation’s actual policy.

This is pattern completion under uncertainty.

B.4 The Core Risk for Workflow Design

The core risk is not intent. The model is not choosing to mislead. The risk is a mismatch between:

  • the format the user expects

  • the evidence the model has

  • the model’s strong ability to generate plausible language

When evidence is missing, the model can still produce confident text. That is why ungrounded use is high risk in professional environments.

C. Hallucination as Statistical Overreach

C.1 A Professional Definition

A useful professional definition is:

Hallucination is statistical overreach. It is the production of content that extends beyond the available evidence, driven by the model’s objective to generate coherent, high-probability text.

This definition matters because it shifts the response from blame to design. If hallucinations are a predictable result of the mechanism, then reliability must be achieved through workflow controls.

C.2 Why Coherence Is Easier Than Uncertainty

Language models are optimised to produce coherent continuations. Saying “I do not know” is possible, yet it often competes with a strong learned pattern: when asked a question, produce an answer.

In many cases, the model will produce a complete-looking response because:

  • it has strong priors for common document structures

  • it has learned that users prefer direct answers

  • it receives insufficient signals that uncertainty is allowed

  • it may not be penalised for inventing content in ordinary usage

This is why Stage 3 emphasises designing systems that explicitly reward uncertainty when evidence is absent.

C.3 The Design Solution: Evidence-First Workflows

The solution is not to hope hallucinations disappear through model scaling or better phrasing. The solution is to build evidence-first workflows.

Evidence-first workflows include:

  1. Grounding through retrieval
    Retrieve relevant sections of policies, SOPs, contracts, and approved sources into the context window.

  2. Explicit source constraints
    Instruct the agent to answer only using the provided sources and to state when sources do not contain the required information.

  3. Mandatory citations for high-stakes work
    Require references to specific clauses, pages, or encyclopedia entries for compliance, legal, finance, and policy-driven tasks.

  4. Structured uncertainty handling
    Require the model to produce one of three outputs:

    • an answer with sources

    • a request for missing information

    • a clearly labelled uncertainty statement with recommended next steps

  5. Human review at decision points
    For high-stakes outputs, require review and sign-off, supported by traceability and source links.

These controls do not remove the model’s generative nature. They reduce the circumstances under which statistical overreach becomes likely, and they make it visible when it occurs.
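The three permitted outputs from step 4 can be enforced as a simple response contract. A sketch, with all field names and messages assumed for illustration:

```python
def grounded_response(question, passages, model_answer=None, missing=None):
    """Return exactly one of the three structured outputs:
    an answer with sources, a request for missing information,
    or a labelled uncertainty statement with next steps.

    Illustrative sketch; field names are not from any framework.
    """
    if model_answer and passages:
        # 1. An answer, always paired with its sources.
        return {
            "type": "answer",
            "text": model_answer,
            "sources": [p["id"] for p in passages],
        }
    if missing:
        # 2. A request for the information needed to answer.
        return {"type": "request_missing_info", "needed": missing}
    # 3. A clearly labelled uncertainty statement with next steps.
    return {
        "type": "uncertainty",
        "text": "The provided sources do not contain the required information.",
        "next_steps": [
            "Upload or retrieve the relevant policy",
            "Narrow the question to a specific clause",
        ],
    }
```

Because every path returns a typed result, downstream steps (review queues, sign-off, logging) can route on the `type` field instead of parsing free text.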