The tools described in this module produce substantial value when used well. They also fail in specific, recognisable ways that can damage professional work if they go undetected. A practitioner who understands the failure patterns catches them before they cause problems. A practitioner who does not understand them tends to discover the failures only when they surface, sometimes too late to prevent damage. This section covers five failure patterns every practitioner should recognise, with the verification disciplines that catch each one.
Hallucination
Hallucination is the tendency of AI systems (specifically large language models) to produce outputs that sound plausible and are factually wrong. Module 1.2 covered the mechanism. These systems produce output by predicting the next most likely token given the context, and they have no separate mechanism for verifying truth. If the most likely continuation is factually wrong but plausible-sounding, the system produces the plausible-wrong continuation, often in confident language indistinguishable from correct content.
Hallucination shows up in specific forms. The most notorious is fabricated citations. Large language models can produce confident citations to case law, academic papers, statutes, and other authoritative sources that do not exist. The fabricated citation looks like a real citation. It has a plaintiff and defendant, a reporter citation, a year, a court, and a holding. Checking it against the actual case reporter reveals that none of it exists. This failure pattern has produced sanctions against lawyers who filed briefs containing AI-hallucinated citations, and it is the best-documented example of why every specific factual claim in AI output requires verification.
Hallucination also shows up as invented statistics, fabricated quotations attributed to real people, imaginary historical events presented as fact, and made-up details about real organisations or individuals. The common thread is that the content is specific enough to be checkable and the AI has produced it by extrapolating from patterns rather than retrieving verified facts.
The verification discipline for hallucination is specific. Every specific factual claim in AI output that the practitioner cannot personally vouch for should be verified against primary sources before the output is used externally: every citation, every statistic, every attributed quotation, every specific historical or corporate detail. The verification is faster than it sounds once it becomes habit. A practitioner who has read the contract the AI summarised does not need to verify the AI's summary against the contract; the practitioner already knows what the contract says. A practitioner who has not read the primary source underlying a specific factual claim in an AI-generated document cannot safely rely on that claim.
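One way to make this habit concrete is a crude first pass that surfaces the sentences most likely to contain checkable specifics. The sketch below is illustrative only: the function name and the heuristic (a sentence is flagged if it contains a digit or a quotation mark) are assumptions introduced here, not a method from this module. It finds candidates for human checking. It does not verify anything, and it will miss specific claims that contain neither digits nor quotations.

```python
import re

# Assumption: digits and quotation marks are a rough proxy for statistics,
# dates, citations, and attributed quotations. This is a heuristic only.
CHECKABLE = re.compile(r'\d|"')

def flag_checkable_sentences(text: str) -> list[str]:
    """Return sentences that likely contain a specific, checkable claim."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    return [s for s in sentences if CHECKABLE.search(s)]

draft = (
    'Revenue grew 12% last year. '
    'The market is competitive. '
    'The CEO said "we will expand".'
)
for sentence in flag_checkable_sentences(draft):
    print("VERIFY:", sentence)
```

Each flagged sentence still has to be checked by a person against a primary source; the script only builds the checklist.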
Confident Imprecision
A related pattern is that AI systems produce outputs in a confident register regardless of the actual reliability of the underlying content. The model has no reliable sense of its own uncertainty that would let it flag "I am less sure about this part." Even when the output is wrong, the language typically does not signal the weakness.
This matters because human communication typically carries implicit uncertainty signals. A human who is unsure about a fact tends to hedge, to qualify, to indicate what they know and what they do not know. An AI output that is ninety percent right and ten percent wrong often reads with the same confidence as an output that is fully correct, and the reader has no signal to indicate which parts to trust.
The implication is that confidence in AI output is not a reliable indicator of accuracy. A practitioner cannot tell by reading the output whether the AI was on solid ground or was extrapolating from weak patterns. They have to verify the specific claims that matter, regardless of how confident the output sounded. Experienced practitioners develop a mental habit of discounting confidence as a signal when evaluating AI output, which is a reversal of the habit that serves them well when evaluating human communication.
The verification discipline is to identify the claims in the output that would cause harm if wrong, and to check those claims specifically. Not every sentence needs verification. The sentences that carry the actual weight of the decision or deliverable do.
Context Limits
Module 1.2 covered context windows as a technical property of large language models. In practical use, context limits produce a specific failure pattern. As conversations grow, older information falls out of the window, and the AI starts producing output that does not reflect material that was established earlier in the conversation. The practitioner may not realise what has dropped from context, and the AI often does not signal that information has been lost. It continues producing output as if it still has access to the full conversation.
This failure pattern is particularly damaging in long working sessions where the practitioner has built up substantial context over many exchanges. A consultant who has spent an hour developing a nuanced view of a client situation through conversation with the AI may find that the AI's responses in the second hour reflect the more recent parts of the conversation more than the earlier parts. The nuanced view the practitioner built may not be fully present in the AI's working memory any longer.
The verification discipline is to reinstate critical context periodically in long sessions, either by explicitly restating the key working assumptions or by pasting a summary of the relevant earlier material back into the current prompt. A practitioner who notices that the AI is producing output that contradicts earlier work should suspect context loss and reinstate the earlier context rather than accepting the contradictory output as a revision.
The related discipline is to keep critical instructions in the current prompt rather than relying on the AI to remember them from earlier in the conversation. If the practitioner established a constraint at the start of a two-hour session, and that constraint matters for the output being produced now, the practitioner should include the constraint in the current prompt rather than assuming it is still active.
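The practice of restating constraints and summaries can be sketched as a small prompt-assembly helper. This is a minimal illustration under stated assumptions: `build_prompt`, its parameters, and the section labels are hypothetical names introduced here, and the resulting string would be pasted or sent to whichever tool the practitioner uses.

```python
def build_prompt(request: str, pinned_constraints: list[str], summary: str = "") -> str:
    """Assemble a prompt that restates standing constraints on every turn.

    Hypothetical helper: nothing here depends on a particular AI tool.
    """
    parts = []
    if pinned_constraints:
        # Constraints are repeated in full rather than assumed to be remembered.
        parts.append("Standing constraints (apply to this request):")
        parts.extend(f"- {c}" for c in pinned_constraints)
    if summary:
        # Optional recap of earlier material that may have left the context window.
        parts.append("Summary of earlier work:")
        parts.append(summary)
    parts.append("Current request:")
    parts.append(request)
    return "\n".join(parts)

prompt = build_prompt(
    "Draft the executive summary.",
    ["Do not name the client.", "Use UK spelling."],
    summary="The client is a mid-market firm considering a regional expansion.",
)
print(prompt)
```

The design choice is that the constraints travel with every request, so their presence does not depend on how much of the earlier conversation the model can still see.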
Stale Knowledge
Large language models are trained on data up to a specific date, called the training cutoff. Anything that happened after that date is not in the model's knowledge. A model trained with a cutoff in mid-2024 does not know about events, regulations, court decisions, market developments, or company changes that happened after mid-2024. The model will often answer questions about those events, and the answers will be speculative or fabricated because the model is extrapolating from patterns rather than drawing on actual knowledge.
Practitioners often assume models know more recent events than they actually do. A lawyer who asks a model about a recent regulatory change may get a confident but inaccurate response if the change happened after the training cutoff. A consultant who asks about recent market developments may get content that is three or six or twelve months out of date without the model indicating the limitation. The output may be fluent and plausible. It is still based on stale knowledge and may be materially wrong for current conditions.
The verification discipline is to know the model's approximate training cutoff and to treat any request that depends on recent information as requiring external verification. Most commercial AI tools disclose their training cutoff, or the practitioner can ask the model directly and treat the answer as approximate. Requests about recent events, recent regulatory changes, current market conditions, or specific details about named entities that may have changed recently should be treated as high-verification territory regardless of how confident the output sounds.
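The cutoff rule can be expressed as a simple date comparison. The sketch below is an illustration, not a feature of any tool: the function name is hypothetical, and the 180-day safety margin is an assumption added here to reflect that a cutoff is usually only known approximately.

```python
from datetime import date, timedelta

def needs_external_verification(
    topic_date: date,
    training_cutoff: date,
    margin_days: int = 180,  # assumed safety margin, not a published figure
) -> bool:
    """Return True when a topic is recent enough that the model may not know it."""
    # Anything on or after (cutoff - margin) is treated as high-verification territory.
    return topic_date >= training_cutoff - timedelta(days=margin_days)

cutoff = date(2024, 6, 30)  # example: a model with a mid-2024 cutoff
print(needs_external_verification(date(2024, 9, 1), cutoff))  # regulation issued after the cutoff
print(needs_external_verification(date(2020, 1, 15), cutoff))  # long-settled matter
```

A result of False does not mean the output is correct. It only means staleness is less likely to be the cause of an error, and the other failure patterns still apply.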
Retrieval-augmented generation tools (covered in Module 1.2) address this limitation when they are set up to retrieve current information. A tool that searches the web or retrieves from a regularly-updated knowledge base can access post-cutoff information. The practitioner should know which of the tools they use have retrieval capability and which rely purely on training data, because the verification discipline differs substantially between the two.
Pattern Over-Application
AI systems work by finding and extending patterns. This works well when the task is substantively pattern-matching work. It produces specific failures when the task requires exceptions to standard patterns, situation-specific judgment, or recognition of when the standard pattern does not apply.
A common manifestation is that AI outputs apply general frameworks to specific situations where the framework does not quite fit. A general legal framework for contract review may miss the specific provisions that matter for a particular client's industry. A general consulting framework for market analysis may miss the specific factors that matter in the particular regional market being analysed. The AI produces the framework-consistent answer because the framework is the pattern it has learned. The situation-specific answer may be different, and the practitioner is the one who knows the difference.
A related manifestation is that AI outputs tend toward the most common version of a task rather than the version the specific client needs. A consultant working with a specific mid-market firm gets AI output that looks like general consulting work for general clients, because general patterns dominate the training data. The specific adaptations that make the work fit the actual client require the practitioner's judgment, not the AI's pattern-matching.
The verification discipline is to check whether the AI output reflects the specific situation or the general pattern, and to ask specifically about the features of the situation that might require exceptions. A prompt that explicitly calls out the unusual features of the specific case ("note that this client operates in a regulated industry where the standard framework does not directly apply; identify where the specific regulatory requirements would modify the general approach") often produces substantively better output than a prompt that asks for the general treatment.
The Verification Discipline in Practice
The five failure patterns share a common structural feature. The output can look correct while containing content that is wrong. The verification discipline has to be active rather than passive, because passive reading does not catch failures that look the same as successes.
A practical verification pattern uses three questions. Is this factually critical, meaning does the output contain claims that would cause harm if wrong? Is this legally or professionally sensitive, meaning does the output interpret rules or obligations where precision matters for compliance? Is this going to external stakeholders, meaning does the output leave the practitioner's direct control and reach clients, regulators, or public audiences?
If the answer to any of these questions is yes, the output warrants the full verification discipline. Specific factual claims are checked against primary sources. Legal or regulatory content is validated by someone with the relevant authority. External-facing material is reviewed for accuracy, tone, and consistency with existing positions before it leaves the internal environment.
If the answer to all three is no, the output can move faster with lighter verification. A personal brainstorming note, a rough internal idea sketch, or a first-draft exploration of a problem the practitioner will refine further before anyone else sees it qualifies, because the cost of a small error is low and the practitioner retains the opportunity to catch and fix problems before they matter.
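The three-question pattern above amounts to a simple decision rule, which can be written down as a sketch. The class and function names are hypothetical and introduced only for illustration, as are the "full" and "light" labels.

```python
from dataclasses import dataclass

@dataclass
class OutputProfile:
    """Answers to the three triage questions for one piece of AI output."""
    factually_critical: bool      # would a wrong claim cause harm?
    legally_sensitive: bool       # does it interpret rules or obligations?
    going_external: bool          # will it reach clients, regulators, or the public?

def verification_level(profile: OutputProfile) -> str:
    """Return 'full' if any question is answered yes, otherwise 'light'."""
    if profile.factually_critical or profile.legally_sensitive or profile.going_external:
        return "full"
    return "light"

brainstorm_note = OutputProfile(False, False, False)
client_memo = OutputProfile(factually_critical=True, legally_sensitive=False, going_external=True)

print(verification_level(brainstorm_note))
print(verification_level(client_memo))
```

The rule is deliberately an "any" test: a single yes is enough to require full verification, which matches the text above.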
The practitioner owns the output regardless of who produced it. An AI-generated document that the practitioner forwards without adequate verification becomes the practitioner's problem when the errors surface. The verification discipline is what keeps this from happening, and it is what separates professional AI use from casual AI use. Stage 2 of this programme develops the verification discipline more systematically. This module provides the practical foundation that Stage 2 then builds on.
A Note on Confidentiality and Data
Beyond the failure-mode discipline, practitioners using AI tools should maintain basic habits around confidentiality. Public AI tools may store prompts, use them for model improvement, or retain logs in ways that are not under the practitioner's control. Information that should not leave the practitioner's organisation should not be pasted into public AI tools. Personal data about clients, customers, or employees belongs in AI tools only if the specific tool has been approved by the organisation's compliance function for that kind of data. High-risk documents (contracts, health records, internal financial records, legal opinions) belong in AI tools only when the tool has been specifically vetted for that use.
The working rule is that the practitioner should treat public AI tools the same way they would treat a large external mailing list. Information appropriate for that level of exposure can go in. Information that requires more protection should stay out, or should go into vetted internal tools that the organisation controls. When in doubt, the practitioner should ask their organisation for guidance rather than guessing. Most organisations have active work in progress on AI governance, and the practitioner who asks is typically helping shape policy rather than being a nuisance.
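The confidentiality rule can also be sketched as a gate that is checked before anything is pasted into a tool. Everything named here is an assumption for illustration: the data-kind labels, the `Tool` class, and the example tools are hypothetical, and the actual approvals would come from the organisation's compliance function.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tool:
    """A hypothetical record of what a tool has been approved to receive."""
    name: str
    approved_for: frozenset[str] = frozenset()

def may_submit(data_kind: str, tool: Tool) -> bool:
    """Allow public information anywhere; everything else needs explicit approval."""
    if data_kind == "public":
        return True
    # Default is to refuse: an unlisted data kind is treated as not approved.
    return data_kind in tool.approved_for

public_chatbot = Tool("public-chatbot")
vetted_internal = Tool("vetted-internal", frozenset({"personal_data", "contract"}))

print(may_submit("public", public_chatbot))
print(may_submit("contract", public_chatbot))
print(may_submit("contract", vetted_internal))
```

The default-deny behaviour mirrors the guidance above: when a data kind has not been explicitly approved for a tool, the answer is no until the organisation says otherwise.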