Why AI Tools Create a Distinct Privacy Risk Profile
Every organisation that handles professional information has existing frameworks for data security. Access controls, encryption standards, document retention policies, and confidentiality obligations are familiar features of professional practice. These frameworks were designed for a specific threat model: the risk that information might be accessed, stolen, or disclosed without authorisation by an external actor or a negligent insider.
AI tools introduce a different kind of risk, and it is important to understand how it differs from the risks that existing security frameworks were designed to address. When a professional uses an AI tool to assist with their work, they are not simply accessing information. They are transmitting information to an external processing environment. The document uploaded to an AI tool, the text pasted into a prompt, the query that contains client-specific details: all of these represent outbound transfers of information from the professional's controlled environment to the AI provider's infrastructure. What happens to that information once it has been transmitted is governed by the AI provider's terms of service and data processing agreements, not by the professional's own security framework. Those terms determine how the information is stored, whether it is used to train future models, who within the provider's organisation can access it, and under what circumstances it might be disclosed to third parties.
This matters because the most capable AI tools currently available are, in most of their standard commercial configurations, operated by third-party organisations under terms that were not designed with the specific confidentiality obligations of legal, insurance, financial, or consulting professionals in mind. A professional who uploads a privileged legal memorandum to a consumer-tier AI product has transmitted that document outside their organisation's controlled environment. A claims analyst who pastes a policyholder's medical history into a prompt has sent protected health information to a third-party processor under terms that may not satisfy applicable regulatory requirements. A financial analyst who describes an unreleased earnings result in a query has transmitted material non-public information to an external system.
None of these actions require malicious intent. They can occur through ordinary professional use of AI tools in the absence of a clear framework for what information may and may not be transmitted. The purpose of this section is to establish that framework.
What the Transmission of Data to AI Tools Actually Involves
Before examining the categories of information that require protection, it is useful to understand what happens technically when information is submitted to a cloud-based AI tool, because the behaviour of these systems is not always well understood and misunderstandings about it lead to under-protection of sensitive information.
When a professional submits a prompt to a cloud-based AI tool, the text of that prompt and any documents attached to it are transmitted over the internet to the AI provider's servers. The provider's systems process the submission, generate a response, and return that response to the user. At various points in this process, the submitted information may be logged for operational purposes, reviewed by the provider's staff for safety or quality assurance, used to improve the provider's models, or retained in the provider's systems for a period defined by their data retention policy.
The specific data handling practices of each provider are set out in their terms of service and, where applicable, their data processing agreements. These terms vary significantly between providers, between tiers of service within a single provider, and between the standard consumer product and enterprise agreements negotiated separately. A consumer-tier subscription to an AI product typically provides fewer data protection guarantees than an enterprise agreement, and an enterprise agreement negotiated with explicit data handling provisions provides stronger protections than a standard enterprise subscription.
The professional using an AI tool is responsible for understanding which tier of service they are operating under and what data handling terms apply to that tier before submitting any information to the tool. It is not safe to assume that AI tools handle data confidentially by default. The obligation to verify the applicable terms before submitting sensitive information rests with the professional, not with the AI provider.
Categories of Information That Must Never Appear in AI-Accessible Files or Prompts
Certain categories of information should be treated as absolutely excluded from AI-accessible files and from the content of AI prompts, regardless of the AI provider, the tier of service, or the security assurances provided. These are categories where the potential consequences of exposure are severe enough that no degree of provider assurance eliminates the risk to an acceptable level.
Authentication credentials and access control information. Passwords, cryptographic keys, API tokens, session tokens, and any other information that grants access to systems, accounts, or data should never appear in documents stored in AI-accessible folders or in the text of AI prompts. This prohibition is absolute. If a context document or a file to be processed happens to contain authentication credentials, those credentials must be removed before the document is made available to any AI tool. The reason is structural: AI systems process text without applying judgment about whether the text represents sensitive operational data. An AI tool that reads a document containing a password treats that password as content, not as a security-critical credential, and may repeat it in outputs, store it in logs, or transmit it to systems with no understanding of what it represents.
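The requirement to remove credentials before a document reaches an AI tool can be supported by an automated pre-submission check. The following is a minimal sketch in Python, not a complete secret scanner: the three patterns, the placeholder text, and the function names find_credentials and redact_credentials are illustrative assumptions, and a real deployment would rely on a maintained scanning tool approved by the organisation's security function.

```python
import re

# Illustrative patterns only (assumptions for this sketch); they will miss
# many real credential formats and may flag harmless text.
CREDENTIAL_PATTERNS = {
    "private_key_block": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    "password_assignment": re.compile(
        r"(?i)\b(?:password|passwd|pwd)[ \t]*[:=][ \t]*\S+"
    ),
    "token_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token)[ \t]*[:=][ \t]*\S+"
    ),
}


def find_credentials(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each suspected credential."""
    findings = []
    for name, pattern in CREDENTIAL_PATTERNS.items():
        for match in pattern.finditer(text):
            # Line number of the first character of the match, 1-indexed.
            line_number = text.count("\n", 0, match.start()) + 1
            findings.append((line_number, name))
    return sorted(findings)


def redact_credentials(text: str) -> str:
    """Replace every matched span with a fixed placeholder."""
    for pattern in CREDENTIAL_PATTERNS.values():
        text = pattern.sub("[REDACTED CREDENTIAL]", text)
    return text


sample = "Server notes\npassword: hunter2\nContact the team lead."
print(find_credentials(sample))
print(redact_credentials(sample))
```

A check of this kind reduces the chance of accidental exposure but does not replace manual review: a clean scan result means only that none of the listed patterns matched.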
Government identification numbers and personal identity data. Social security numbers, national insurance numbers, tax identification numbers, passport numbers, and the equivalent identifiers used across European jurisdictions are subject to strict data protection requirements under the General Data Protection Regulation and equivalent national legislation. These identifiers, in combination with a name, are sufficient to enable identity fraud and are among the most sensitive categories of personal data recognised in law. They must not appear in AI-accessible files or prompts unless that use has been specifically authorised under a compliant data processing agreement.
Payment and financial account data. Full payment card numbers, bank account numbers, sort codes, routing numbers, and equivalent financial identifiers are subject to specific regulatory frameworks, including the Payment Card Industry Data Security Standard, that impose strict controls on how they may be stored, processed, and transmitted. Cloud-based AI tools are not generally certified under these frameworks, and transmitting this category of data to an AI tool without specific regulatory clearance creates compliance liability that most organisations will not accept.
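Payment card numbers are one of the few categories in this list that can be detected with reasonable reliability, because card numbers carry a check digit computed with the Luhn algorithm. The sketch below, in Python, looks for runs of 13 to 19 digits (optionally separated by spaces or hyphens) and keeps only those that pass the Luhn check. The function names and the candidate pattern are assumptions made for illustration; the check does not detect bank account numbers, sort codes, or routing numbers, and it is not a substitute for the controls required under the Payment Card Industry Data Security Standard.

```python
import re

# Candidate: 13 to 19 digits, each optionally followed by a space or hyphen.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in the text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if passes_luhn(digits):
            hits.append(digits)
    return hits


# 4111 1111 1111 1111 is a widely used test number, not a real account.
note = "Card 4111 1111 1111 1111 on file, ref 4111 1111 1111 1112"
print(find_card_numbers(note))
```

Any hit should block submission of the document until the number has been removed or masked.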
Unencrypted regulated personal data. European data protection law, through the General Data Protection Regulation, establishes specific requirements for the lawful processing of personal data. When personal data is submitted to an AI tool, the AI provider will generally be acting as a data processor under the GDPR framework, and the professional's organisation is responsible for ensuring that this processing is covered by a compliant data processing agreement and has a lawful basis under the Regulation. In the absence of such an agreement, the transmission of personal data to an AI tool may constitute unlawful processing. The implications of this are addressed in greater detail in the role-specific section below.
A Three-Tier Framework for Information Sensitivity
Beyond the absolute exclusions described above, professional information exists on a spectrum of sensitivity. A practical framework for navigating this spectrum divides information into three tiers, each of which implies a different level of caution in AI use.
The first tier: public information. Public information is information that the organisation has deliberately placed in the public domain or would not object to appearing there. Published reports, press releases, publicly available financial statements, product descriptions, biographical information about senior executives that has been published with their consent, and publicly available regulatory filings all fall within this category. AI tools may be used freely with this category of information, and there is no restriction on submitting it to standard commercial AI tools regardless of the tier of service.
The second tier: internal information. Internal information is information that is used within the organisation in the conduct of its work but that has not been placed in the public domain and would not be placed there voluntarily. Operational procedures, internal strategic documents, team communications, internal financial analyses, and project planning documents typically fall within this category. AI tools may be used with this category of information when the AI provider's standard commercial terms include adequate data handling provisions. This means, at a minimum, that the provider does not use submitted data to train its models and that the data is not accessible to the provider's staff except under defined circumstances. Professionals should verify that these provisions are in place before submitting internal information to any AI tool and should not assume their existence without checking.
The third tier: confidential information. Confidential information is information subject to specific legal or contractual obligations of confidentiality, or information whose disclosure would cause material harm to the organisation, its clients, or the individuals to whom it relates. Attorney-client privileged communications, protected health information, material non-public financial information, trade secrets, personally identifiable information about individuals in the scope of data protection regulation, and information subject to specific contractual confidentiality obligations all fall within this category. AI tools may be used with this category of information only under the terms of a specifically negotiated enterprise agreement that includes compliant data processing provisions appropriate to the category of information involved. Standard commercial terms, even at enterprise subscription tier, are unlikely to satisfy the requirements for this category without specific negotiation.
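The three tiers can be expressed as a simple lookup from sensitivity level to the minimum provider terms required. The following Python sketch is one way to encode that mapping under stated assumptions: the names Sensitivity, ProviderTerms, MINIMUM_TERMS, and may_submit are hypothetical, and the three provider levels are a simplification of the distinctions drawn above. It does not cover the absolute exclusions, which remain prohibited whatever the provider terms, and for confidential information it does not check that the negotiated agreement is appropriate to the specific category of information involved.

```python
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


class ProviderTerms(Enum):
    # Standard consumer product, terms not verified.
    CONSUMER = 1
    # Standard commercial terms, verified: no training on submitted data
    # and staff access limited to defined circumstances.
    COMMERCIAL_VERIFIED = 2
    # Specifically negotiated enterprise agreement with compliant
    # data processing provisions.
    NEGOTIATED_ENTERPRISE = 3


# Minimum provider terms required for each sensitivity tier.
MINIMUM_TERMS = {
    Sensitivity.PUBLIC: ProviderTerms.CONSUMER,
    Sensitivity.INTERNAL: ProviderTerms.COMMERCIAL_VERIFIED,
    Sensitivity.CONFIDENTIAL: ProviderTerms.NEGOTIATED_ENTERPRISE,
}


def may_submit(sensitivity: Sensitivity, terms: ProviderTerms) -> bool:
    """Return True if the provider terms meet the minimum for the tier."""
    return terms.value >= MINIMUM_TERMS[sensitivity].value


print(may_submit(Sensitivity.PUBLIC, ProviderTerms.CONSUMER))
print(may_submit(Sensitivity.INTERNAL, ProviderTerms.CONSUMER))
print(may_submit(Sensitivity.CONFIDENTIAL, ProviderTerms.NEGOTIATED_ENTERPRISE))
```

Encoding the rule as data rather than as scattered conditions makes it easy for a compliance function to review and amend the mapping in one place.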
Role-Specific Compliance Obligations
The three-tier framework provides a general structure, but specific professional roles carry specific legal and regulatory obligations that require additional attention. The following addresses the most significant compliance considerations for the professional roles covered in this programme.
Legal Professionals
Legal professionals in most jurisdictions operate under obligations of confidentiality that are among the most stringent in any profession. In England and Wales, and across most European jurisdictions, the duty of confidentiality owed to clients is not merely a professional courtesy. It is a legal obligation enforced by regulatory bodies and, in many cases, by statute.
Attorney-client privilege and legal professional privilege, which protect confidential communications between lawyers and their clients from disclosure, are particularly significant in the context of AI tools. Privilege is a property of the communication, not simply of the document that contains it. Submitting a privileged communication to a third-party AI tool risks waiving the privilege by voluntarily disclosing the communication to a party outside the professional relationship. Whether this waiver actually occurs in a specific jurisdiction, under specific circumstances, and under a specific AI provider's terms of service is a question of law that has not been definitively resolved in most European jurisdictions. In the absence of clear legal guidance, the conservative position, which most professional indemnity insurers and regulatory bodies would support, is to treat privileged communications as excluded from AI tools operating under standard commercial terms.
The work product doctrine, where it applies, provides a parallel protection for documents prepared in anticipation of litigation that may also be affected by disclosure to third-party AI tools.
Insurance Professionals
Insurance professionals handle large volumes of personal data relating to policyholders, claimants, and third parties involved in insured events. Under the General Data Protection Regulation, this data is personal data subject to the full requirements of the Regulation, including lawful basis for processing, data subject rights, data minimisation obligations, and the requirement for a compliant data processing agreement with any third-party processor.
Where the personal data relates to health, the obligations are more stringent still. Health data is classified as a special category of personal data under Article 9 of the GDPR, and its processing is subject to additional restrictions that require either the explicit consent of the data subject or reliance on one of a limited number of other specified legal bases. Medical records, clinical reports, health assessments, and any other documentation that reveals information about the physical or mental health of an identified or identifiable individual are special category data and must be handled accordingly.
Insurance professionals using AI tools to assist with claims processing must ensure that any AI tool involved in processing personal data, including special category health data, is covered by a compliant data processing agreement that satisfies GDPR requirements. The use of AI tools that lack this agreement for processing personal claims data represents a potential regulatory breach with significant consequences under European data protection law.
Financial Professionals
Financial professionals face a distinct category of regulatory obligation relating to material non-public information. Material non-public information is information about a publicly listed company that has not been made available to the market and that would, if disclosed, be likely to affect the price of the company's securities. Trading on the basis of material non-public information constitutes market abuse under the EU Market Abuse Regulation (Regulation (EU) No 596/2014) and equivalent legislation in other jurisdictions.
The relevance of this obligation to AI tool use is direct. A financial analyst who submits a prompt to an AI tool that contains material non-public information about a company's unreleased financial results, a proposed transaction, or a significant strategic development has transmitted that information to a third party. Whether this transmission constitutes unlawful disclosure under applicable market abuse rules depends on the specific circumstances, the jurisdiction, and the terms of the AI provider's data handling agreement. The conservative and professionally responsible position is that material non-public information must never be submitted to any AI tool operating under standard commercial terms, and that any AI tool used in contexts where material non-public information is involved must be specifically cleared for that use by the organisation's compliance function.
All Professionals
Across all professional roles, trade secrets and competitively sensitive business information warrant treatment as confidential information under the three-tier framework. The business strategies, client lists, proprietary methodologies, pricing structures, product development plans, and other information that gives an organisation its competitive position represent assets of significant commercial value. Their disclosure, whether intentional or inadvertent, can cause material harm to the organisation that holds them.
The applicable legal framework in Europe is Directive (EU) 2016/943 on the protection of undisclosed know-how and business information, which has been implemented in national law across European Union member states. This Directive provides legal protection for trade secrets, provided that reasonable steps have been taken to keep them confidential. The use of AI tools under terms that do not provide adequate data protection for this category of information may be inconsistent with the reasonable steps requirement, potentially affecting the organisation's ability to enforce trade secret protection.
A Practical Decision Framework
The regulatory and legal landscape described in this section is complex, and professionals working under time pressure need a practical way to apply it without requiring legal analysis for every document they consider submitting to an AI tool. The following framework provides a structured approach to this decision that is applicable across roles and jurisdictions.
The first question to ask about any document or information before submitting it to an AI tool is which tier of the sensitivity framework it belongs to. Public information requires no further analysis. Internal information requires confirmation that the AI provider's data handling terms are adequate. Confidential information requires a specifically negotiated enterprise agreement and a specific compliance review.
Where tier classification is uncertain, the second question is the disclosure test: if the content of this document appeared in a regulatory inquiry, a court proceeding, or a journalistic investigation, would the organisation face legal liability, regulatory sanction, reputational damage, or harm to the individuals whose information it contains? If the answer is yes, or even if the answer is uncertain, the document should be treated as confidential and handled accordingly.
Where the disclosure test does not indicate that the document is confidential but uncertainty remains about the specific regulatory implications, the appropriate action is to seek guidance from the organisation's legal counsel, data protection officer, or compliance function before proceeding. This is not a counsel of excessive caution. It is the responsible exercise of professional judgment in a regulatory environment that is still developing its position on AI tool use. The cost of seeking guidance before submitting sensitive information is small. The cost of regulatory breach, privilege waiver, or trade secret exposure is potentially severe.
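The three questions above can be summarised as a short decision routine. The Python sketch below is offered as an aid to memory rather than as a compliance tool. The function name next_step, its parameters, and the wording of the returned actions are assumptions made for illustration. It also assumes that the professional supplies a best-guess tier even when the classification is uncertain, and it does not cover the absolute exclusions, which must be checked separately.

```python
from typing import Optional

ACTIONS = {
    "public": "no further analysis required",
    "internal": "confirm that the provider's data handling terms are adequate",
    "confidential": "requires a negotiated enterprise agreement and compliance review",
}

SEEK_GUIDANCE = "seek guidance from legal counsel, the DPO, or compliance"


def next_step(
    tier: str,
    tier_is_certain: bool = True,
    disclosure_harm: Optional[bool] = None,
    regulatory_doubt: bool = False,
) -> str:
    """Return the action implied by the decision framework.

    tier: best-guess tier, one of 'public', 'internal', 'confidential'.
    disclosure_harm: answer to the disclosure test; None means uncertain.
    regulatory_doubt: True if regulatory implications remain unclear.
    """
    if not tier_is_certain:
        # Question 2: a "yes" or an uncertain answer means confidential.
        if disclosure_harm is None or disclosure_harm:
            tier = "confidential"
        # Question 3: no harm identified, but regulatory doubt remains.
        elif regulatory_doubt:
            return SEEK_GUIDANCE
    # Question 1: act according to the (possibly upgraded) tier.
    return ACTIONS[tier]


print(next_step("public"))
print(next_step("internal", tier_is_certain=False))
print(next_step("internal", tier_is_certain=False,
                disclosure_harm=False, regulatory_doubt=True))
```

The routine deliberately errs on the side of caution: an unanswered disclosure test is treated in the same way as an answer of yes.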
The fundamental principle underlying this entire section is that the professional using an AI tool is responsible for the information they submit to it. The AI provider's terms of service define what the provider will do with that information. They do not define, limit, or replace the professional's own legal and ethical obligations regarding the information in their care. The responsibility for ensuring that those obligations are met rests with the professional, and the framework in this section is designed to support that responsibility rather than to substitute for it.
You have reached the end of this module, and you now have a clear and practical understanding of the foundational layer that makes AI assistance genuinely useful in professional work.
In this module, we guided you through the relationship between information organisation and AI capability. You learned why AI tools produce generic outputs when given no context, and why the solution to that problem lies not in the technology itself but in how you prepare and structure the information the technology can access. You explored the two distinct failure modes that undermine AI usefulness in practice: information that cannot be found because it has been poorly named or badly located, and information that the AI has no means of accessing because it was never written down at all.
From that foundation, you worked through the four structural layers of a personal knowledge base. You now understand how to construct file names that communicate identity, ownership, document type, and chronology in a form that both human colleagues and AI tools can interpret reliably. You understand how folder structures create contextual relationships between documents, why shallow hierarchies outperform deep ones, and how the choice of organising principle at the top level determines the coherence of everything beneath it.
You have also examined the layer of the knowledge base that most professionals neglect: the written context that turns a collection of well-named documents into an intelligible picture of the work. You understand what README documents are for, what the four types of context document capture and why each is necessary, and why proximity matters when deciding where context documents should be stored.
You have seen that a knowledge base requires active maintenance to remain useful, and you have a concrete weekly practice through which that maintenance can be sustained with minimal effort. You also have a clear framework for the privacy and security boundaries that govern which categories of professional information may be made available to AI tools under standard commercial terms, which require enterprise-grade agreements, and which must be excluded entirely.
From here, we move to the question of which AI tools are best suited to the work you have now prepared your knowledge base to support. Module 4.2 addresses the AI model landscape: how to understand the difference between the major models available to professional users, how to evaluate their respective strengths against specific task requirements, and how to make informed decisions about tool selection rather than defaulting to familiarity or convenience.