4.2

Hosted and Self-Hosted Deployment

12 min

The Deployment Decision and Why It Matters

The distinction between a closed source and an open source model, addressed in Section 2, concerns the intellectual property status of the model weights: whether they are retained by the developing organisation or released publicly. The distinction between hosted and self-hosted deployment concerns something different: where the model runs, on whose infrastructure, and under whose operational control. These two distinctions are related but not identical, and understanding them separately is necessary for making informed decisions about AI deployment in professional environments.

The deployment model is not a decision that most individual professionals make directly. In many organisations, the deployment model has already been determined by the technology team, procurement function, or senior leadership before individual professionals begin using AI tools. Nevertheless, understanding what deployment model is in use, and what its implications are, is part of the professional AI literacy that this programme seeks to build. A professional who understands the deployment context of the AI tools they use is better positioned to exercise appropriate judgment about what information to submit, what protections are in place, and what questions to raise with their organisation's technology or compliance team when the situation requires it.

The three deployment configurations addressed in this section are hosted deployment, self-hosted deployment, and the enterprise agreement arrangements that occupy a middle ground between them. Each represents a distinct combination of capability, control, cost, and compliance implication, and each is appropriate in different professional contexts.

Hosted Deployment: The Standard Configuration for Professional AI Use

Hosted deployment, sometimes described as the Software as a Service (SaaS) model, is the configuration under which the overwhelming majority of professionals currently access AI tools. In hosted deployment, the AI model runs on servers owned and operated by the AI provider. The professional user accesses the model through a web-based interface, a mobile application, or an application programming interface (API), depending on the nature of their use. The computational work of processing prompts and generating responses happens on the provider's infrastructure, and the results are returned to the user over the internet.

The defining characteristic of hosted deployment from the user's perspective is that the operational complexity of running the model is entirely managed by the provider. The professional does not need to provision servers, manage software dependencies, configure security controls, monitor system performance, or apply model updates. All of these functions are handled by the provider as part of the service. The professional pays for access in one of two ways: through a subscription that provides a defined level of usage, or through a consumption-based model in which costs are proportional to usage volume. In return, the professional receives a service that is maintained, updated, and operated by an organisation with the expertise and infrastructure to do so at scale.

This operational simplicity is the primary advantage of hosted deployment, and it is a substantial one. Building the infrastructure to run a capable AI model reliably, securely, and at professional performance levels requires significant investment in hardware, software engineering, security architecture, and operational capacity. For the vast majority of professional services organisations, the cost and complexity of doing this independently exceed the cost of accessing equivalent capability through a hosted service. Hosted deployment allows organisations of any size, including individual practitioners, to access frontier AI capability without the capital investment and technical capability that self-hosting requires.

Beyond the reduction in operational burden, hosted deployment offers specific advantages that are relevant to professional users.

Access to current model versions. Hosted services provide access to the current version of the model maintained by the provider. When the provider releases an improved version, users typically gain access automatically or with minimal transition effort. The professional does not need to manage model upgrades, assess the implications of version changes, or coordinate the migration of workflows from one model version to another. The service remains current without active management from the user.

Accessibility across devices and locations. A hosted AI service is accessible from any device with internet connectivity and the appropriate credentials. For professionals who work across multiple locations, travel frequently, or collaborate with colleagues in different offices or jurisdictions, this accessibility is practically significant. The AI capability available in the office is the same capability available in a client meeting room, a hotel room, or a remote working location, without requiring the installation of software or the configuration of access to a private network.

Predictable cost structures. Hosted deployment typically offers cost structures that are more predictable than the variable and often underestimated costs of self-hosted deployment. Subscription pricing provides a fixed monthly or annual cost for a defined level of capability. Consumption-based pricing allows costs to scale with usage, which is advantageous for organisations whose AI use is irregular or growing, although the total then depends on how heavily the service is used. Both models are easier to budget for than the combination of capital expenditure on hardware and ongoing operational costs that self-hosting involves.
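The difference between the two pricing models can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that consumption-based pricing is billed per token, and every price and usage figure is a hypothetical placeholder rather than a quotation from any provider.

```python
# Illustrative only: all prices and usage volumes are hypothetical
# placeholders, and per-token billing is an assumption of this sketch.

def subscription_cost(seats: int, price_per_seat: float) -> float:
    """Fixed monthly cost: one flat fee per licensed user."""
    return seats * price_per_seat


def consumption_cost(
    input_tokens: int,
    output_tokens: int,
    price_per_million_input: float,
    price_per_million_output: float,
) -> float:
    """Monthly cost proportional to usage volume."""
    return (
        input_tokens / 1_000_000 * price_per_million_input
        + output_tokens / 1_000_000 * price_per_million_output
    )


if __name__ == "__main__":
    # A ten-person team on a flat subscription (hypothetical price).
    print(subscription_cost(seats=10, price_per_seat=20.0))
    # The same team billed by usage (hypothetical volumes and prices).
    print(consumption_cost(2_000_000, 500_000, 3.0, 15.0))
```

Running the same comparison with an organisation's own expected usage shows which model is cheaper for it, and how sensitive the consumption-based figure is to growth in use.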

The primary consideration that professionals must apply to hosted deployment is the data handling implication of transmitting information to an external provider's infrastructure. When a prompt is submitted to a hosted AI service, the text of that prompt, and any documents attached to it, travel from the user's device to the provider's servers over the internet. What happens to that information once it reaches the provider's infrastructure depends on the applicable terms of service. Whether the data is logged, retained, used for model training, or accessible to the provider's staff for quality assurance or safety review varies between providers, between service tiers, and between standard commercial terms and enterprise agreements. The professional obligation to understand these terms before submitting sensitive information, addressed in detail in Section 6 of Module 4.1, is directly relevant here.

Self-Hosted Deployment: Control, Capability, and Cost

Self-hosted deployment is the configuration in which the AI model runs on infrastructure owned, operated, and controlled by the deploying organisation rather than by an AI provider. The organisation downloads the model, provisions the required hardware or cloud compute environment, configures the operational infrastructure, and manages all aspects of the model's operation. From the perspective of data flow, prompts and responses stay within the environment the organisation controls: the prompt is submitted to a model running on the organisation's own servers, processed there, and the response is returned without the data passing through an AI provider's systems. Where the organisation rents cloud compute rather than owning the hardware, the data still resides on the cloud operator's infrastructure, so the terms of that arrangement remain relevant.

Self-hosted deployment is only possible with open source models or, in specific cases, with closed source models under specialised enterprise licensing arrangements that permit on-premises deployment. For the major closed source models addressed in Section 3, self-hosting through standard commercial terms is not available. Claude, GPT-4o, and Gemini are accessible only through hosted infrastructure operated by their providers or by partner cloud platforms, under standard and enterprise commercial agreements. Grok's deployment options are similarly constrained to the provider's infrastructure for most professional users. Self-hosting, in practice, means deploying an open source model.

The fundamental advantage of self-hosted deployment is data sovereignty: the assurance that sensitive information submitted to the AI model never leaves the organisation's controlled infrastructure. For organisations operating under regulatory requirements that restrict the transfer of certain categories of data outside a defined perimeter, this is not a preference but a compliance requirement. Healthcare organisations subject to regulations governing the processing of patient data, financial institutions subject to requirements governing the handling of client financial information, legal practices operating under professional confidentiality obligations, and any organisation operating under data residency requirements that specify where data must be processed and stored may find that self-hosted deployment is the only configuration that satisfies their regulatory obligations for certain categories of use.

Beyond data sovereignty, self-hosted deployment offers the possibility of deep customisation through fine-tuning, described in Section 2 of this module, and the ability to configure the model's behaviour, safety parameters, and integration with internal systems in ways that a hosted service does not permit. An organisation that has fine-tuned an open source model on its proprietary data, deployed it on its own infrastructure, and integrated it with its internal systems has built an AI capability that is specifically adapted to its context in ways that no hosted commercial service can replicate.

The costs and challenges of self-hosted deployment are substantial and should be assessed with care before the decision is made to pursue this path.

Computational infrastructure. Running a capable large language model requires significant computational resources. The hardware requirements depend on the size of the model, the volume of requests it will handle, and the response time performance that the organisation requires. For models in the range that would be considered professionally capable, meaning models with tens of billions of parameters or more, the hardware required typically consists of multiple high-end graphics processing units with large memory capacity. These units are expensive to purchase, consume substantial power, require cooling infrastructure, and have a limited operational lifespan. For organisations that prefer not to invest in physical hardware, cloud computing services offer access to equivalent computational capacity on a variable-cost basis, but the ongoing cloud compute costs for a capable model serving professional-volume requests can be substantial.
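A rough sense of the hardware involved can be obtained from the size of the model alone. The sketch below estimates the memory needed to hold a model's weights as the parameter count multiplied by the bytes used per parameter, then divides by the memory of a single accelerator. The 80 GB accelerator size and the 20 per cent allowance for runtime overhead are assumptions chosen for illustration; real requirements depend on the serving software, the request volume, and the length of the prompts being processed.

```python
import math

# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the model weights alone, in gigabytes.

    One billion parameters at one byte each occupy roughly one gigabyte.
    """
    return params_billions * BYTES_PER_PARAM[precision]


def gpus_needed(
    params_billions: float,
    precision: str = "fp16",
    gpu_memory_gb: float = 80.0,      # assumed accelerator size
    overhead_fraction: float = 0.2,   # assumed allowance for runtime overhead
) -> int:
    """Rough lower bound on the number of accelerators required."""
    required = weight_memory_gb(params_billions, precision) * (1 + overhead_fraction)
    return math.ceil(required / gpu_memory_gb)


if __name__ == "__main__":
    print(weight_memory_gb(70, "fp16"))
    print(gpus_needed(70, "fp16"))
    print(gpus_needed(7, "int4"))
```

The estimate is a floor rather than a sizing recommendation: serving many simultaneous users, or meeting a strict response-time target, generally requires more capacity than the weights alone imply.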

Engineering and operational expertise. Deploying an open source model in a professional environment is a significant engineering undertaking. It requires expertise in machine learning infrastructure, containerisation and orchestration technologies, network security architecture, model serving frameworks, performance monitoring, and incident response. These skills are not widely distributed within professional services organisations. Acquiring them through hiring requires competing for engineering talent in a market where AI engineering skills are in high demand. Acquiring them through contracting introduces a dependency on external parties that may itself create data handling considerations. The ongoing operational capacity to maintain the deployment, apply security patches, monitor for performance issues, and respond to failures must be sustained as a continuous commitment rather than treated as a one-time project.

Model governance and safety responsibility. When an organisation uses a hosted closed source model, the provider retains responsibility for the model's safety properties, the alignment work that governs its behaviour, and much of the regulatory compliance attached to the model itself, although the organisation remains responsible for how it uses the service. When an organisation deploys a self-hosted open source model, it assumes responsibility for all of these properties. Under the European Union's Artificial Intelligence Act, this responsibility is not abstract. For AI systems used in high-risk applications, the Act requires conformity assessments, technical documentation, human oversight mechanisms, and registration of the system with the relevant authorities, and an organisation that puts a self-hosted or substantially modified model into service for such a purpose may find that these obligations fall on it rather than on an external provider. The compliance burden associated with self-hosted deployment of AI in regulated professional contexts is therefore substantially greater than the compliance burden associated with using a hosted service from a provider that has invested in AI Act compliance as part of its commercial offering.

Enterprise Agreements: The Structured Middle Ground

The distinction between hosted and self-hosted deployment can suggest a binary choice between the convenience and accessibility of hosted services on standard commercial terms and the control and data sovereignty of self-hosted deployment with its associated costs and complexity. In practice, the major closed source AI providers offer a range of enterprise agreement structures that provide significantly stronger data protection, operational control, and compliance assurance than standard commercial terms, while retaining the operational simplicity of hosted deployment.

Enterprise agreements are negotiated contracts between the AI provider and the deploying organisation that supplement or replace the standard terms of service applicable to commercial subscriptions. The specific provisions available through enterprise agreements vary between providers and depend on the scale and nature of the deployment being negotiated, but the following categories of provision are commonly available.

Data processing agreements. A data processing agreement is a contract that establishes the terms under which the AI provider processes personal data on behalf of the deploying organisation. Under the General Data Protection Regulation (GDPR), such an agreement is required before personal data may be lawfully submitted to an external processor. A compliant data processing agreement specifies the purposes for which the data may be processed, the technical and organisational security measures the processor applies, the conditions under which sub-processors may be engaged, the process for responding to data subject rights requests, and the arrangements for data return or deletion at the end of the agreement. Organisations subject to the GDPR that wish to use AI tools to process personal data should confirm that a compliant data processing agreement is in place before doing so.

Exclusion from model training. Standard commercial terms for many AI products include provisions that permit the provider to use submitted data to improve their models. For professional users submitting client information, confidential analysis, or proprietary content, this provision may be unacceptable. Enterprise agreements typically include an explicit commitment that data submitted under the agreement will not be used for model training or improvement purposes. This commitment is among the most important provisions to verify before submitting sensitive professional information to any hosted AI service.

Data residency guarantees. For organisations subject to requirements specifying where data must be processed and stored, enterprise agreements with major providers often include options for data processing to be confined to infrastructure located within a specified geographic region. In the European context, this typically means infrastructure located within the European Union or the European Economic Area. Data residency guarantees do not provide the same degree of assurance as self-hosted deployment on the organisation's own infrastructure, but they significantly reduce the data sovereignty concerns associated with standard hosted deployment by ensuring that data remains within a defined regulatory jurisdiction.

Single-tenant infrastructure options. Standard hosted deployment typically involves the organisation's data being processed on shared infrastructure alongside data from other customers, with logical separation rather than physical isolation. For organisations handling particularly sensitive information, some providers offer single-tenant configurations in which the processing infrastructure is dedicated to a single customer. This provides a higher degree of physical isolation than standard shared-infrastructure deployment, though the model itself remains hosted and operated by the provider.

Audit logging and compliance support. Enterprise agreements typically include provisions for enhanced logging of system activity, which supports the internal audit and compliance functions that regulated organisations are required to maintain. The ability to produce records of what was submitted to the AI system, when, by whom, and what responses were generated is relevant both for internal governance purposes and for demonstrating compliance with applicable regulations in the event of a regulatory inquiry.
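The kind of record described under audit logging can be pictured as a small structured entry written for each interaction. The sketch below is a hypothetical illustration: the field names, the `AuditRecord` class, and the choice of one JSON object per line are assumptions of this example, not the logging format of any particular provider.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    """One logged interaction: what was submitted, when, by whom, and the reply."""
    timestamp: str   # when the request was made (ISO 8601, UTC)
    user_id: str     # who submitted it
    model: str       # which model or service handled it
    prompt: str      # what was submitted
    response: str    # what was generated


def make_audit_record(
    user_id: str,
    model: str,
    prompt: str,
    response: str,
    now: Optional[datetime] = None,
) -> AuditRecord:
    """Build a record, stamping it with the current UTC time unless one is given."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return AuditRecord(moment.isoformat(), user_id, model, prompt, response)


def to_json_line(record: AuditRecord) -> str:
    """Serialise a record as a single JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = make_audit_record(
        "analyst-17", "example-model", "Summarise clause 4.", "Clause 4 states ..."
    )
    print(to_json_line(record))
```

Because the log itself then contains whatever was submitted, an organisation may prefer to store a reference or a hash in place of the full prompt text where the content is sensitive.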

Selecting the Appropriate Deployment Configuration

The choice between hosted deployment under standard terms, hosted deployment under an enterprise agreement, and self-hosted deployment should be made on the basis of a clear assessment of the organisation's regulatory obligations, its data handling requirements, its technical capability, and the specific AI use cases it intends to support.

For individual professionals and small practices beginning to integrate AI into their work, hosted deployment under the standard commercial terms of a reputable provider is the appropriate starting point, provided that client-confidential or regulated information is kept out of prompts until the applicable terms have been checked. The operational simplicity, accessibility, and continuous improvement of hosted services align well with the needs of organisations that are still developing their AI practice and do not yet have the scale or the technical infrastructure to justify more complex deployment arrangements.

For organisations that have reached a scale of AI use at which the data handling implications of standard commercial terms have become a compliance concern, or that are beginning to use AI in contexts involving personal data or other regulated information categories, the appropriate step is to engage with the relevant providers about enterprise agreement terms. The provisions available through enterprise agreements, particularly the data processing agreement and the exclusion from model training, address the most significant data handling concerns associated with hosted deployment while preserving the operational advantages that make hosted services attractive.

For organisations with specific data residency requirements that cannot be satisfied through enterprise agreement provisions, or with the scale and technical capability to justify the investment in self-hosted deployment, the open source model route described in Section 2 of this module warrants serious evaluation. This is a significant operational and governance commitment, and the decision to pursue it should be made with a clear understanding of the total cost of deployment, the regulatory obligations it creates, and the ongoing investment required to maintain it at professional standards.
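The three cases set out in this section can be summarised as a simple decision rule. The sketch below is a deliberately simplified illustration: the `OrgProfile` fields and the returned labels are hypothetical names introduced here, and a real decision would weigh cost, regulatory advice, and the specific use cases rather than three yes-or-no answers.

```python
from dataclasses import dataclass


@dataclass
class OrgProfile:
    """Hypothetical summary of the factors discussed in this section."""
    processes_regulated_data: bool             # personal data or other regulated categories
    residency_unmet_by_enterprise_terms: bool  # residency needs no enterprise agreement satisfies
    has_scale_and_engineering_capability: bool # scale and skills to justify self-hosting


def recommend_deployment(profile: OrgProfile) -> str:
    """Map an organisation's profile to the configuration it should evaluate first."""
    if profile.residency_unmet_by_enterprise_terms or profile.has_scale_and_engineering_capability:
        return "evaluate self-hosted deployment"
    if profile.processes_regulated_data:
        return "hosted deployment under an enterprise agreement"
    return "hosted deployment under standard terms"


if __name__ == "__main__":
    small_practice = OrgProfile(False, False, False)
    print(recommend_deployment(small_practice))
```

The ordering of the checks mirrors the text: self-hosting is considered only where enterprise terms cannot meet the requirement or where the organisation has the scale and capability to sustain it, and standard terms remain the default for everyone else.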