Deployment Modes

One Operating Layer.
Every Deployment Mode.

BrainPack runs the same AI operating layer across six deployment modes — from frontier models in the public cloud to fully air-gapped inference. Pick the mode that fits each workload; the layer stays the same.

Talk to an Architect

Why modes matter

One layer. Many trust boundaries.

BrainPack treats deployment mode as a routing decision, not an architecture decision. The operating layer, agent library, and governance policy stay constant. Only where inference executes changes — per workload, per data class.

Deployment modes

Six Modes. One Governance Policy.

Each mode is a full BrainPack deployment — same operating layer, same agents, different trust boundary. Open any mode for the detailed when-right / when-wrong breakdown.

Public Cloud

Frontier models — Claude, GPT, Gemini — through governed API endpoints. Full audit, routing, and policy controls. Ships in days, not quarters.

Zero Data Retention (ZDR)

The same frontier models on enterprise no-retention contracts. Frontier model quality without the provider retaining your prompts or outputs.

Self-Hosted

Open-source LLMs — Llama, Mistral, Qwen, DeepSeek — on dedicated GPU. No third-party AI provider in the data path.

On-Premise

AI inside your own data center. Your hardware, your network, your audit perimeter. Operated by BrainPack.

Air-Gapped

Zero internet, full isolation for classified, sovereign, and critical-infrastructure workloads. Updates via controlled physical media.

Sovereign

Data, inference, and operator inside one national jurisdiction. EU AI Act, SecNumCloud, and C5 frameworks respected.

FAQ

Frequently Asked Questions

Can we run more than one deployment mode at once?

Yes. Most BrainPack deployments mix several modes. The Govern layer routes each workload to the appropriate mode by data classification — public cloud for general productivity, ZDR for customer data, self-hosted for IP-sensitive analysis, on-premise for regulated workloads, air-gapped for classified operations — under one policy.
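As a rough illustration only, the classification-to-mode mapping described above can be pictured as a lookup table. This is a minimal sketch, not BrainPack's actual policy format or API: the `Mode` enum, the data-class labels, and the `route` function are hypothetical names chosen for this example, and the pairings simply mirror the examples in the answer above.

```python
from enum import Enum


class Mode(Enum):
    """The six deployment modes named on this page (identifiers are hypothetical)."""
    PUBLIC_CLOUD = "public-cloud"
    ZDR = "zdr"
    SELF_HOSTED = "self-hosted"
    ON_PREMISE = "on-premise"
    AIR_GAPPED = "air-gapped"
    SOVEREIGN = "sovereign"


# Hypothetical data-class labels; the pairings follow the examples in the answer above.
ROUTING_POLICY = {
    "general-productivity": Mode.PUBLIC_CLOUD,
    "customer-data": Mode.ZDR,
    "ip-sensitive": Mode.SELF_HOSTED,
    "regulated": Mode.ON_PREMISE,
    "classified": Mode.AIR_GAPPED,
}


def route(data_class: str) -> Mode:
    """Return the mode mapped to a data class; refuse to guess for unmapped classes."""
    try:
        return ROUTING_POLICY[data_class]
    except KeyError:
        raise ValueError(f"no deployment mode mapped for data class {data_class!r}") from None
```

In this sketch an unmapped data class raises an error rather than falling back to a default mode, on the assumption that a governed router should fail closed.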

How do we choose the right mode?

You do not choose per request — the layer does. You define data classes and regulatory constraints once; BrainPack maps each class to a mode and enforces it automatically. An architect walks through your data classes with you before anything ships.

Is BPU billed by the hour?

No. BPU is capacity-based, not time-based. Your organization allocates BPUs according to the execution capacity it needs. There are no hourly rates and no separate charges for support, customization, training, or development. All work is covered within the allocated BPU.

Is the deployment mode locked in?

No. Workloads can move between modes as requirements change. Because the operating layer and agent library are mode-independent, switching where inference runs does not mean rebuilding what runs.
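One way to picture mode independence is a workload record in which the mode is just one field. The sketch below is an assumption made for illustration, not BrainPack's data model: `Workload`, `move_to_mode`, and the example names are hypothetical. Moving the workload changes where inference runs and leaves the agent untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Workload:
    """Hypothetical record: what runs (agent) and where inference runs (mode)."""
    name: str
    agent: str
    mode: str


def move_to_mode(workload: Workload, new_mode: str) -> Workload:
    """Return a copy that differs only in where inference executes."""
    return replace(workload, mode=new_mode)


# Example: a workload moves from ZDR to on-premise; the agent is unchanged.
contract_review = Workload(name="contract-review", agent="legal-summarizer", mode="zdr")
moved = move_to_mode(contract_review, "on-premise")
```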

Does mode choice change pricing?

No. There is no separate inference bill per mode. BPU capacity funds the platform, team, integrations, agents, governance, and inference across every deployment mode.

Which mode ships fastest?

Public cloud — there is no infrastructure stand-up, so first capabilities ship within 1–2 weeks. Self-hosted, on-premise, air-gapped, and sovereign take longer to provision but follow the same operating model.

Map your workloads to the right modes.

Bring your data classes and regulatory constraints. An architect will show you which workloads belong on which mode — and how the routing policy holds across all of them.

Deploy Your First BPU

Talk to BrainPack