Deployment Modes
One Operating Layer.
Every Deployment Mode.
BrainPack runs the same AI operating layer across six deployment modes — from frontier models in the public cloud to fully air-gapped inference. Pick the mode that fits each workload; the layer stays the same.
One layer. Many trust boundaries.
BrainPack treats deployment mode as a routing decision, not an architecture decision. The operating layer, agent library, and governance policy stay constant. Only where inference executes changes — per workload, per data class.
Six Modes. One Governance Policy.
Each mode is a full BrainPack deployment — same operating layer, same agents, different trust boundary. Open any mode for the detailed when-right / when-wrong breakdown.
Public cloud: Frontier models (Claude, GPT, Gemini) through governed API endpoints. Full audit, routing, and policy controls. Ships in days, not quarters.
ZDR (zero data retention): The same frontier models on enterprise no-retention contracts. Model quality without the audit-trail exposure.
Self-hosted: Open-source LLMs (Llama, Mistral, Qwen, DeepSeek) on dedicated GPUs. No third-party AI provider in the data path.
On-premise: AI inside your own data center. Your hardware, your network, your audit perimeter. Operated by BrainPack.
Air-gapped: Zero internet, full isolation for classified, sovereign, and critical-infrastructure workloads. Updates via controlled physical media.
Sovereign: Data, inference, and operator inside one national jurisdiction. EU AI Act, SecNumCloud, and C5 frameworks respected.
Frequently Asked Questions
Yes. Most BrainPack deployments mix several modes. The Govern layer routes each workload to the appropriate mode by data classification — public cloud for general productivity, ZDR for customer data, self-hosted for IP-sensitive analysis, on-premise for regulated workloads, air-gapped for classified operations — under one policy.
You do not choose per request — the layer does. You define data classes and regulatory constraints once; BrainPack maps each class to a mode and enforces it automatically. An architect walks through your data classes with you before anything ships.
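As a rough illustration of the routing described above, the class-to-mode mapping can be pictured as a lookup table that is defined once and consulted for every workload. This is a minimal sketch, not BrainPack's actual API: the names `Mode`, `ROUTING_POLICY`, and `route`, the data-class labels, and the choice to reject unknown classes are all assumptions made for the example.

```python
from enum import Enum


class Mode(Enum):
    """The six deployment modes named on this page (identifiers are hypothetical)."""
    PUBLIC_CLOUD = "public-cloud"
    ZDR = "zdr"
    SELF_HOSTED = "self-hosted"
    ON_PREMISE = "on-premise"
    AIR_GAPPED = "air-gapped"
    SOVEREIGN = "sovereign"


# Hypothetical data-class labels, taken from the examples in the answer above.
# A real policy would be defined with an architect, not hard-coded like this.
ROUTING_POLICY = {
    "general-productivity": Mode.PUBLIC_CLOUD,
    "customer-data": Mode.ZDR,
    "ip-sensitive": Mode.SELF_HOSTED,
    "regulated": Mode.ON_PREMISE,
    "classified": Mode.AIR_GAPPED,
}


def route(data_class: str) -> Mode:
    """Return the deployment mode for a data class.

    Assumption: an unmapped data class is rejected rather than sent to a
    default mode, so nothing runs outside an explicitly chosen boundary.
    """
    try:
        return ROUTING_POLICY[data_class]
    except KeyError:
        raise ValueError(f"no mode defined for data class {data_class!r}") from None


if __name__ == "__main__":
    for data_class in ("customer-data", "classified"):
        print(data_class, "->", route(data_class).value)
```

In this sketch, moving a workload to a different mode amounts to changing one entry in the table; the code that calls `route` does not change.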
No. BPU is capacity-based, not time-based. Your organization allocates BPUs based on the execution capacity it needs. There are no hourly rates and no separate charges for support, customization, training, or development. All work is covered within the allocated BPUs.
No. Workloads can move between modes as requirements change. Because the operating layer and agent library are mode-independent, switching where inference runs does not mean rebuilding what runs.
No separate inference bill per mode. BPU capacity funds the platform, team, integrations, agents, governance, and inference across every deployment mode.
Public cloud. There is no infrastructure to stand up, so the first capabilities ship within 1–2 weeks. Self-hosted, on-premise, air-gapped, and sovereign modes take longer to provision but follow the same operating model.
Map your workloads to the right modes.
Bring your data classes and regulatory constraints. An architect will show you which workloads belong on which mode — and how the routing policy holds across all of them.