Deployment

Four deployment modes are supported, ordered by typical maturity:

  • Docker Compose: single-node dev / laptop / tiny PoC. The quickstart mode.
  • Kubernetes + Helm: the recommended production default. Multi-replica, HPA, cluster-native.
  • Terraform modules: AWS + GCP stubs for customer-VPC deployments.
  • Air-gapped: offline / banking-core / defense. Helm chart with pinned images.

Cross-cutting concerns

Picking a mode

If you're …                                                 Use
Running on a laptop / demo                                  Docker Compose
Standing up a dev / staging env                             Docker Compose → Kubernetes
Evaluating for an enterprise PoC                            Kubernetes + Helm in your VPC
Deploying in BFSI / healthcare with compliance sensitivity  Kubernetes + Terraform module (customer-VPC)
Running in a zero-egress zone (defense, banking core)       Air-gapped Helm with pinned image bundle
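
The selection table above can be read as a small decision function. The following is a minimal sketch, not part of the product: the `Mode` enum, the `pick_mode` helper, its flag names, and the precedence order (strictest constraint wins) are all assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    """Deployment modes, labelled as in the table above."""
    COMPOSE = "Docker Compose"
    K8S_HELM = "Kubernetes + Helm in your VPC"
    TERRAFORM = "Kubernetes + Terraform module (customer-VPC)"
    AIR_GAPPED = "Air-gapped Helm with pinned image bundle"


def pick_mode(*, zero_egress: bool = False,
              compliance_sensitive: bool = False,
              enterprise_poc: bool = False) -> Mode:
    """Hypothetical helper: map environment constraints to a mode.

    Assumption: when several flags are set, the strictest constraint
    takes precedence (zero egress > compliance > enterprise PoC).
    """
    if zero_egress:
        return Mode.AIR_GAPPED
    if compliance_sensitive:
        return Mode.TERRAFORM
    if enterprise_poc:
        return Mode.K8S_HELM
    # Laptop, demo, and early dev / staging all start on Docker Compose.
    return Mode.COMPOSE


if __name__ == "__main__":
    print(pick_mode(compliance_sensitive=True).value)
```

A dev / staging environment starts at `Mode.COMPOSE` here and is expected to move to Kubernetes later, which matches the "Docker Compose → Kubernetes" row.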

Minimum resources

Mode                        CPU       RAM    Storage
Docker Compose (dev)        4 cores   8 GB   20 GB
Kubernetes (single-node)    6 cores   16 GB  50 GB
Kubernetes (multi-replica)  12 cores  32 GB  200 GB (Postgres) + 500 GB object store

Scale up for LLM workloads: the swarm API itself is lightweight, but training workloads and concurrent batch runs can push RAM usage well beyond these minimums.
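
The minimums above can be turned into a quick preflight check before provisioning. This is only a sketch under stated assumptions: the mode keys, the `Minimum` dataclass, and the `shortfalls` helper are hypothetical names, and the 500 GB object store for the multi-replica mode is noted but not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    cpu_cores: int
    ram_gb: int
    storage_gb: int


# Values copied from the minimum-resources table; the keys are illustrative.
MINIMUMS = {
    "compose-dev": Minimum(cpu_cores=4, ram_gb=8, storage_gb=20),
    "k8s-single-node": Minimum(cpu_cores=6, ram_gb=16, storage_gb=50),
    # Storage here is the Postgres volume only; the 500 GB object store
    # is a separate requirement that this sketch does not check.
    "k8s-multi-replica": Minimum(cpu_cores=12, ram_gb=32, storage_gb=200),
}


def shortfalls(mode: str, cpu_cores: int, ram_gb: int, storage_gb: int) -> list[str]:
    """Return one message per resource that is below the minimum for `mode`."""
    need = MINIMUMS[mode]
    problems = []
    if cpu_cores < need.cpu_cores:
        problems.append(f"CPU: have {cpu_cores} cores, need {need.cpu_cores}")
    if ram_gb < need.ram_gb:
        problems.append(f"RAM: have {ram_gb} GB, need {need.ram_gb} GB")
    if storage_gb < need.storage_gb:
        problems.append(f"Storage: have {storage_gb} GB, need {need.storage_gb} GB")
    return problems


if __name__ == "__main__":
    # A 4-core, 16 GB, 100 GB host falls short on CPU for single-node Kubernetes.
    print(shortfalls("k8s-single-node", cpu_cores=4, ram_gb=16, storage_gb=100))
```

An empty list means the host meets the documented minimums. It says nothing about headroom for LLM training or concurrent batch runs, which need to be sized separately.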