Architecture

This page describes swarm's high-level architecture: the crate structure, module map, data flow between components, and key design decisions.

Crate Structure

Swarm is a single-crate Rust project organized as a Cargo workspace with one member:

swarm/                  # Workspace root
├── Cargo.toml          # Workspace manifest (members = ["swarm"])
└── swarm/              # Main crate
    ├── Cargo.toml      # Crate manifest
    └── src/
        ├── lib.rs      # Module declarations and crate-level docs
        ├── main.rs     # Binary entry point (CLI parsing → orchestrator)
        └── ...         # All modules below

The crate exposes a library (swarm_lib) and a binary (swarm). The binary is a thin wrapper that parses CLI arguments and delegates to the orchestrator.
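
As a rough illustration of the thin-wrapper idea, the sketch below parses a subcommand and its flags into a value that a binary could hand to the orchestrator. It is hand-rolled with the standard library only; the real crate uses clap, and the `Command` enum and `parse_args` function are hypothetical names, not swarm's actual API. The `--init` and `--stash` flags are the ones mentioned in the start flow.

```rust
/// Hypothetical parsed form of the CLI; the real crate derives this with clap.
#[derive(Debug, PartialEq)]
pub enum Command {
    Start { init: bool, stash: bool },
    Stop,
    Status,
}

/// Parses the arguments that follow the program name.
pub fn parse_args(args: &[&str]) -> Result<Command, String> {
    let (name, flags) = args.split_first().ok_or("missing subcommand")?;
    match *name {
        "start" => {
            let mut init = false;
            let mut stash = false;
            for flag in flags {
                match *flag {
                    "--init" => init = true,
                    "--stash" => stash = true,
                    other => return Err(format!("unknown flag: {other}")),
                }
            }
            Ok(Command::Start { init, stash })
        }
        "stop" => Ok(Command::Stop),
        "status" => Ok(Command::Status),
        other => Err(format!("unknown subcommand: {other}")),
    }
}
```

Keeping parsing separate from execution like this is what lets the library half of the crate be tested without going through the binary.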

Module Map

Module            Purpose
cli               CLI argument parsing via clap (commands, flags, subcommands)
config            Settings file loading, validation, and resolution (raw → resolved types)
orchestrator      Top-level session lifecycle: start (13-step flow), stop, status
session           Session ID generation, session.json management, PID-based liveness
agent::state      Agent state machine (AgentState, AgentEvent, SideEffect)
agent::runner     Agent lifecycle loop driver (prompt → spawn → run → repeat)
agent::registry   Central registry of all running agents and their handles
backend           AgentBackend trait abstraction for LLM providers (Anthropic, mock)
prompt            14-section prompt assembly pipeline (build_prompt())
mailbox           SQLite-backed per-agent message broker with threading and urgency
router            Async message router that polls for urgent messages and sends interrupts
tools             Tool trait, ToolRegistry, and all built-in tools
tools::wasm       WASM sandboxed tool execution (feature-gated: wasm-sandbox)
permissions       Permission rules, sets, modes, and evaluation logic
skills            Skill discovery, frontmatter parsing, argument substitution, resolution
mcp               Model Context Protocol client, transport (HTTP/SSE/Stdio), and manager
hooks             Hook configuration, event types, and script execution
worktree          Git worktree creation, cleanup, merging, and recovery
tui               Terminal UI application (agent panels, log viewer, event viewer, input)
liveness          Agent liveness monitoring (idle nudges, stall detection, warnings)
iteration         Iteration engine for repeated task-solving loops
workflow          Workflow pipeline definitions and execution
conversation      Conversation history management
context_window    Context window size tracking and management
supervisor        Supervisor agent logic and merge-focused prompt
tasks             Task system integration
modes             Agent execution modes (code, delegate, etc.)
logging           Structured logging setup
errors            Error types for all subsystems
history           Session history and archiving
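
To make the backend row more concrete, here is a minimal sketch of what a provider abstraction with a mock implementation could look like. Only the trait name `AgentBackend` comes from the module map; the method signatures, `MockBackend`, and `run_once` are assumptions for illustration, and the real trait is presumably async and carries more than a prompt string.

```rust
use std::collections::VecDeque;

/// Hypothetical shape of the provider abstraction (synchronous for brevity).
pub trait AgentBackend {
    fn name(&self) -> &str;
    fn run_session(&mut self, prompt: &str) -> Result<String, String>;
}

/// Test double that replays scripted results and records the prompts it saw.
pub struct MockBackend {
    scripted: VecDeque<Result<String, String>>,
    pub prompts_seen: Vec<String>,
}

impl MockBackend {
    pub fn new(scripted: Vec<Result<String, String>>) -> Self {
        Self { scripted: scripted.into(), prompts_seen: Vec::new() }
    }
}

impl AgentBackend for MockBackend {
    fn name(&self) -> &str {
        "mock"
    }

    fn run_session(&mut self, prompt: &str) -> Result<String, String> {
        self.prompts_seen.push(prompt.to_string());
        self.scripted
            .pop_front()
            .unwrap_or_else(|| Err("no scripted reply left".to_string()))
    }
}

/// Caller code depends only on the trait object, not on a concrete provider.
pub fn run_once(backend: &mut dyn AgentBackend, prompt: &str) -> String {
    match backend.run_session(prompt) {
        Ok(out) => format!("[{}] ok: {out}", backend.name()),
        Err(err) => format!("[{}] error: {err}", backend.name()),
    }
}
```

A scripted mock of this kind makes it possible to exercise the runner's success and error paths without network access.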

Data Flow

Session Start (13-Step Flow)

CLI (swarm start)
  │
  ├── 1. Load config (~/.swarm/settings.json)
  ├── 2. Validate git prerequisites (version, repo, not detached)
  ├── 3. Handle --init flag
  ├── 4. Handle --stash or require clean working tree
  ├── 5. Check for stale session + recovery
  ├── 6. Create session (session.json + lockfile)
  ├── 7. Create worktrees (one per agent + supervisor)
  ├── 8. Initialize SQLite mailbox database
  ├── 9. Create agent runners + registry
  ├── 10. Start message router (100ms poll loop)
  ├── 11. Start periodic tasks (WAL checkpoint, message prune)
  ├── 12. Launch TUI or headless mode
  └── 13. Await shutdown signal → graceful shutdown
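
The ordering above can be sketched as a fixed list of steps run in sequence, stopping at the first failure so that later steps never see a half-initialized session. The `StartStep` enum, `START_STEPS`, and `run_start_flow` are hypothetical names chosen for this sketch; the orchestrator's real code may be structured quite differently.

```rust
/// One variant per step of the start flow, in order (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StartStep {
    LoadConfig,
    ValidateGit,
    HandleInit,
    PrepareWorkingTree,
    RecoverStaleSession,
    CreateSession,
    CreateWorktrees,
    InitMailbox,
    CreateRunners,
    StartRouter,
    StartPeriodicTasks,
    LaunchUi,
    AwaitShutdown,
}

pub const START_STEPS: [StartStep; 13] = [
    StartStep::LoadConfig,
    StartStep::ValidateGit,
    StartStep::HandleInit,
    StartStep::PrepareWorkingTree,
    StartStep::RecoverStaleSession,
    StartStep::CreateSession,
    StartStep::CreateWorktrees,
    StartStep::InitMailbox,
    StartStep::CreateRunners,
    StartStep::StartRouter,
    StartStep::StartPeriodicTasks,
    StartStep::LaunchUi,
    StartStep::AwaitShutdown,
];

/// Runs every step in order; on the first error, reports which step failed.
pub fn run_start_flow<F>(mut run_step: F) -> Result<Vec<StartStep>, (StartStep, String)>
where
    F: FnMut(StartStep) -> Result<(), String>,
{
    let mut done = Vec::with_capacity(START_STEPS.len());
    for step in START_STEPS {
        run_step(step).map_err(|err| (step, err))?;
        done.push(step);
    }
    Ok(done)
}
```

Returning the failed step alongside the error gives the caller enough information to decide what has to be cleaned up, for example removing a session file that was created before worktree creation failed.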

Agent Lifecycle Loop

Each agent runs independently through its state machine:

Initializing → BuildingPrompt → Spawning → Running → SessionComplete
                    ↑                          │            │
                    │                          │            │
                    │    ┌─────── CoolingDown ←┘ (on error) │
                    │    │  (exponential backoff)            │
                    │    ↓                                   │
                    └────┴───────────────────────────────────┘
                                                    (next session)
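
The diagram can be read as a pure transition function from a state and an event to a new state plus a side effect. The sketch below is one way to write that, under stated assumptions: the type names `AgentState`, `AgentEvent`, and `SideEffect` and the `OperatorStop` event appear elsewhere on this page, but the remaining event and side-effect variants and the `transition` function are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AgentState {
    Initializing,
    BuildingPrompt,
    Spawning,
    Running,
    SessionComplete,
    CoolingDown,
    Stopped,
}

/// Event names other than OperatorStop are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AgentEvent {
    Initialized,
    PromptReady,
    SessionStarted,
    SessionExited,
    SessionFailed,
    CooldownElapsed,
    NextSession,
    OperatorStop,
}

/// What the runner should do after a transition (illustrative variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SideEffect {
    BuildPrompt,
    SpawnSession,
    StartCooldownTimer,
    CancelSession,
    Nothing,
}

/// Pure transition function: no I/O, so every edge of the diagram is testable.
pub fn transition(state: AgentState, event: AgentEvent) -> (AgentState, SideEffect) {
    use AgentEvent as E;
    use AgentState as S;
    match (state, event) {
        (S::Stopped, _) => (S::Stopped, SideEffect::Nothing),
        (_, E::OperatorStop) => (S::Stopped, SideEffect::CancelSession),
        (S::Initializing, E::Initialized) => (S::BuildingPrompt, SideEffect::BuildPrompt),
        (S::BuildingPrompt, E::PromptReady) => (S::Spawning, SideEffect::SpawnSession),
        (S::Spawning, E::SessionStarted) => (S::Running, SideEffect::Nothing),
        (S::Running, E::SessionExited) => (S::SessionComplete, SideEffect::Nothing),
        (S::Running, E::SessionFailed) => (S::CoolingDown, SideEffect::StartCooldownTimer),
        (S::CoolingDown, E::CooldownElapsed) | (S::SessionComplete, E::NextSession) => {
            (S::BuildingPrompt, SideEffect::BuildPrompt)
        }
        // Events that make no sense in the current state are ignored.
        (s, _) => (s, SideEffect::Nothing),
    }
}
```

Separating the decision (the returned `SideEffect`) from its execution keeps the state machine free of async code and backend details.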

The runner loop for each agent:

  1. Build prompt — Assembles a 14-section system prompt with environment info, role, tools, pending messages, beads tasks, etc.
  2. Spawn backend session — Sends the prompt to the configured LLM provider (Anthropic API)
  3. Run — The backend session executes, making tool calls that the runner handles
  4. Handle exit — On success, transition to SessionComplete; on error, enter CoolingDown with exponential backoff
  5. Repeat — After cooldown or session complete, rebuild prompt and spawn again
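
The cooldown in step 4 is described only as exponential backoff; the page does not give the base delay, the cap, or whether jitter is applied. A minimal sketch of the usual doubling-with-cap scheme, with `backoff_delay` and its parameters as hypothetical names, might look like this:

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
/// Overflow in either the power or the multiplication falls back to `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}
```

With a 1 second base and a 60 second cap, consecutive failures would wait 1 s, 2 s, 4 s, 8 s, and so on until the cap is reached. Resetting the attempt counter after a successful session is a common companion choice, though this page does not say whether swarm does so.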

Message Flow

Agent A                    SQLite DB                    Agent B
   │                          │                            │
   ├── send(to=B, body) ─────►│                            │
   │                          ├── INSERT INTO messages ────►│
   │                          │                            │
   │                     Router (100ms poll)                │
   │                          ├── poll_urgent() ───────────►│
   │                          │   (if urgent)     InterruptSignal
   │                          │                            │
   │                          │◄── consume() ──────────────┤
   │                          │   (next prompt build)      │
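
The three operations in the diagram (`send`, `poll_urgent`, `consume`) can be modelled in memory to show how they relate. This sketch replaces the SQLite table with a `VecDeque` and omits threading; the `Message` fields and method signatures are assumptions, not the mailbox module's real schema or API.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub from: String,
    pub to: String,
    pub body: String,
    pub urgent: bool,
}

/// In-memory stand-in for the SQLite-backed mailbox.
#[derive(Default)]
pub struct Mailbox {
    queue: VecDeque<Message>,
}

impl Mailbox {
    /// Corresponds to the INSERT in the diagram.
    pub fn send(&mut self, msg: Message) {
        self.queue.push_back(msg);
    }

    /// Recipients that have at least one unconsumed urgent message.
    /// The router would send each of them an interrupt.
    pub fn poll_urgent(&self) -> Vec<String> {
        let mut recipients: Vec<String> = Vec::new();
        for msg in self.queue.iter().filter(|m| m.urgent) {
            if !recipients.contains(&msg.to) {
                recipients.push(msg.to.clone());
            }
        }
        recipients
    }

    /// Removes and returns every message addressed to `agent`, in send order.
    /// Called when the agent builds its next prompt.
    pub fn consume(&mut self, agent: &str) -> Vec<Message> {
        let (mine, rest): (Vec<Message>, Vec<Message>) =
            self.queue.drain(..).partition(|m| m.to == agent);
        self.queue = rest.into();
        mine
    }
}
```

Note that this simplified `poll_urgent` would report the same recipient on every poll until the message is consumed; a real broker would need some delivered or notified marker to avoid repeated interrupts.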

Shutdown Flow

SIGTERM received (or operator stop)
  │
  ├── Signal all agents: OperatorStop event
  ├── Wait for all agents to reach Stopped state
  ├── Auto-commit any dirty worktrees
  ├── Merge/squash/discard agent branches (based on StopMode)
  ├── Remove worktrees and prune
  ├── Delete session branches
  ├── Remove session.json + lockfile
  └── Exit
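
To illustrate how the stop mode could drive the branch and worktree steps, the sketch below turns a mode into an ordered list of git argument lists for one agent. `StopMode` is named above, but its variants are inferred from "merge/squash/discard", and `teardown_plan` and the exact git invocations are assumptions; swarm may use different commands or options.

```rust
/// Variants inferred from the shutdown flow's "merge/squash/discard".
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StopMode {
    Merge,
    Squash,
    Discard,
}

fn cmd(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Ordered git argument lists for finishing one agent branch.
/// Nothing is executed here; a caller would pass each list to `git`.
pub fn teardown_plan(mode: StopMode, branch: &str, worktree: &str) -> Vec<Vec<String>> {
    let mut plan = Vec::new();
    match mode {
        StopMode::Merge => plan.push(cmd(&["merge", "--no-ff", branch])),
        StopMode::Squash => {
            // `git merge --squash` only stages the changes, so a commit follows.
            plan.push(cmd(&["merge", "--squash", branch]));
            plan.push(cmd(&["commit", "-m", "Squash agent branch"]));
        }
        StopMode::Discard => {}
    }
    plan.push(cmd(&["worktree", "remove", worktree]));
    plan.push(cmd(&["worktree", "prune"]));
    // -D rather than -d: squashed or discarded branches are not merged ancestors.
    plan.push(cmd(&["branch", "-D", branch]));
    plan
}
```

Building the plan as data before running anything makes the ordering easy to test and to log.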

Key Dependencies

Dependency          Used For
tokio               Async runtime (ADR-001)
clap                CLI argument parsing
serde / serde_json  Configuration and message serialization
rusqlite            SQLite mailbox (ADR-002)
ratatui             Terminal UI rendering (ADR-007)
tracing             Structured logging
reqwest             HTTP client for Anthropic API and MCP transports
chrono              Timestamp handling
anyhow / thiserror  Error handling (ADR-009)
wasmtime            WASM sandbox runtime (optional, feature-gated)
libc                Process liveness checks (kill signal 0)
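
The "kill signal 0" technique in the last row asks the kernel whether a PID exists without delivering a signal. The sketch below shows the idea on Unix. To stay free of external crates it declares the POSIX `kill` function directly instead of going through the libc crate that the project uses, and `pid_is_alive` is a hypothetical helper name.

```rust
use std::os::raw::c_int;

// Direct declaration of POSIX kill(2); the libc crate exposes the same function.
unsafe extern "C" {
    fn kill(pid: c_int, sig: c_int) -> c_int;
}

/// errno value for "operation not permitted" on Linux and macOS.
const EPERM: i32 = 1;

/// Returns true if a process with this PID currently exists (Unix only).
pub fn pid_is_alive(pid: i32) -> bool {
    if pid <= 0 {
        // 0 and negative values address process groups, not a single process.
        return false;
    }
    // Signal 0 performs the existence and permission checks but sends nothing.
    let rc = unsafe { kill(pid, 0) };
    if rc == 0 {
        return true;
    }
    // EPERM means the process exists but belongs to another user.
    std::io::Error::last_os_error().raw_os_error() == Some(EPERM)
}
```

A check like this cannot tell whether the PID has been reused by an unrelated process, which is one reason a session file typically records more than the PID alone.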

Design Decisions

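The architecture is shaped by several key decisions documented in ADRs, including the choice of async runtime (ADR-001), the SQLite mailbox (ADR-002), terminal UI rendering (ADR-007), and error handling (ADR-009).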

Component Interactions

                    ┌──────────────┐
                    │     CLI      │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │ Orchestrator │──────────────────┐
                    └──────┬───────┘                  │
                           │                          │
              ┌────────────┼────────────┐      ┌──────▼──────┐
              │            │            │      │   Session    │
        ┌─────▼────┐ ┌────▼─────┐ ┌────▼────┐ │  Management │
        │ Agent 1  │ │ Agent 2  │ │ Agent N │ └─────────────┘
        │ Runner   │ │ Runner   │ │ Runner  │
        └────┬─────┘ └────┬─────┘ └────┬────┘
             │            │            │
        ┌────▼────────────▼────────────▼────┐
        │           Agent Registry          │
        └────┬─────────────────────────┬────┘
             │                         │
      ┌──────▼───────┐         ┌──────▼───────┐
      │   Backend    │         │   Mailbox    │
      │  (Anthropic) │         │   (SQLite)   │
      └──────────────┘         └──────┬───────┘
                                      │
                               ┌──────▼───────┐
                               │    Router    │
                               │ (100ms poll) │
                               └──────────────┘
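
The router box at the bottom of the diagram can be sketched as a background loop that wakes at a fixed interval, checks for urgent mail, and forwards an interrupt to the affected agent. The version below uses a plain thread and a standard-library channel in place of the tokio task and SQLite query the real router presumably uses. `InterruptSignal` is the name shown in the message flow; its field, `UrgentQueue`, and `spawn_router` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Signal delivered to an agent runner when urgent mail arrives.
#[derive(Debug, PartialEq)]
pub struct InterruptSignal {
    pub agent: String,
}

/// Agents with pending urgent mail; stands in for the mailbox query.
pub type UrgentQueue = Arc<Mutex<Vec<String>>>;

/// Starts the poll loop and returns its handle plus the interrupt receiver.
pub fn spawn_router(
    urgent: UrgentQueue,
    stop: Arc<AtomicBool>,
    poll_every: Duration,
) -> (JoinHandle<()>, Receiver<InterruptSignal>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            // Take the pending list in one statement so the lock is held briefly.
            let pending: Vec<String> = std::mem::take(&mut *urgent.lock().unwrap());
            for agent in pending {
                if tx.send(InterruptSignal { agent }).is_err() {
                    return; // receiver dropped, nothing left to notify
                }
            }
            thread::sleep(poll_every);
        }
    });
    (handle, rx)
}
```

With a 100 ms interval, as in the diagram, an urgent message waits at most roughly one interval before the interrupt is sent, at the cost of a small, constant polling load.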