Platform Reference#
Canonical source: docs/REFERENCE.md in the repo. This page is rendered via include-markdown and stays in sync automatically on every push to master.
Comprehensive module-organized reference. Tech stack + rationale, 100% code coverage via module index, feature catalogue, guardrail reference, subsystem composition, lifecycle flows, operational surface, test matrix, data model, glossary. For onboarding narrative, read MASTER_README.md first. For procurement pack, see the security architecture + threat model pages.
Version: v0.12.0 (first signed release · 2026-04-22)

Scope: every subsystem, every module, every feature, every tech-stack choice with rationale. 100% code coverage via the module index in Part 3. Cross-references the rest of the documentation (threat model, CAIQ Lite, training modules, implementation phase docs) rather than duplicating them.
Treat this as the reference manual. For onboarding narrative, read MASTER_README.md first, then the training curriculum at .project/training/. For procurement, read .project/security/. For API surface, the auto-generated OpenAPI lives at /docs on a running instance.
Table of contents#
- Product overview
- Tech stack + rationale
- Module index (100% code coverage)
- Feature catalogue
- Guardrail reference (15 controls)
- Subsystem composition (how pieces fit together)
- Lifecycle flows
- Operational surface
- Test matrix
- Data model
- Glossary
1. Product overview#
1.1 What swarm is#
A VPC-installable multi-agent ML platform for regulated mid-market BFSI. 40 domain-specialised AI agents across 7 teams run the full ML lifecycle (data → train → eval → deploy → audit) while producing the evidence BFSI regulators require: fairness reports, drift monitors, model cards, tamper-evident audit PDFs, signed conversation JSONLs, right-to-be-forgotten receipts, and supply-chain-verifiable releases.
1.2 What swarm is NOT#
- Not an AutoML cloud service. Customer runs it in their own VPC; data never leaves.
- Not a generic agent framework. 40 agents have ML-specific roles baked in.
- Not a dev-tool / IDE agent. MLOps workflows only.
- Not SaaS multi-tenant. v0.12.0 is single-tenant per customer; multi-tenancy is deferred.
1.3 Target users#
| Persona | Primary use | Primary artefact they consume |
|---|---|---|
| BFSI Head of Risk | Approve deployments, review audit PDFs | Signed audit PDF + model card |
| BFSI data scientist | Submit problem statement, tune agents | Dashboard + REST + conversation trail |
| Platform operator / ML engineer | Install + maintain in VPC | /transparency dashboard, feature flags, CLI |
| Compliance auditor (internal) | Pull denial logs, fairness reports | permission_denials SQLite + retention log |
| External auditor (RBI) | Verify lifecycle compliance | Audit PDF + signed conversation JSONL |
| CISO (procurement) | Evaluate for purchase | .project/security/ pack |
1.4 Current status (v0.12.0)#
- 628 → 1258 tests across the 12-week arc; zero regressions.
- 15 runtime guardrails shipped across 6 categories.
- Customer-composable platform: three-layer architecture (core + lib + deployments) fully wired.
- swarm deploy ship pipeline: new / validate / ship / whitepaper CLI with build-time per-customer isolation.
- Signed release: Cosign-keyless via Sigstore; CycloneDX SBOM; SAST in CI (bandit HIGH gate + semgrep ERROR gate); one-command customer verification.
- BFSI procurement pack: architecture doc + STRIDE threat model + pre-filled CAIQ Lite (75% Y/Y+P).
Full release notes in CHANGELOG.md.
2. Tech stack + rationale#
Every choice below is deliberate. Alternatives considered, decision criteria, and a one-line rationale.
2.1 Language + runtime#
| Layer | Choice | Why |
|---|---|---|
| Primary language | Python 3.11+ (tested on 3.11, 3.12) | ML ecosystem (sklearn / XGBoost / LightGBM / SHAP / fairlearn) is Python-first. BFSI customer teams are Python-literate. Type hints (3.11+ PEP-585 / PEP-604) give meaningful IDE support without a Rust / Go rewrite. |
| Async runtime | asyncio + FastAPI | Single-threaded event loop is fine for our IO-bound workload (LLM HTTP calls dominate latency). Starlette lifespan gives clean startup/shutdown hooks. |
| Dashboard | Next.js 15 + React 19 + TypeScript | SSR-first for first-paint performance + SEO; server components for data fetching without client-side boilerplate. React is the BFSI ops-team default. |
| Package manager | pip + pyproject.toml (PEP-621) | No new dep needed. Compatible with Poetry, hatch, uv if customers prefer. Hash-pinning migration via pip-tools planned. |
Alternatives rejected:
- Rust backend: zero community overlap with BFSI ML teams; would force the founder to hire for two skill sets.
- Node.js backend: Python ML bindings don't exist at parity.
- Flask / Django instead of FastAPI: FastAPI's Pydantic validation + OpenAPI generation + async support fit an API-first product better. Django's ORM + admin would be useful, but the added friction outweighs the roughly 20% of value they would bring.
2.2 Web framework + API#
| Component | Choice | Why |
|---|---|---|
| HTTP framework | FastAPI 0.115+ | Pydantic-native (v2.9+), auto-OpenAPI, async + sync both first-class, dependency injection pattern maps cleanly to our RBAC / rate-limit / permission-engine guards. |
| Validation | Pydantic v2.9+ | Rust-accelerated validators, extra="forbid" for strict schemas, JSON-schema generation for lib manifest files. |
| Authentication (JWT) | authlib.joserfc | Modern Python JWT lib; migration from python-jose (two unpatched CVEs) tracked in .project/decisions.md. |
| OIDC | Authlib | Okta / Azure AD / Google Workspace compatible out-of-box. Supports PKCE + state cookie. |
| Password hashing | bcrypt (via passlib) | 12 rounds default. Industry standard for password-at-rest. |
2.3 Storage#
| Purpose | Choice | Why |
|---|---|---|
| Primary data store | SQLite (WAL mode, FK enforced) | Zero-install, good enough for single-node BFSI deployments. WAL mode gives read-during-write concurrency. Postgres migration path documented when a customer's concurrency pattern exceeds SQLite's single-writer limit. |
| Conversation persistence | JSONL per agent | Append-only writes, thread-safe per file, streaming reads, trivially portable. Aligned with OpenTelemetry GenAI semantic conventions + Langfuse/LangSmith patterns. |
| Cross-run memory | SQLite cross-run + Postgres cross-project | SQLite persists lessons across runs in the same work_dir; Postgres (optional) for cross-project memory in multi-project orgs. |
| RAG | ChromaDB (with keyword-search fallback) | Embedded vector store; no separate server process. Keyword-search fallback means the recall_past_runs tool works even when ChromaDB isn't installed. |
| Encryption at rest | SQLCipher (opt-in) / Postgres + pgcrypto (BFSI) / host-FS (LUKS / EBS KMS) | Tiered: host-FS is the default; SQLCipher for tenants that want app-layer encryption without switching to Postgres; Postgres + pgcrypto for BFSI customers with existing Postgres KMS integrations. |
2.4 LLM providers#
| Provider | Support | Default models |
|---|---|---|
| Anthropic | First-class | claude-sonnet-4.6 (workers), claude-opus-4 (directors) |
| OpenAI | First-class | gpt-4o (workers), o1 (directors) |
| vLLM (local GPU) | First-class — BFSI air-gap path | Any HuggingFace-compatible model served via vLLM's OpenAI API shim |
| Ollama | Supported | Local quantised models |
| Single-model override | via CLI flag | One model for all agents (dev/testing) |
Rationale: BFSI customers in India + similar jurisdictions often prefer on-prem / air-gapped. vLLM + Ollama give them a zero-egress path. SaaS customers use Anthropic or OpenAI for frontier performance. Agent-level provider selection lets a single pipeline mix (e.g., small local model for data-profiler, Claude for ML director).
2.5 ML toolkit#
| Tool | Choice | Why |
|---|---|---|
| Classification | LightGBM (primary), XGBoost, scikit-learn RandomForest/Logistic | LightGBM is the BFSI default (faster training and lower memory than XGBoost at comparable accuracy on tabular data). XGBoost for customers who prefer it. sklearn for simple baselines. |
| Fairness audit | fairlearn | Industry-standard. Computes demographic parity, equal opportunity, equalized odds per protected attribute group. |
| Explainability | SHAP (tree-native fast path) | De-facto standard for tree-model interpretability. Tree-path exploitation is ~100× faster than generic KernelExplainer. |
| Drift detection | Custom (PSI + KS + chi²) | ml_team/tools/drift.py; BFSI thresholds: PSI 0.10 warn / 0.25 alert. Implementation matches RBI guidance. |
| Experiment tracking | MLflow (optional) | Opt-in integration via ml_team/tools/mlflow_tools.py. |
| Model packaging | Docker + K8s (Deployment + Service + HPA) | Non-root containers, readOnlyRootFilesystem, HEALTHCHECK. BFSI-hardened defaults. |
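The PSI thresholds in the drift row can be illustrated with a minimal sketch. This is not the code in ml_team/tools/drift.py: the function names, the `eps` floor for empty bins, and the inclusive threshold comparison are assumptions made for illustration. Inputs are pre-binned proportions.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index over two lists of per-bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins so the logarithm stays defined (assumed convention).
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def classify_psi(value: float, warn: float = 0.10, alert: float = 0.25) -> str:
    """Map a PSI value onto the warn/alert thresholds quoted in the table."""
    if value >= alert:
        return "alert"
    if value >= warn:
        return "warn"
    return "ok"
```

For example, a feature that shifts from a 50/50 split to 70/30 gives a PSI of about 0.17, which lands in the warn band.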
2.6 Agent orchestration#
| Component | Choice | Why |
|---|---|---|
| Orchestration pattern | Supervisor-worker ReAct (custom, not LangChain) | Full control over the loop, tool dispatch, permission engine integration. Two implementations: native (ml_team/core/orchestrator.py) + LangGraph (ml_team/backends/langgraph_backend.py). Same agent definitions work with both. |
| Delegation | transfer_to_* tool calls | Supervisor delegates to sub-agents via synthesized tool calls. G10 guardrail enforces depth ≤5 and fan-out ≤50. |
| MCP client | stdlib via JSON-RPC over stdio/SSE | Claude Code marketplace compatibility. ml_team/core/mcp_client.py. |
| Plugin loader | Claude Code plugin manifest format | Reuses the community's plugin ecosystem. Skills + MCPs + hooks + commands + agents all installable. |
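The G10 limits in the delegation row (depth ≤5, fan-out ≤50) could be enforced along these lines. A minimal sketch only: `DelegationTracker`, `DelegationLimitExceeded`, and the reading of fan-out as "children per parent agent" are assumptions, not the shipped guardrail.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 5      # from the table: depth <= 5
MAX_FAN_OUT = 50   # from the table: fan-out <= 50


class DelegationLimitExceeded(RuntimeError):
    """Raised when a transfer_to_* call would break a limit (hypothetical name)."""


@dataclass
class DelegationTracker:
    # Number of delegations issued so far by each parent agent.
    children: dict[str, int] = field(default_factory=dict)

    def check(self, parent: str, child_depth: int) -> None:
        """Validate one delegation; child_depth is the depth of the new sub-agent."""
        if child_depth > MAX_DEPTH:
            raise DelegationLimitExceeded(f"depth {child_depth} > {MAX_DEPTH}")
        count = self.children.get(parent, 0) + 1
        if count > MAX_FAN_OUT:
            raise DelegationLimitExceeded(f"{parent} fan-out {count} > {MAX_FAN_OUT}")
        # Only record the delegation once both checks have passed.
        self.children[parent] = count
```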
2.7 Cryptography#
Every crypto choice documented in .project/security/architecture.md § Cryptography inventory. Summary:
| Use | Algorithm | Library | Key length |
|---|---|---|---|
| Envelope DEK | AES-GCM-256 | cryptography | 256-bit + 96-bit nonce + 128-bit tag |
| BYOK wrap | AWS KMS (default), env AES-GCM (dev), stub (tests) | AWS CLI / cryptography | customer-owned KEK |
| Audit signing | Ed25519 or ECDSA-P256 | cryptography / cosign | 256-bit |
| Release signing | Cosign keyless (ECDSA-P256) | cosign binary | Sigstore-ephemeral |
| Password hash | bcrypt | passlib | 12 rounds |
| Hash | SHA-256 | hashlib | — |
| JWT | HS256 | authlib.joserfc | 256-bit secret (rotatable) |
| Entropy detection (G6) | Shannon base-2 | builtin | threshold 4.2 bits/char |
No deprecated algorithms in use: no MD5 for auth, no SHA-1 for signing, no RC4/DES/3DES anywhere.
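The G6 entropy row can be made concrete with a short sketch of base-2 Shannon entropy per character. The 4.2 bits/char threshold comes from the table; the `min_length` cut-off and both function names are assumptions added for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Base-2 Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def looks_like_secret(token: str, threshold: float = 4.2, min_length: int = 20) -> bool:
    """Flag long, high-entropy tokens; min_length is a hypothetical noise filter."""
    return len(token) >= min_length and shannon_entropy(token) > threshold
```

A string of one repeated character scores 0 bits/char, while 32 distinct characters score 5 bits/char, so only the latter crosses the threshold.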
2.8 Observability#
| Layer | Choice | Why |
|---|---|---|
| Metrics | Prometheus (28+ counters/histograms) | Industry standard; customer's existing Prometheus scrapes our /api/v1/metrics. In-memory fallback when prometheus_client absent. |
| Tracing | OpenTelemetry | Parent/child spans across agent-tool boundaries + token/cost metadata. Exportable to Jaeger / Honeycomb / Datadog. |
| Logs | Structured JSON | ready for Loki / Splunk / CloudWatch shipping. G6 credential filter at root logger scrubs secrets before any log lands on disk. |
| Real-time dashboard updates | WebSocket | push-based vs polling reduces dashboard latency. |
2.9 Supply chain#
| Concern | Tool | Why |
|---|---|---|
| Release signing | Cosign (Sigstore) + Rekor | Keyless via GitHub OIDC is the most CISO-defensible path: no long-lived signing keys, append-only transparency log, customer can verify offline with one command. |
| SBOM | CycloneDX 1.5 JSON via cyclonedx-bom | Actively maintained Python tooling; compatible with every enterprise vendor-risk scanner. |
| Commit signing | SSH or GPG signing required on main | G17 CI gate enforces git log %G? non-N non-B on release range. |
| Static security analysis | bandit (HIGH gate) + semgrep p/python + p/security-audit (ERROR gate) | Two complementary rulesets: bandit for Python-specific patterns (hardcoded temp files, scoped eval), semgrep for CWE-based coverage. Both non-blocking on warnings/mediums to avoid CI flakiness; HIGH/ERROR blocks the merge. |
| Dependency monitoring | GitHub Dependabot | Free, maintained by GitHub. CVE alerts feed Dependabot PRs. |
| Pen test | Scheduled Q2 2026 (Lucideus / Cobalt) | External validation complement to internal STRIDE review. |
2.10 Secrets#
| Concern | Tool | Why |
|---|---|---|
| Development | .env files + direnv | Standard; simple. |
| CI | GitHub Actions Secrets + OIDC (for Sigstore) | Scoped, no long-lived keys in repo. |
| Production (per-customer) | Doppler (per-customer project: swarm-<customer>-{dev,staging,prod}) | Managed SaaS, free tier, per-environment configs, rotation UI. SOPS + age as self-hosted fallback documented. |
| BYOK | customer's own KMS (AWS / GCP / Vault) | We never hold the customer's KEK. Envelope wraps DEKs only. |
2.11 Test tooling#
| Tool | Purpose | Scope |
|---|---|---|
| pytest | Test runner | 1258 tests |
| pytest-asyncio | async test support | FastAPI TestClient tests |
| pytest-subtests | parametrised | PII detectors (50 entities × 10 types) |
| pytest-cov | coverage | Uploaded from CI |
| FastAPI TestClient | API integration tests | Subjects, auth, pipelines |
| monkeypatch | External-tool fakes | cosign, nsjail, docker, aws CLI not installed on dev machines |
3. Module index (100% code coverage)#
Every .py file in the product tree with a one-line purpose and public-surface summary. Tests directory + dashboard TSX not listed (see their own IMPLEMENTATION_README.md).
3.1 ml_team/core/ — engine (35 top-level modules + 2 sub-packages)#
Agent runtime
- interfaces.py — AgentConfig, AgentRunResult, Tool dataclasses. The contract between agent runtime and everything else.
- agent_runner.py — per-agent ReAct loop. Single entry: AgentRunner.run(messages).
- orchestrator.py — supervisor loop with transfer_to_* tool-call handoff.
- agent_composer.py — resolves extends: lib/agents/base/<id>@vX.Y.Z references + applies overlays (system prompt append/prepend, tool additions, knowledge bases). Produces ComposedAgent.
- team_factory.py — builds the director/coordinator/worker hierarchy from team_definitions.yaml.
- llm_client.py — provider-agnostic LLM call wrapper. Anthropic / OpenAI / vLLM / Ollama / single-model. HTTP pooling + retries.
Permission engine (W7-1)
- permissions.py — PermissionRule, PermissionContext, PermissionDecision, PermissionBehavior enum; check(ctx) tier-aware resolver (ALLOW > DENY > ASK > default) with priority tiebreaks.
- permission_sources.py — 8 rule sources registered by init_default_sources: RBAC, allowlist, feature-flag, HITL, policy, profile, compliance_gate, egress_allowlist.
- permission_audit.py — record_denial(decision, ctx) → permission_denials SQLite table + ml_team_permission_decisions_total counter.
Hook lifecycle (W7-2)
- hooks.py — HookEvent enum (10 values: SESSION_START, PRE/POST_TOOL, PRE/POST_COMPACTION, PRE_LLM, POST_LLM, STORAGE_WRITE, LLM_CALL_WRAPPER, AGENT_DELEGATE), registration + dispatch. Plugin-loaded hooks compose with core hooks.
- shell_hook_runner.py — executes {"type": "command", ...} hooks from plugins behind plugin_shell_hooks_enabled flag; rlimits on Linux, invoke-time validation, per-execution audit row.
Plugin ecosystem
- plugin_loader.py — installs Claude Code marketplace plugins; scans hooks/, commands/, agents/, skills/, .mcp.json. Records drops.
- commands_registry.py — scans commands/*.md + $ARGUMENTS substitution.
- agents_registry.py — scans agents/*.md; force-namespaces as plugin-<name>::<agent>.
- skill_registry.py — scans skills/*.md, injects at appropriate agent boundaries.
- mcp_client.py — JSON-RPC over stdio/SSE for MCP servers.
Tool dispatch
- tool_executor.py — dispatches tool calls through the permission engine; POST-dispatch hook calls compliance_gates.record_tool_result.
Persistence + state
- conversation_store.py — per-agent JSONL write with buffered flushes. _flush_locked now invokes conversation_scrubber.scrub_line (G5) when configured.
- memory.py — per-run JSON memory; OpenTelemetry spans for every operation.
- org_memory.py + project_memory.py — cross-run + cross-project memory backed by SQLite + optional Postgres.
- state_graph.py — pipeline-graph state tracking for the dashboard graph view.
- retention.py — 24h daemon sweep; deletes artefacts past retention_overrides TTLs.
- rag.py — ChromaDB + keyword-search fallback.
Optimization primitives
- context_compaction.py — at 80% window, summarise prior turns. Keeps long runs alive.
- batch.py + batch_processors.py — JSONL → inference / echo / custom processors; checkpoints every 10 records; resume-on-restart (W7-4).
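The checkpoint-every-10-records and resume-on-restart behaviour described for batch.py can be sketched as follows. This is an illustration under assumptions, not the module's code: `run_batch`, the plain-text checkpoint file, and the choice to discard results written after the last checkpoint are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable

CHECKPOINT_EVERY = 10  # from the module description


def run_batch(src: Path, out: Path, ckpt: Path, process: Callable[[dict], dict]) -> int:
    """Process a JSONL file, resuming from the last checkpoint if one exists."""
    done = int(ckpt.read_text()) if ckpt.exists() else 0
    # Drop results written after the last checkpoint so a resume never duplicates rows.
    kept = out.read_text().splitlines(keepends=True)[:done] if out.exists() else []
    out.write_text("".join(kept))
    with src.open() as reader, out.open("a") as writer:
        for index, line in enumerate(reader):
            if index < done:
                continue  # already processed in a previous run
            writer.write(json.dumps(process(json.loads(line))) + "\n")
            done = index + 1
            if done % CHECKPOINT_EVERY == 0:
                writer.flush()
                ckpt.write_text(str(done))
    ckpt.write_text(str(done))  # final checkpoint covers the tail of the file
    return done
```

If the processor fails at record 17, the checkpoint still reads 10; the next call rewrites the output to its first 10 rows and continues from record 11.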
Ops primitives (W7)
- cron.py + cron_tasks.py — 60s tick scheduler; 4 task kinds (retrain / drift_check / audit_pdf / custom). File-backed store at ~/.swarm/cron/jobs.json.
Evaluation
- evaluator.py — optional per-agent rubric grading with generator/evaluator separation.
- evaluation.py — programmatic eval harness called from the evaluations router.
Configuration + feature flags
- feature_flags.py — INVARIANT / FLAG / USER_OVERRIDE tiers; runtime overrides for admin via features router.
- profile_loader.py — resolves based_on: lib/templates/<id>@vX.Y.Z into a LoadedProfile (permission rules + compliance gates + retention overrides).
- deployment_loader.py — reads SWARM_DEPLOYMENT=<path>; assembles LoadedDeployment (agents + workflows + branding + profile). Called from lifespan at API start.
- compliance_gates.py — ComplianceGate dataclass; post-tool-dispatch hook records results; check_blocked(run_id, tool_name) returns (blocked, reason) for the permission engine's compliance_gate_source. Restricted eval() for deny_if expressions.
- lib_loader.py — resolves lib/<kind>/<id>@vX.Y.Z references. Cached by mtime; CLI at python -m ml_team.core.lib_loader {validate,list}.
- lib_schemas.py — Pydantic manifest models for every /lib asset type: AgentManifest, ToolManifest, WorkflowManifest, GuardrailManifest, ProfileManifest, DeploymentConfig, + PermissionRuleSpec, ComplianceGateSpec, etc.
Guardrails subsystem (v0.12.0 Track 2)
guardrails/ sub-package (5 modules):
- guardrails/types.py — IntegrationPoint (10 values), GuardrailOutcome, Severity, GuardrailResult.
- guardrails/registry.py — thread-safe GuardrailRegistry; @register(point, id, priority) decorator.
- guardrails/evaluator.py — evaluate(point, payload) priority-sorted pass; ALLOW chain / REDACT threads / DENY short-circuit / ERROR fails open unless invariant.
- guardrails/metrics.py — guardrail_triggered_total, guardrail_evaluation_duration_seconds, bypass counter.
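The evaluator semantics listed above (priority-sorted pass, REDACT threading, DENY short-circuit, errors failing open unless invariant) can be sketched in a few lines. The types here are simplified stand-ins for GuardrailResult and friends, and running higher priorities first is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    ALLOW = "allow"
    REDACT = "redact"
    DENY = "deny"


@dataclass
class Result:
    outcome: Outcome
    payload: str
    guardrail_id: str = ""


@dataclass
class Entry:
    guardrail_id: str
    priority: int
    fn: Callable[[str], Result]
    invariant: bool = False


def evaluate(entries: list[Entry], payload: str) -> Result:
    redacted = False
    # Assumption: higher priority values run first.
    for entry in sorted(entries, key=lambda e: e.priority, reverse=True):
        try:
            result = entry.fn(payload)
        except Exception:
            if entry.invariant:
                # An invariant guardrail that crashes fails closed.
                return Result(Outcome.DENY, payload, entry.guardrail_id)
            continue  # non-invariant guardrails fail open
        if result.outcome is Outcome.DENY:
            return Result(Outcome.DENY, payload, entry.guardrail_id)  # short-circuit
        if result.outcome is Outcome.REDACT:
            payload = result.payload  # thread the redacted payload to later guardrails
            redacted = True
    return Result(Outcome.REDACT if redacted else Outcome.ALLOW, payload)
```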
Top-level guardrail modules:
- egress_allowlist.py — G1 permission source; URL walker + host classifier (RFC1918 via ipaddress, fnmatch + suffix).
- python_sandbox.py — G2 SandboxDriver Protocol + 3 drivers (NsjailDriver, DockerDriver, SubprocessDriver); configure(strict=True) fails boot if driver unavailable.
- pii/ sub-package — G4 core:
- pii/types.py — PiiEntity (12 types), PiiFinding, AnonymizeAction.
- pii/regex_detectors.py — scan_regex() with Luhn + Verhoeff.
- pii/anonymizer.py — right-to-left applier.
- pii/presidio_shim.py — lazy Presidio wrapper; PresidioUnavailable(ImportError).
- conversation_scrubber.py — G5 scrub_line() + recursive value walk.
- audit_signer.py — G11 SignerDriver Protocol + 4 drivers + SignatureReceipt sidecar.
- encryption.py — G12+G13 KeyProvider Protocol (5 implementations) + envelope encrypt/decrypt; WrappedDek, Ciphertext dataclasses.
- lineage.py — G14 record_* upserts + chain_for_deployment + subjects_in_dataset (used by G15).
- rtbf.py — G15 ErasureReceipt + erase_subject + JSONL tombstone walker; regex-escaped subject matching.
- hitl_sweep.py — G16 pure sweep(store, notifier, now) called from cron tick.
- guardrail_bootstrap.py — auto-configures 6 configurable guardrails from deployment guardrail_configs block. BootstrapReport + BootstrapError.
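As an illustration of the G1 host classifier described for egress_allowlist.py (RFC1918 via ipaddress, fnmatch + suffix), here is a minimal sketch. The function name, the three return labels, and the convention that a leading dot marks a suffix rule are assumptions; what the platform does with each class is deliberately left out.

```python
import ipaddress
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# The three private IPv4 ranges defined by RFC 1918.
_RFC1918 = tuple(
    ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def classify_host(url: str, allowlist: list[str]) -> str:
    """Return "private", "allowlisted", or "other" for the URL's host."""
    host = (urlsplit(url).hostname or "").lower()
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # not an IP literal; treat as a DNS name
    if addr is not None and any(addr in net for net in _RFC1918):
        return "private"
    for pattern in allowlist:
        if pattern.startswith("."):
            # Suffix rule: ".example.com" covers example.com and its subdomains.
            if host.endswith(pattern) or host == pattern[1:]:
                return "allowlisted"
        elif fnmatchcase(host, pattern.lower()):
            return "allowlisted"  # glob rule such as "*.example.com"
    return "other"
```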
Logging
- logging_config.py — structured JSON logging; installs G6 credential filter at root logger via importlib.util.
Legacy / domain validators
- guardrails.py — pre-Track-2 validators (model-card shape, deployment-artefact completeness) + legacy check_output PII/secret scrub. Preserved alongside Track 2 additions; removal scheduled after customer confirms new coverage.
- feedback.py — analyze_run, save_feedback, generate_lessons_learned, record_run_to_project_memory.
- types.py — shared dataclasses (ConversationMessage, etc.).
- telemetry.py — OpenTelemetry tracer setup.
- graph_executor.py — directed graph executor for the default_ml_pipeline and friends.
- approval.py — 6-type HITL gate primitives. v0.12.0 added G16 TTL/escalation fields.
3.2 ml_team/api/ — HTTP surface (7 top-level + 20 routers)#
| File | Purpose |
|---|---|
| app.py | FastAPI application + lifespan (init_db, users, default permission sources, guardrail_bootstrap, retention, cron). 23 routers mounted. |
| auth.py | Dual auth middleware (JWT Bearer + X-API-Key); require_role(min_role) dependency factory. |
| users.py | User dataclass + Role enum (viewer/operator/admin); bcrypt hashing; bootstrap from env. |
| oidc.py | Authlib-based OIDC client (Okta / Azure AD / Google Workspace); PKCE + state cookie. |
| rate_limit.py | G7 middleware with composite (caller_identity, endpoint_class) key + per-role limits. |
| database.py | SQLite schema + _get_conn() thread-local pool; init_db, flush_events. Includes G14 lineage tables + G15 consent_doc_ref index. |
| metrics.py | Prometheus counter/histogram declarations + text-format renderer. |
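The composite key and per-role limits described for rate_limit.py can be sketched with a fixed-window counter. This is an assumption-laden illustration: the windowing algorithm, the class name, and every number in `LIMITS_PER_MINUTE` are hypothetical and not taken from the G7 middleware.

```python
import time
from typing import Callable

# Hypothetical per-role limits; the real values live in deployment config.
LIMITS_PER_MINUTE = {"viewer": 60, "operator": 120, "admin": 300}


class RateLimiter:
    def __init__(self, limits: dict[str, int], window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limits = limits
        self._window = window_seconds
        self._clock = clock
        # (caller_identity, endpoint_class) -> (window start, requests seen)
        self._windows: dict[tuple[str, str], tuple[float, int]] = {}

    def allow(self, caller: str, endpoint_class: str, role: str) -> bool:
        key = (caller, endpoint_class)
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # the previous window expired; start a new one
        if count >= self._limits[role]:
            return False
        self._windows[key] = (start, count + 1)
        return True
```

Because the key includes the endpoint class, a caller who exhausts the budget for one class of endpoint can still reach another.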
Routers (under ml_team/api/routers/):
| Router | Auth | Endpoints |
|---|---|---|
| auth.py | none (login is chicken-and-egg) | POST /auth/login, POST /auth/refresh, GET /auth/oidc/login, GET /auth/oidc/callback |
| docs.py | public | In-app docs browser (static content) |
| config.py | public (SSR-fetched) | GET /config/branding: deployment-driven dashboard config |
| pipelines.py | auth | POST /pipelines, GET /pipelines, GET /pipelines/{id}, GET .../trace, GET .../conversations, GET .../graph, WS .../ws |
| agents.py | auth | GET /agents, GET /agents/{name} |
| models.py | auth | Algorithms + registered models |
| evaluations.py | auth | GET /evaluations/cases, POST /evaluations/generate |
| mcp.py | auth | MCP registry + tool-call inspection |
| knowledge.py | auth | RAG corpus management |
| chat.py | auth | Conversation-style chat |
| datasets.py | auth | Dataset upload + management |
| inference.py | auth | Trained-model inference |
| deployments.py | auth | Champion-challenger deployments |
| features.py | auth | Feature flag admin |
| plugins.py | auth | Plugin install / inspect |
| permissions.py | auth | GET /permissions/denials?since=&tool=&agent= |
| cron.py | auth | Scheduler CRUD |
| batch.py | auth | Batch runs |
| subjects.py | admin only (RBAC) | G15: GET /subjects/{id}/preview, DELETE /subjects/{id} |
3.3 ml_team/tools/ — the 25 native tools (19 files)#
| File | Tools it exports |
|---|---|
| training.py | train_classifier (LightGBM / XGBoost / RandomForest / Logistic) with stratified split + metrics sidecar |
| ml_tools.py | Shared ML helpers (preprocessing, column-type detection) |
| drift.py | detect_drift (PSI + KS + chi²) |
| fairness.py | audit_fairness (fairlearn MetricFrame) |
| explainability.py | explain_model (SHAP tree-native + generic fallback) |
| model_card.py | generate_model_card (Markdown from metrics + fairness + drift + SHAP) |
| deploy.py | package_model (Dockerfile + serve.py + requirements), generate_k8s_manifests (BFSI-hardened) |
| champion_challenger.py | register_model_deployment, log_shadow_prediction, compare_champion_challenger |
| audit_pdf.py | export_audit_report (reportlab, tamper-evident SHA-256, G11 signing hook) |
| execution.py | execute_python (G2 sandbox when configured), execute_shell (allowlist) |
| file_ops.py | read_file, write_file, list_directory |
| http_tools.py | http_request, health_check (G1-guarded) |
| docker_tools.py | Docker operations for deployment |
| git_ops.py | clone_repo, checkout |
| gpu_tools.py | check_gpu, measure_gpu_usage |
| memory_tools.py | recall_past_runs, save_agent_learning |
| mlflow_tools.py | log_run_to_mlflow (optional) |
| search.py | Web search (DuckDuckGo fallback) |
| __init__.py | TOOL_SETS registry, per-agent allowlists |
3.4 ml_team/deploy/ — ship pipeline (5 modules + __init__)#
See ml_team/deploy/README.md for detail. Summary:
| File | Purpose |
|---|---|
| scaffold.py | swarm deploy new: new_deployment(customer, template) → ScaffoldResult |
| validator.py | swarm deploy validate: lint config + lib refs |
| manifest.py | MANIFEST.yaml generator with pinned lib versions + config SHA-256 |
| ship.py | swarm deploy ship: tarball with positive-list build-time filter |
| whitepaper.py | 5-section security whitepaper generator |
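A positive-list filter, as named in the ship.py row, includes only paths that match an explicit pattern, so another customer's deployment directory is dropped by default rather than by exclusion. The sketch below shows the idea together with a config digest of the kind the manifest row mentions; the function names and the example patterns are assumptions.

```python
import hashlib
from fnmatch import fnmatchcase


def select_for_ship(paths: list[str], include: list[str]) -> list[str]:
    """Keep only paths matching at least one include pattern (positive list)."""
    return sorted(p for p in paths if any(fnmatchcase(p, pat) for pat in include))


def config_sha256(config_bytes: bytes) -> str:
    """Hex SHA-256 digest of the deployment config, suitable for pinning in a manifest."""
    return hashlib.sha256(config_bytes).hexdigest()


# Hypothetical include list for a customer called "acme".
ACME_INCLUDE = ["ml_team/*", "lib/*", "deployments/acme/*"]
```

With `ACME_INCLUDE`, a path such as deployments/globex/config.yaml matches no pattern and never enters the tarball.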
3.5 ml_team/backends/ — pipeline execution (4 modules)#
| File | Purpose |
|---|---|
| native_backend.py | Direct orchestration using ml_team/core/orchestrator.py. Primary path. |
| langgraph_backend.py | LangGraph supervisor execution with the same agent configs. |
| crewai_backend.py | CrewAI adapter (legacy; primarily for migration paths). |
Runtime selection via backend= on POST /pipelines.
3.6 ml_team/config/ — agent + workflow definitions#
| File | Purpose |
|---|---|
| agent_defs.py | 147-line PEP-562 __getattr__ shim. AGENT_DEFS etc. resolved lazily from lib/agents/base/. Pre-P0 was 906 lines of Python literals. |
| agent_rules/ | Deprecated; colocated with each agent now. |
| model_endpoints.py | Per-provider default endpoints + model mappings. |
| message_trimmer.py | Legacy message-size trimming (superseded by context_compaction). |
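The PEP-562 shim pattern mentioned for agent_defs.py resolves module attributes on first access instead of at import time. A minimal sketch follows; it builds a module object in code so the example fits in one file, whereas a real shim would simply define `__getattr__` at module top level. `make_lazy_module` and the loader mapping are hypothetical.

```python
import types
from typing import Callable


def make_lazy_module(name: str, loaders: dict[str, Callable[[], object]]) -> types.ModuleType:
    """Build a module whose listed attributes are computed on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str) -> object:
        # Called only when normal lookup in the module's __dict__ fails (PEP 562).
        try:
            loader = loaders[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = loader()
        setattr(module, attr, value)  # cache so later lookups skip __getattr__
        return value

    module.__getattr__ = __getattr__
    return module
```

Accessing `module.AGENT_DEFS` twice runs the loader once; unknown names still raise AttributeError, so typos are not silently swallowed.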
3.7 lib/ — library shelf (14 subdirectories, 80+ YAML files)#
lib/
├── _schema/ JSON schemas regenerated from lib_schemas.py (7 files)
├── agents/base/ 40 agents, each a YAML manifest
├── agents/teams/ team definitions
├── tools/ 25 tools, each with manifest + impl.py
├── workflows/ 3 pipelines (default_ml, fast_prototype, parallel_research)
├── mcps/ MCP integration manifests
├── skills/ Reusable analysis/drafting patterns
├── permission_rules/ Reusable rule sets
├── compliance_gates/ Reusable runtime gates
├── guardrails/ 15 guardrails across 6 categories (v0.12.0)
└── templates/
    ├── generic_ml/ baseline no-compliance template
    ├── bfsi_baseline/ 12 guardrails invariant + RBI FREE-AI rules
    └── hipaa_baseline/ stub (schema-ready)
Lib assets identified by lib/<kind>/<id>@vX.Y.Z; loader resolves + caches.
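Parsing the `lib/<kind>/<id>@vX.Y.Z` reference format could look like the sketch below. The regular expression, the `LibRef` tuple, and the example ids and version numbers are assumptions for illustration; lib_loader.py may validate references differently.

```python
import re
from typing import NamedTuple

# kind may itself contain slashes (e.g. "agents/base"); id is the last path segment.
_LIB_REF = re.compile(r"^lib/(?P<kind>.+)/(?P<id>[^/@]+)@v(?P<version>\d+\.\d+\.\d+)$")


class LibRef(NamedTuple):
    kind: str
    asset_id: str
    version: tuple[int, int, int]


def parse_lib_ref(ref: str) -> LibRef:
    """Split a lib reference into kind, id, and a comparable version tuple."""
    match = _LIB_REF.match(ref)
    if match is None:
        raise ValueError(f"not a lib reference: {ref!r}")
    major, minor, patch = (int(part) for part in match["version"].split("."))
    return LibRef(match["kind"], match["id"], (major, minor, patch))
```

Keeping the version as an integer tuple means `(0, 12, 0) > (0, 9, 3)` compares correctly, which a plain string comparison would not.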
3.8 scripts/ — utility scripts (4 files)#
| File | Purpose |
|---|---|
| gen_lib_schemas.py | Regenerate JSON Schemas from Pydantic lib_schemas.py; CI drift guard via --check. |
| gen_sbom.py | G17 wrapper around cyclonedx-py CLI for CI SBOM generation. |
| migrate_pipelines.py | P1 migration helper (pipeline YAML shape); now drift-guard only. |
| download_models.py | Legacy model-weight downloader. |
3.9 .github/workflows/ — CI (5 workflows)#
| Workflow | Purpose |
|---|---|
| ci.yml | ruff + mypy lint, pytest (py3.11 + py3.12 matrix, 1258 tests), bandit HIGH gate + semgrep ERROR gate |
| release-supply-chain.yml | G17: signed-commit gate + CycloneDX SBOM + Cosign keyless signing + base-image verify + GitHub Release upload |
| doc-drift.yml | Advisory check: subsystem IMPL/LEARNING README freshness |
| bench-nightly.yml | Nightly performance baselines |
| nightly-e2e.yml | Real-LLM golden-path end-to-end |
3.10 .project/ — documentation (this directory)#
| Path | Purpose |
|---|---|
| project.yaml | Static project metadata + governance contacts |
| architecture.md | Living architecture doc |
| decisions.md | ADR log (every non-trivial decision) |
| journal.md | Chronological narrative; appended on /done |
| security/ | BFSI procurement pack (architecture, threat model, CAIQ Lite, signing setup) |
| training/ | 10-module onboarding curriculum (8 modules written as of v0.12.0) |
| implementation/ | 27 per-phase implementation notebooks (Track 1 + Track 2 + ship tooling) |
| research/ | Stack audit + empirical research notes |
| strategy/ | GTM + positioning |
3.11 ml_team/dashboard/ — Next.js 15 dashboard#
TypeScript + React 19 + server components. See ml_team/dashboard/{IMPLEMENTATION,LEARNING}_README.md. Key surfaces:
- /login: OIDC + username/password
- /: Pipelines list + live stream
- /pipelines/[id]: Conversation tree + trace + cost breakdown
- /deployments: Champion/challenger management
- /agents: 40-agent roster with tool sets
- /transparency: Permission denials + retention + cron + batch
- /cron: Scheduler
- /plugins: Marketplace install / inspect
- /knowledge: RAG corpus
- /settings: Feature flags
Branding (product_name / logo / badges) fetched at SSR via GET /config/branding; deployment-driven.
4. Feature catalogue#
Features grouped by customer-visible capability.
4.1 Pipeline orchestration#
- 40-agent hierarchy across 7 teams (director → coordinator → worker)
- 3 pre-built pipeline configs (fast_prototype, default_ml_pipeline, parallel_research) + customer-authored YAML
- Dual backends (native + LangGraph), same agent definitions
- G10 circuit breaker: depth ≤5, fan-out ≤50
- Per-agent tool allowlist
- Evaluator/generator separation (opt-in per-agent rubric grading)
- Context compaction at 80% window
4.2 ML toolkit#
- train_classifier: LightGBM / XGBoost / RandomForest / Logistic, stratified split, metrics sidecar, optional MLflow
- detect_drift: PSI + KS + chi², BFSI thresholds
- audit_fairness: fairlearn MetricFrame, per-group metrics, binary disparate-impact
- explain_model: SHAP TreeExplainer (fast path) + generic fallback
- generate_model_card: Markdown from training + fairness + drift + SHAP
- package_model: Dockerfile + FastAPI serve.py + requirements + optional docker build
- generate_k8s_manifests: Deployment + Service + HPA with BFSI-hardened defaults
- Champion-challenger registry: atomic promotion, shadow prediction log, configurable agreement thresholds
4.3 HITL (human-in-the-loop)#
- 6 approval gate types: deploy, data request, manual, security, cost, custom
- Pipeline pauses + state persists + resume from checkpoint
- Real-time dashboard surface of pending gates
- G16 TTL + escalation via cron sweep
4.4 Auth + RBAC#
- 3 roles: viewer / operator / admin
- JWT Bearer (HS256, 24h TTL)
- OIDC SSO (Okta / Azure AD / Google Workspace, PKCE + state cookie)
- Legacy X-API-Key (admin-equivalent)
- Admin bootstrap from env vars
- Per-endpoint role guards routed through the permission engine
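The minimum-role idea behind the per-endpoint guards can be shown without any framework wiring. This is a plain-Python sketch: it assumes the three roles are ordered viewer < operator < admin, and it omits both the FastAPI dependency plumbing and the routing through the permission engine that the real `require_role` uses.

```python
from enum import IntEnum
from typing import Callable


class Role(IntEnum):
    # Assumed ordering: each role includes the rights of the ones below it.
    VIEWER = 1
    OPERATOR = 2
    ADMIN = 3


def require_role(min_role: Role) -> Callable[[Role], Role]:
    """Return a guard that rejects callers below min_role."""

    def guard(user_role: Role) -> Role:
        if user_role < min_role:
            raise PermissionError(f"{user_role.name} is below required {min_role.name}")
        return user_role

    return guard


# Example: a guard for an operator-level endpoint.
operator_guard = require_role(Role.OPERATOR)
```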
4.5 Knowledge + memory#
- 3-tier memory: per-run JSON → cross-run SQLite → cross-project Postgres (optional)
- RAG via ChromaDB (keyword-search fallback)
- Agents learn from past runs (recall_past_runs, save_agent_learning)
4.6 Observability#
- Prometheus metrics on every subsystem (28+ counters/histograms)
- OpenTelemetry spans (parent/child, tokens, costs)
- Real-time WebSocket streaming of pipeline events
- Per-agent conversation JSONL
- Structured JSON logs (G6 credential filter at root)
4.7 Customer composability (Track 1)#
- Three-layer architecture: core/ + lib/ + deployments/<customer>/
- Agent composition via extends: lib/agents/base/<id>@vX.Y.Z + overlays
- SWARM_DEPLOYMENT env var selects active deployment at boot
- BFSI baseline template with real RBI FREE-AI rules + fairness gate + drift gate + 2555d retention
- Invariant-DENY floor: profile rules at priority 60 beat operator POLICY ALLOW at 50
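The invariant-DENY floor can be illustrated with a toy resolver in which the highest-priority matching rule wins, so a profile DENY at 60 overrides an operator policy ALLOW at 50. This sketch covers only that one property: the `Rule` shape, the `resolve` function, and the "ask" default are assumptions, and the real engine's tier handling and tiebreaks are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str     # e.g. "profile" or "policy"
    behavior: str   # "allow" or "deny"
    priority: int
    tool: str


def resolve(rules: list[Rule], tool: str, default: str = "ask") -> str:
    """Return the behavior of the highest-priority rule that matches the tool."""
    matching = [r for r in rules if r.tool == tool]
    if not matching:
        return default  # hypothetical fallback when no rule applies
    # Equal priorities resolve to the first rule in input order in this sketch.
    return max(matching, key=lambda r: r.priority).behavior


RULES = [
    Rule("policy", "allow", 50, "package_model"),   # operator-authored ALLOW
    Rule("profile", "deny", 60, "package_model"),   # invariant DENY from the profile
]
```

Here `resolve(RULES, "package_model")` yields "deny": the operator cannot lift a restriction that the profile marks as invariant.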
4.8 Runtime guardrails (Track 2)#
15 guardrails detailed in § 5.
4.9 swarm deploy ship pipeline#
- swarm deploy new: scaffold from template
- swarm deploy validate: lint config + lib refs
- swarm deploy ship: tarball + MANIFEST + whitepaper with build-time per-customer isolation
- swarm deploy whitepaper: emit standalone markdown whitepaper
- G17 CI release workflow: signed-commit + SBOM + Cosign keyless + base-image verify
4.10 Audit + compliance#
- Tamper-evident audit PDF with SHA-256 of source bundle on cover
- G11 Cosign signing (4 drivers)
- G14 data lineage embedded in audit PDF
- G15 right-to-be-forgotten with signed receipts
- 2555-day retention for BFSI; configurable per-artefact class
- permission_denials SQLite ledger, source-attributed
4.11 Ops primitives (W7)#
- Cron scheduler (4 task kinds + custom)
- Batch runner (JSONL → processor → results.jsonl with checkpointing)
- Retention daemon
- Feature-flag registry with 3 tiers (INVARIANT / FLAG / USER_OVERRIDE)
4.12 Plugin ecosystem#
- Claude Code marketplace-compatible
- Install-time manifest validation
- Skills, MCPs, hooks (Python + shell), commands, agents
- Namespace-isolated agents (plugin-<name>::<agent>)
- Shell hooks are feature-flag gated, rlimited, and audited per execution
5. Guardrail reference (15 controls)#
Full per-guardrail detail with integration points, config schemas, file paths, and honest ceilings lives in .project/training/modules/06-guardrails-catalogue.md. Summary table:
| # | Guardrail | Category | Integration | Files |
|---|---|---|---|---|
| G1 | egress_allowlist | network | permission source | lib/guardrails/network/egress_allowlist/ + ml_team/core/egress_allowlist.py |
| G2 | python_sandbox | execution | wraps execute_python | lib/guardrails/execution/python_sandbox/ + ml_team/core/python_sandbox.py + ml_team/tools/execution.py |
| G3 | prompt_injection_heuristic | input_safety | PRE_LLM priority 70 | lib/guardrails/input_safety/prompt_injection_heuristic/ |
| G4 | regex_pii + presidio_pii | pii | POST_LLM + POST_TOOL + STORAGE_WRITE | ml_team/core/pii/ + lib/guardrails/pii/ |
| G5 | conversation_scrubber | persistence | ConversationStore._flush_locked | lib/guardrails/persistence/conversation_scrubber/ + ml_team/core/conversation_scrubber.py |
| G6 | logs_credential_filter | persistence | root logger | lib/guardrails/persistence/logs_credential_filter/ + ml_team/core/logging_config.py |
| G7 | per_user_rate_limit | rate_limits | API middleware | ml_team/api/rate_limit.py |
| G10 | delegation_loop_detector | agent_safety | AGENT_DELEGATE priority 80 | lib/guardrails/platform_integrity/delegation_loop_detector/ |
| G11 | audit_pdf_signing | integrity | export_audit_report post-render | lib/guardrails/integrity/audit_pdf_signing/ + ml_team/core/audit_signer.py |
| G12 | encryption_at_rest | persistence | module-level | ml_team/core/encryption.py |
| G13 | byok_kms | persistence | KeyProvider Protocol | ml_team/core/encryption.py |
| G14 | data_lineage | platform_integrity | recording helpers | ml_team/core/lineage.py + ml_team/api/database.py |
| G15 | right_to_be_forgotten | persistence | DELETE /subjects/{id} | ml_team/core/rtbf.py + ml_team/api/routers/subjects.py |
| G16 | hitl_timeout_escalation | agent_safety | approval.py + hitl_sweep.py | - |
| G17 | sbom_signed_commits | platform_integrity | CI workflow | .github/workflows/release-supply-chain.yml + scripts/gen_sbom.py |
Bootstrap: ml_team/core/guardrail_bootstrap.py::bootstrap_from_deployment() auto-configures the 6 configurable guardrails from deployments/<customer>/config.yaml::guardrail_configs at API startup.
6. Subsystem composition (how pieces fit together)#
Reference diagram in .project/security/architecture.md. Summary of composition contracts:
6.1 Permission engine ↔ guardrails#
The permission engine is the bouncer at every tool dispatch. Guardrails like G1 (egress allowlist), G7 (rate limit), G10 (delegation-loop), and the compliance gates plug in as rule sources — functions (PermissionContext) → list[PermissionRule] registered via init_default_sources. Other guardrails (G3 prompt injection, G4 PII, G5 scrubber) run at integration-point hooks registered with ml_team.core.guardrails.registry.
Two parallel pipelines, different semantics:
- Rule-source pipeline: emits typed rules, tier-sorted, winner picked via ALLOW > DENY > ASK > default.
- Hook-integration-point pipeline: handlers run priority-ordered, ALLOW chains, REDACT threads the payload, DENY short-circuits.
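The hook-pipeline semantics above (ALLOW chains, REDACT threads the payload, DENY short-circuits) can be sketched as follows. This is an illustration under stated assumptions, not the registry's real interface: `HookResult`, `run_hooks`, and both handlers are hypothetical names, and running higher priorities first is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class HookResult:
    action: str          # "ALLOW", "REDACT", or "DENY"
    payload: Any = None  # replacement payload, used only with "REDACT"
    reason: str = ""


def run_hooks(handlers: List[Tuple[int, Callable[[Any], HookResult]]], payload: Any):
    """Return (action, payload, reason) after running handlers by priority."""
    redacted = False
    # Assumption: higher priority runs first.
    for _, handler in sorted(handlers, key=lambda h: h[0], reverse=True):
        result = handler(payload)
        if result.action == "DENY":
            return "DENY", payload, result.reason  # short-circuit: later handlers never run
        if result.action == "REDACT":
            payload, redacted = result.payload, True  # later handlers see the redacted payload
    return ("REDACT" if redacted else "ALLOW"), payload, ""


def block_injection(text):
    # Toy heuristic standing in for a prompt-injection check.
    if "ignore previous instructions" in text.lower():
        return HookResult("DENY", reason="possible prompt injection")
    return HookResult("ALLOW")


def mask_long_numbers(text):
    # Toy PII masker: hides any run of ten or more digits.
    masked = re.sub(r"\d{10,}", "[REDACTED]", text)
    return HookResult("REDACT", masked) if masked != text else HookResult("ALLOW")


HANDLERS = [(50, mask_long_numbers), (70, block_injection)]
print(run_hooks(HANDLERS, "call 9876543210"))
```

The key contrast with the rule-source pipeline is that a hook can modify the payload in flight, whereas a rule source only emits decisions.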
6.2 Profile loader ↔ deployment loader#
SWARM_DEPLOYMENT=deployments/hdfc_bank at boot triggers deployment_loader.load():
1. Reads config.yaml → validates as DeploymentConfig (Pydantic strict).
2. Resolves based_on: lib/templates/bfsi_baseline@v1.0.0 → profile_loader.load(ref) → LoadedProfile (permission rules + compliance gates + retention overrides).
3. Composes customer agents via agent_composer (extends + overlays).
4. Registers active profile with permissions.profile_source + compliance_gates.set_active_gates.
5. Calls guardrail_bootstrap.bootstrap_from_deployment() to configure G1/G2/G4/G5/G11/G12+G13.
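Step 2 depends on a version-pinned reference of the form `<path>@vX.Y.Z`. A minimal sketch of splitting such a reference is shown below; `LibRef` and `parse_lib_ref` are hypothetical helpers, and the accepted character set is an assumption rather than the loader's actual grammar.

```python
import re
from typing import NamedTuple


class LibRef(NamedTuple):
    path: str
    version: str


# Assumed grammar: a slash-separated path, then "@v" and a three-part version.
_REF_RE = re.compile(r"^(?P<path>[\w./-]+)@v(?P<version>\d+\.\d+\.\d+)$")


def parse_lib_ref(ref: str) -> LibRef:
    """Split 'lib/templates/bfsi_baseline@v1.0.0' into path and version."""
    match = _REF_RE.match(ref)
    if match is None:
        raise ValueError(f"not a pinned lib ref: {ref!r}")
    return LibRef(match["path"], match["version"])


print(parse_lib_ref("lib/templates/bfsi_baseline@v1.0.0"))
```

Rejecting unpinned references at parse time keeps a deployment from silently tracking a moving template.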
6.3 Tools ↔ permission engine ↔ guardrails#
A tool call (e.g., http_request({url: ...})) flows:
AgentRunner.run()
↓ decides to call tool
ToolExecutor.execute(tool_name, args, ctx)
↓
permissions.check(ctx) ← permission engine + rule sources
↓ if ALLOW
evaluator.run(PRE_TOOL, payload=args) ← guardrail registry (if registered)
↓ if ALLOW/REDACT
tool implementation runs
↓
evaluator.run(POST_TOOL, payload=result) ← G4 PII scrub
↓
compliance_gates.record_tool_result(run_id, tool, result)
↓
return final result
6.4 Conversation store ↔ scrubber#
ConversationStore.record_message(msg) buffers per-agent. _flush_locked(agent) runs at 10 messages or 1 second; each line passes through conversation_scrubber.scrub_line (no-op when inactive). Disk state = what's allowed past the scrubber.
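The buffering rule (flush at 10 messages or 1 second, scrub each line on the way to disk) can be sketched as below. `BufferedStore` is an illustrative stand-in, not the real `ConversationStore`: it has no locking or background timer, so the age check only fires on the next `record_message` call, and the `sink` and `clock` parameters are assumptions added for testability.

```python
import json
import time
from collections import defaultdict


class BufferedStore:
    """Illustrative per-agent buffer with a count and an age threshold."""

    def __init__(self, sink, scrub=lambda line: line, max_msgs=10,
                 max_age=1.0, clock=time.monotonic):
        self._sink = sink          # callable(agent, line) that persists one line
        self._scrub = scrub        # identity when no scrubber is active
        self._max_msgs = max_msgs
        self._max_age = max_age
        self._clock = clock
        self._buffers = defaultdict(list)
        self._first_ts = {}

    def record_message(self, agent, msg):
        buf = self._buffers[agent]
        if not buf:
            self._first_ts[agent] = self._clock()  # age is measured from the oldest buffered message
        buf.append(msg)
        too_many = len(buf) >= self._max_msgs
        too_old = self._clock() - self._first_ts[agent] >= self._max_age
        if too_many or too_old:
            self.flush(agent)

    def flush(self, agent):
        for msg in self._buffers.pop(agent, []):
            # Every line passes through the scrubber before it reaches the sink.
            self._sink(agent, self._scrub(json.dumps(msg, sort_keys=True)))
        self._first_ts.pop(agent, None)
```

Because scrubbing happens inside the flush path, there is no code path in this sketch that writes an unscrubbed line.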
6.5 Audit PDF ↔ lineage ↔ signer#
tools/audit_pdf.py::export_audit_report(run_id, work_dir):
1. Collects artefacts (*_metrics.json, *_card.md, fairness, SHAP, drift).
2. Computes source-bundle SHA-256 (tamper-evident cover).
3. Calls lineage.chain_for_deployment(deployment_id) for the lineage section.
4. Renders PDF via ReportLab.
5. If audit_signer.is_configured() → sign_pdf(pdf_path) → writes .sig + .pem + .signature.json sidecar.
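Step 2 is what makes the cover hash reproducible. One way to obtain a digest that does not depend on file enumeration order is sketched below. This shows the general technique only; how `export_audit_report` actually combines the artefacts is not specified here, and `bundle_sha256` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def bundle_sha256(paths):
    """Hash file names and contents in name-sorted order."""
    digest = hashlib.sha256()
    for path in sorted((Path(p) for p in paths), key=lambda p: p.name):
        data = path.read_bytes()
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")                          # separator between name and length
        digest.update(len(data).to_bytes(8, "big"))   # length prefix avoids ambiguous concatenation
        digest.update(data)
    return digest.hexdigest()
```

Sorting by name and length-prefixing each file means identical inputs always yield the same digest, while any change to a name or a byte of content changes it.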
6.6 G15 RTBF ↔ lineage#
erase_subject(subject_id, requested_by, dry_run=False):
1. lineage.subjects_in_dataset(subject_id) → dataset IDs via consent_doc_ref index.
2. Collect downstream models + deployments for the receipt (via models_for_dataset + deployments_for_model).
3. Delete dataset rows (FK cascade: lineage_models.dataset_id → NULL; deployments untouched).
4. Walk conversation JSONL roots; rewrite subject-containing lines to tombstones.
5. Sign receipt (SHA-256 over sorted-JSON payload excluding the signature field).
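Step 5 can be sketched as a digest over canonical JSON. The field names in the usage example (`subject_id`, `requested_by`, `dataset_ids`) and the helper names are assumptions for illustration; only the "SHA-256 over sorted-JSON payload excluding the signature field" rule comes from the text above.

```python
import hashlib
import json


def sign_receipt(receipt: dict) -> dict:
    """Return a copy of the receipt with a SHA-256 'signature' field."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    # sort_keys plus fixed separators gives one canonical byte string per payload.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {**body, "signature": hashlib.sha256(canonical).hexdigest()}


def verify_receipt(receipt: dict) -> bool:
    """Recompute the digest and compare it with the stored signature."""
    return sign_receipt(receipt)["signature"] == receipt.get("signature")


receipt = sign_receipt({
    "subject_id": "cust-xyz",
    "requested_by": "admin_user",
    "dataset_ids": ["ds1"],
})
print(verify_receipt(receipt))  # True
```

Note that an unkeyed SHA-256 detects accidental or naive modification of the receipt; it does not by itself prove who issued it.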
7. Lifecycle flows#
7.1 Pipeline run (API-triggered)#
POST /api/v1/pipelines { problem_statement, data_path, provider, backend, pipeline_config }
↓ (auth: JWT)
pipelines._start_pipeline_inproc → work_dir created
↓
backends/native_backend.py runs orchestrator.supervisor_loop
↓
ML Director agent → transfer_to_data_team_lead → data_profiler.run_tool(...)
↓ (every tool call → permission engine → guardrails → ToolExecutor)
...continues through data / algorithm / training / evaluation / deployment / quality teams...
↓
Audit PDF generated at end; optionally G11-signed
↓
Run marked complete; dashboard streams final state via WebSocket
7.2 Admin right-to-be-forgotten#
DELETE /api/v1/subjects/cust-xyz (admin JWT required)
↓
require_role(Role.ADMIN) → permission engine check → permission_denials if fail
↓
erase_subject(subject_id=cust-xyz, requested_by=admin_user, dry_run=False)
↓
SHA-256-signed ErasureReceipt returned + 2555-day audit row preserved
7.3 Ship pipeline#
python -m ml_team.cli deploy new hdfc_bank --template=bfsi_baseline
↓ scaffold.new_deployment
python -m ml_team.cli deploy validate hdfc_bank
↓ validator.validate → ValidationReport
python -m ml_team.cli deploy ship hdfc_bank --output=./dist
↓ ship.ship (validates → collects files → builds MANIFEST → tarfile writes)
↓ whitepaper.write_whitepaper
dist/hdfc_bank-v0.1.0/{tarball, MANIFEST.yaml, whitepaper.md}
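The per-customer isolation performed during the ship step can be sketched with the standard `tarfile` filter hook. This is a simplified illustration: `build_customer_tarball` is a hypothetical function, the assumption that the archive contains `core/`, `lib/`, and one `deployments/<customer>/` directory follows the three-layer layout, and MANIFEST and whitepaper generation are omitted.

```python
import tarfile
from pathlib import Path


def build_customer_tarball(repo_root, customer, out_path):
    """Archive core/, lib/ and only deployments/<customer>/; return member names."""
    root = Path(repo_root)

    def keep(info):
        parts = Path(info.name).parts
        # Drop deployments/<other>/... while keeping deployments/ itself.
        if len(parts) >= 2 and parts[0] == "deployments" and parts[1] != customer:
            return None  # returning None excludes the member and its children
        return info

    with tarfile.open(out_path, "w:gz") as tar:
        for top in ("core", "lib", "deployments"):
            if (root / top).is_dir():
                tar.add(str(root / top), arcname=top, filter=keep)

    with tarfile.open(out_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Filtering at archive-build time, rather than deleting files afterwards, means another customer's directory is never written into the artefact at all.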
7.4 Release (CI-triggered)#
git tag -s v0.12.1 -m "..."
git push origin v0.12.1
↓ triggers .github/workflows/release-supply-chain.yml
verify-signed-commits → cyclonedx-bom generate → cosign verify base image
→ tar --exclude-list archive → cosign sign-blob (tarball + SBOM) via GitHub OIDC
→ softprops/action-gh-release uploads 6 assets
→ Rekor transparency log recorded
8. Operational surface#
8.1 CLI (swarm)#
swarm login / logout / whoami / health
swarm features {list, get, set, reset}
swarm pipelines {list, run, status, cancel}
swarm plugins {list, inspect, install}
swarm cron {list, create, run, delete, runs}
swarm batch {list, submit, status, results, resume}
swarm deployments {list, promote, retire} # champion-challenger runtime
swarm deploy {new, validate, ship, whitepaper} # per-customer ship pipeline
8.2 Environment variables#
Full table in README.md § Configuration reference. Key ones:
| Var | Default | Role |
|---|---|---|
| ML_TEAM_JWT_SECRET | ephemeral | HS256 JWT signing |
| OPENAI_API_KEY / ANTHROPIC_API_KEY | - | LLM provider |
| SWARM_DEPLOYMENT | deployments/_dev_scaffold | Active deployment |
| SWARM_KEK | - | G12/G13 EnvKeyProvider KEK |
| ML_TEAM_DB | ml_team_runs.db | SQLite path |
| ML_TEAM_OIDC_* | - | OIDC SSO wiring |
| SWARM_ENFORCE_TOOL_ALLOWLIST | 1 | Hard-block vs log-only on allowlist miss |
| RATE_LIMIT_ROLE_{VIEWER,OPERATOR,ADMIN}_READS/WRITES | per-role defaults | G7 per-role rate limits |
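A minimal sketch of reading a few of these variables with their documented defaults follows. `load_settings` and its dictionary keys are hypothetical, and treating `SWARM_ENFORCE_TOOL_ALLOWLIST=1` as "hard-block" is an assumption inferred from the default.

```python
import os


def load_settings(env=os.environ):
    """Resolve selected settings, falling back to the documented defaults."""
    return {
        "deployment": env.get("SWARM_DEPLOYMENT", "deployments/_dev_scaffold"),
        "db_path": env.get("ML_TEAM_DB", "ml_team_runs.db"),
        # Assumption: "1" means hard-block, anything else means log-only.
        "enforce_tool_allowlist": env.get("SWARM_ENFORCE_TOOL_ALLOWLIST", "1") == "1",
        # Only report whether a KEK is present; never echo the key itself.
        "kek_configured": bool(env.get("SWARM_KEK")),
    }


print(load_settings({"SWARM_DEPLOYMENT": "deployments/hdfc_bank"}))
```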
8.3 Dashboards + operator-visible surfaces#
- /: Pipelines
- /pipelines/[id]: Conversation tree / trace / cost / timeline
- /deployments: Champion/challenger
- /agents: 40-agent roster
- /transparency: Permission denials + retention + cron + batch
- /cron: Scheduler
- /plugins: Marketplace install
- /knowledge: RAG corpus
- /settings: Feature flags
- /docs: In-app documentation browser
9. Test matrix#
1258 tests at v0.12.0. Distribution:
| Area | Count | Examples |
|---|---|---|
| Core permission engine | ~45 | test_permissions.py, test_permission_sources.py, test_profile_permission_sources.py |
| Guardrails (Track 2) | 314 | 17 (bootstrap) + 28 (G1) + 17 (G2) + 30 (G3) + 29 (G4) + 16 (G5) + 30 (G6) + 20 (G7) + 20 (G10) + 18 (G11) + 21 (G12+G13) + 22 (G14) + 15 (G15) + 20 (G16) + 11 (G17); full list |
| ML toolkit | ~70 | test_training_tool.py, test_drift.py, test_fairness.py, test_explainability.py |
| Deployment + ship tooling | 24 | test_deploy_cli.py (scaffold + validator + manifest + ship + whitepaper + CLI dispatch) |
| Lib loader + composition | ~70 | test_lib_loader.py, test_agent_composer.py, test_lib_schemas.py |
| Agent runtime + tool executor | ~50 | test_agent_runner.py, test_tool_executor.py |
| API routers | ~90 | One file per router |
| Plugin ecosystem | ~60 | test_plugin_*.py (install drops, shell hooks, commands, agents, compat smoke) |
| BFSI baseline e2e | 11 | test_bfsi_baseline_e2e.py — biased-model → deploy blocked |
| Everything else | rest | Snapshot parity, workflow migration, feature flags, etc. |
Invariants enforced by the test suite:
- Default deployment reproduces pre-refactor behaviour byte-identically
- Every guardrail is a no-op when unconfigured
- Profile DENY at priority 60 beats operator POLICY ALLOW at 50
- Tarball excludes other customers' deployments/
- Audit PDF bundle hash is deterministic given identical inputs
Regression gate: pytest ml_team/tests -q --deselect test_deploy_tools.py::test_package_model_actually_builds_if_docker_present.
10. Data model#
10.1 SQLite tables (production)#
| Table | Purpose | Key columns |
|---|---|---|
| users | Authenticated users + role | username PK, role, password_hash |
| runs | Pipeline runs | run_id PK, status, problem_statement, work_dir, profile_at_creation |
| run_events | Per-run event log | id PK, run_id FK, event_data JSON |
| permission_denials | Every denial decision | ts, run_id, tool_name, source, reason, agent |
| model_deployments | Champion-challenger registry | deployment_id PK, model_name, status, traffic_pct |
| shadow_predictions | Shadow-log for challenger agreement | run_id, model_id, prediction, ground_truth |
| approvals | HITL gates | gate_id PK, gate_type, ttl_seconds, escalated_at |
| plugin_installations | Installed plugins | name PK, version, install_drops_json |
| plugin_shell_executions | Shell-hook audit | ts, plugin, command, exit_code |
| G14 tables (v0.12.0) | | |
| datasets | Dataset lineage | dataset_id PK, consent_doc_ref (G15 lookup) |
| lineage_models | Model lineage | model_id PK, dataset_id FK ON DELETE SET NULL |
| lineage_deployments | Deployment lineage | deployment_id PK, model_id FK ON DELETE CASCADE |
FK enforcement: PRAGMA foreign_keys=ON everywhere. WAL journal mode.
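The effect of the two foreign-key actions on the G14 tables can be checked with an in-memory database. The sketch below uses a reduced schema containing only the key columns listed above, so it is an approximation of the production DDL; `lineage_fk_demo` is a hypothetical name.

```python
import sqlite3


def lineage_fk_demo():
    """Show ON DELETE SET NULL and ON DELETE CASCADE on a reduced schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign keys off by default; the pragma is per connection.
    conn.execute("PRAGMA foreign_keys=ON")
    conn.executescript("""
        CREATE TABLE datasets (dataset_id TEXT PRIMARY KEY, consent_doc_ref TEXT);
        CREATE TABLE lineage_models (
            model_id TEXT PRIMARY KEY,
            dataset_id TEXT REFERENCES datasets(dataset_id) ON DELETE SET NULL
        );
        CREATE TABLE lineage_deployments (
            deployment_id TEXT PRIMARY KEY,
            model_id TEXT REFERENCES lineage_models(model_id) ON DELETE CASCADE
        );
        INSERT INTO datasets VALUES ('ds1', 'customer-alice');
        INSERT INTO lineage_models VALUES ('m1', 'ds1');
        INSERT INTO lineage_deployments VALUES ('deploy1', 'm1');
    """)
    count_sql = "SELECT COUNT(*) FROM lineage_deployments"

    # Deleting the dataset nulls the model's reference; the deployment survives.
    conn.execute("DELETE FROM datasets WHERE dataset_id = 'ds1'")
    model_dataset = conn.execute(
        "SELECT dataset_id FROM lineage_models WHERE model_id = 'm1'"
    ).fetchone()[0]
    after_dataset_delete = conn.execute(count_sql).fetchone()[0]

    # Deleting the model cascades to its deployments.
    conn.execute("DELETE FROM lineage_models WHERE model_id = 'm1'")
    after_model_delete = conn.execute(count_sql).fetchone()[0]
    conn.close()
    return model_dataset, after_dataset_delete, after_model_delete


print(lineage_fk_demo())  # (None, 1, 0)
```

Without the pragma, both deletes would succeed but leave dangling references, which is why enforcement has to be enabled on every connection.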
10.2 JSONL files (per-run work_dir)#
{work_dir}/
├── metrics.json (from train_classifier)
├── *_card.md (from generate_model_card)
├── fairness.json (from audit_fairness)
├── drift.json (from detect_drift)
├── shap.json (from explain_model)
├── approvals.json (HITL gate state)
├── audit_report_<run_id>.pdf (from export_audit_report)
│ audit_report_<run_id>.pdf.sig (G11 signature)
│ audit_report_<run_id>.pdf.pem (G11 cert if keyless)
│ audit_report_<run_id>.pdf.signature.json (G11 receipt)
└── conversations/
├── _index.json (agent hierarchy + timing)
├── ml_director.jsonl (one ConversationMessage per line)
├── data_team_lead.jsonl
└── ...
Conversation JSONL format (per line):
{
"message_id": "uuid",
"agent_name": "ml_director",
"parent_agent": "",
"run_id": "run_abc123",
"role": "user|assistant|tool|system|tool_denial",
"content": "…",
"tool_name": "",
"sequence_number": 0,
"metadata": {...},
"_redacted": true // present only when G5 scrubbed this line
}
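A minimal sketch of reading one agent's JSONL and listing the lines that G5 scrubbed is shown below. It relies only on the `sequence_number`, `message_id`, and `_redacted` fields from the format above; `load_conversation` is a hypothetical helper.

```python
import json


def load_conversation(lines):
    """Parse JSONL lines; return (messages in sequence order, redacted message ids)."""
    messages = [json.loads(line) for line in lines if line.strip()]
    messages.sort(key=lambda m: m["sequence_number"])
    # "_redacted" is present only on scrubbed lines, so default to falsy.
    redacted = [m["message_id"] for m in messages if m.get("_redacted")]
    return messages, redacted
```

In practice `lines` would come from iterating over an open file such as `conversations/ml_director.jsonl`.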
10.3 SQL-grep audit queries#
-- Who denied what, in the last 24h
SELECT datetime(ts, 'unixepoch'), source, tool_name, reason, agent, run_id
FROM permission_denials
WHERE ts > unixepoch() - 86400
ORDER BY ts DESC;
-- Every G1 egress denial by target host
SELECT tool_name, reason, COUNT(*)
FROM permission_denials
WHERE source = 'egress_allowlist'
GROUP BY tool_name, reason ORDER BY 3 DESC;
-- Datasets tagged with a specific consent doc (G15 lookup)
SELECT dataset_id, name, row_count, created_at
FROM datasets WHERE consent_doc_ref = 'customer-alice';
-- Complete lineage chain for a deployment
SELECT d.deployment_id, d.environment, d.deployed_at,
m.model_id, m.algorithm, m.weights_sha256,
ds.dataset_id, ds.name, ds.sha256
FROM lineage_deployments d
LEFT JOIN lineage_models m ON m.model_id = d.model_id
LEFT JOIN datasets ds ON ds.dataset_id = m.dataset_id
WHERE d.deployment_id = 'deploy_abc123';
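The same queries can be run from Python with the standard `sqlite3` module. The sketch below reimplements the 24-hour denial query with a bound parameter instead of `unixepoch()`, which keeps it usable on older SQLite builds. It assumes `ts` holds Unix epoch seconds, as the `datetime(ts, 'unixepoch')` call above suggests; `recent_denials` is a hypothetical helper.

```python
import sqlite3
import time


def recent_denials(conn, window_seconds=86400, now=None):
    """Return denial rows newer than the cutoff, newest first."""
    cutoff = (time.time() if now is None else now) - window_seconds
    return conn.execute(
        "SELECT ts, source, tool_name, reason, agent, run_id "
        "FROM permission_denials WHERE ts > ? ORDER BY ts DESC",
        (cutoff,),  # bound parameter, never string-formatted into the SQL
    ).fetchall()
```

A caller would open the database named by ML_TEAM_DB, for example `sqlite3.connect("ml_team_runs.db")`, and pass that connection in.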
11. Glossary#
| Term | Meaning |
|---|---|
| Agent | One LLM persona with a system prompt + tool allowlist + tier. Defined in lib/agents/base/<id>/agent.yaml, loaded lazily by the agent_defs.py PEP-562 shim. |
| Allowlist (tool) | Per-agent tool_set — only these tools are dispatchable. Denials recorded in permission_denials. |
| Bootstrap (guardrail) | Auto-configure 6 configurable guardrails from deployment guardrail_configs at API boot. |
| BYOK | Bring Your Own Key — customer's KMS holds the KEK; swarm only sees wrapped DEKs. |
| CAIQ | Cloud Security Alliance Consensus Assessments Initiative Questionnaire. |
| Circuit breaker | G10 delegation-loop detector (depth ≤5, fan-out ≤50). |
| Compliance gate | Runtime gate emitting DENY via compliance_gate_source at priority 55. |
| Composition | Agent-YAML extends + overlay pattern. Customer agents extend lib bases. |
| Cosign keyless | Sigstore signing via GitHub OIDC; ephemeral cert + Rekor log entry. |
| DEK | Data Encryption Key — per-call AES-256 key, wrapped by KEK. |
| Deployment (runtime) | A customer instance loaded via SWARM_DEPLOYMENT=<path>. |
| Deployment (G14/K8s) | A model in production (lineage_deployments row). |
| DPO | Data Protection Officer — required under DPDP Act § 10(8); swarm founder self-designates until team growth. |
| ErasureReceipt | Signed dict returned by G15; SHA-256 over sorted-JSON payload. |
| Feature flag | 3-tier knob (INVARIANT / FLAG / USER_OVERRIDE) registry. |
| HITL | Human-in-the-Loop — 6 approval-gate types + TTL + escalation (G16). |
| Hook event | 10 integration points: SESSION_START, PRE/POST_TOOL, PRE/POST_COMPACTION, PRE_LLM, POST_LLM, STORAGE_WRITE, LLM_CALL_WRAPPER, AGENT_DELEGATE. |
| Invariant-DENY floor | Profile DENY rules at priority 60. Operator POLICY ALLOW at 50 cannot override. |
| KEK | Key Encryption Key — customer-owned, held in their KMS, wraps DEKs. |
| Library shelf | /lib/ — versioned reusable building blocks (agents, tools, workflows, guardrails, templates). |
| Lineage chain | datasets → models → deployments joined view; embedded in audit PDFs. |
| MANIFEST.yaml | Per-ship record of pinned lib versions + config SHA-256 + build commit + timestamp. |
| MCP | Model Context Protocol — JSON-RPC tool-server standard. |
| Permission source | Pure function (ctx) → list[PermissionRule] registered with the engine. 8 default sources in v0.12.0. |
| Priority band | RBAC 10 / ALLOWLIST 20 / FLAG 30 / HITL 40 / PROFILE_ASK 45 / POLICY 50 / COMPLIANCE 55 / PROFILE_DENY 60. |
| Profile | Customer-compliance baseline loaded via based_on: lib/templates/<id>@vX.Y.Z — permission rules + compliance gates + retention. |
| Rekor | Sigstore transparency log — append-only, censorship-resistant, customer-verifiable. |
| Run | A pipeline execution from problem-statement to audit-PDF. Identified by run_id. |
| SBOM | Software Bill of Materials — CycloneDX 1.5 JSON generated on every release. |
| Scrubber | G5 JSONL-line PII redactor at ConversationStore._flush_locked. |
| SIG | G17 Cosign signature file alongside a release asset. |
| STRIDE | Threat-model methodology: Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege. |
| Sutra | RBI FREE-AI pillar. 7 total; swarm maps to Pillars 2, 3, 5, 6 in full. |
| Tier (feature flag) | INVARIANT (immutable) / FLAG (operator-togglable) / USER_OVERRIDE (per-call). |
| Tier (guardrail) | invariant (priority 60) / flag (priority 45-50) / off. |
| Tombstone | G15 JSONL-line replacement when a subject is erased; valid JSON, preserves order + count. |
| Track 1 / Track 2 | 6-phase platform refactor (P0-P5) + 9-wave guardrails subsystem. Both complete in v0.12.0. |
| Work dir | pipeline_runs/<run_id>/ — per-run scratch with metrics + conversations + audit PDF. |
Last updated: 2026-04-22 (v0.12.0)
Maintained by: TheAiSingularity · security@theaisingularity.org
Report drift: open an issue or email. Docs must match code; drift is a bug.