Permissions & audit
Every agent action, every tool call, every HTTP endpoint, every HITL approval — all resolve through one unified permission engine. Every denial is attributed to its source rule and lands in a SQL-queryable audit table.
This is the single most important concept for regulated-industry buyers. Read this page carefully.
The resolution pipeline
 incoming request / tool call
               │
               ▼
┌─────────────────────────────┐
│  collect rules from 5       │
│  default sources:           │
│   · RBAC (user role)        │
│   · agent allowlist         │
│   · feature flags           │
│   · HITL approval gates     │
│   · YAML policy             │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│  DENY wins over ASK         │
│  ASK wins over ALLOW        │
│  ALLOW wins over default    │
│  (higher priority wins      │
│   within a tier)            │
└──────────────┬──────────────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
  ALLOW      DENY        ASK
  (runs)   (blocked,  (HITL gate,
            audited)   operator
                       prompt)
Implemented in ml_team/core/permissions.py. Test fixtures need to call two helpers: reset_sources() clears the rule sources, and mark_uninitialized() resets lazy initialization.
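The resolution step can be sketched in a few lines. This is a minimal illustration and not the code in ml_team/core/permissions.py: the PermissionRule fields, the Decision enum, and the resolve() helper are assumed names. The tier order is passed in explicitly so the sketch does not hard-code a precedence; take the real order from the engine itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Sequence


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


@dataclass(frozen=True)
class PermissionRule:
    # Hypothetical shape; the real class may carry different fields.
    decision: Decision
    source: str        # e.g. "rbac:admin", "agent_allowlist", "hitl:deploy"
    priority: int = 0  # higher priority wins within the same tier
    reason: str = ""


def resolve(
    rules: Iterable[PermissionRule],
    tier_order: Sequence[Decision],
) -> Optional[PermissionRule]:
    """Return the winning rule, or None when no rule applies (use the default)."""
    rules = list(rules)
    for tier in tier_order:  # strongest tier first
        in_tier = [r for r in rules if r.decision is tier]
        if in_tier:
            return max(in_tier, key=lambda r: r.priority)
    return None


# Illustrative order only; confirm the real precedence against the engine.
STRICTEST_FIRST = (Decision.DENY, Decision.ASK, Decision.ALLOW)

rules = [
    PermissionRule(Decision.ALLOW, "agent_allowlist"),
    PermissionRule(Decision.ASK, "hitl:deploy", priority=5),
    PermissionRule(Decision.ASK, "policy", priority=1),
]
winner = resolve(rules, STRICTEST_FIRST)
print(winner.decision.value, winner.source)  # ask hitl:deploy
```

Returning the winning rule rather than a bare decision keeps the source attribution attached, which is what a denial record needs for its rule_source column.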
The five rule sources
| Source | Example rule |
|---|---|
| RBAC | require_role(Role.ADMIN) translates into deny for role < admin on admin endpoints |
| Agent allowlist | The per-agent tools=[...] list translates into allow for listed tools, deny for all others |
| Feature flags | Flag auto_fairness_audit=off translates into deny on the fairness_audit tool path |
| HITL gates | Tool decorated with require_approval("deploy") translates into ask until operator approves |
| YAML policy | config/permission_policies.yaml — operator-authored rules. Empty default; BFSI customers ship their own |
Each source is a pure (context) -> list[PermissionRule] function — swappable, testable, plugin-addable.
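As a rough illustration of that shape, the agent allowlist source could look like the sketch below. The Context fields, the string-valued decision, the collect() helper, and the drop_nulls tool name are assumptions made for the example; only the (context) -> list[PermissionRule] signature comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class PermissionRule:
    decision: str  # "allow", "deny", or "ask"
    source: str
    reason: str = ""


@dataclass(frozen=True)
class Context:
    # Hypothetical request context; the field names are assumptions.
    agent_name: str
    tool_name: str
    agent_tools: Tuple[str, ...] = ()


RuleSource = Callable[[Context], List[PermissionRule]]


def agent_allowlist_source(ctx: Context) -> List[PermissionRule]:
    """Allow tools on the agent's tools=[...] list and deny everything else."""
    if ctx.tool_name in ctx.agent_tools:
        return [PermissionRule("allow", "agent_allowlist")]
    reason = f"{ctx.tool_name} is not in the tool list of {ctx.agent_name}"
    return [PermissionRule("deny", "agent_allowlist", reason)]


def collect(ctx: Context, sources: List[RuleSource]) -> List[PermissionRule]:
    """Run every source against the same context and flatten the results."""
    return [rule for source in sources for rule in source(ctx)]


ctx = Context("data_cleaner", "deploy_serving", agent_tools=("drop_nulls",))
rules = collect(ctx, [agent_allowlist_source])
print(rules[0].decision, rules[0].source)  # deny agent_allowlist
```

Because a source only reads its context and returns a list, a test can call it directly with a hand-built Context and no engine setup.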
Every denial persists
A new SQLite table, permission_denials, records every ASK-turned-deny and every outright DENY:
CREATE TABLE permission_denials (
    id             INTEGER PRIMARY KEY,
    tool_call_id   TEXT,
    tool_name      TEXT NOT NULL,
    agent_name     TEXT,
    arguments_json TEXT,
    rule_source    TEXT NOT NULL,  -- 'rbac:admin', 'agent_allowlist', 'hitl:deploy', ...
    reason         TEXT,
    user_role      TEXT,
    http_method    TEXT,
    http_path      TEXT,
    timestamp      REAL NOT NULL
);
Queryable via:
# All denials in the last 24h (URLs are quoted so the shell does not expand "?")
curl "http://localhost:8000/api/v1/permissions/denials?since=86400"
# All denials attributed to a specific agent
curl "http://localhost:8000/api/v1/permissions/denials?agent=data_cleaner"
# Denials attributed to a YAML policy rule
curl "http://localhost:8000/api/v1/permissions/denials?rule_source=policy"
The BFSI-auditor question
The one that sold this design:
"What did agent X try to do that was blocked, and why?"
Before the unified engine (W7-1), answering required cross-referencing four systems: RBAC role guards, per-agent tool allowlists, ApprovalRequired exceptions, and ~22 feature flags. Fragile. Not defensible.
After: one SQL query, one table, one rule-source attribution per row. The shape regulators already understand.
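A sketch of that single query, run against an in-memory SQLite database with the permission_denials schema from this page. The sample rows, the trainer agent name, and the blocked_actions() helper are made up for illustration; the location of the real database file is not specified here.

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE permission_denials (
        id INTEGER PRIMARY KEY,
        tool_call_id TEXT,
        tool_name TEXT NOT NULL,
        agent_name TEXT,
        arguments_json TEXT,
        rule_source TEXT NOT NULL,
        reason TEXT,
        user_role TEXT,
        http_method TEXT,
        http_path TEXT,
        timestamp REAL NOT NULL
    )
""")

# Made-up sample rows, for illustration only.
now = time.time()
rows = [
    ("export_raw_data", "data_cleaner", "{}", "agent_allowlist", "tool not in allowlist", now - 60),
    ("deploy_serving", "data_cleaner", json.dumps({"env": "prod"}), "hitl:deploy", "operator rejected", now - 30),
    ("promote_challenger", "trainer", "{}", "rbac:admin", "role below admin", now - 10),
]
conn.executemany(
    "INSERT INTO permission_denials "
    "(tool_name, agent_name, arguments_json, rule_source, reason, timestamp) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def blocked_actions(conn, agent, since_seconds):
    """What did this agent try to do that was blocked, and why?"""
    cutoff = time.time() - since_seconds
    return conn.execute(
        "SELECT tool_name, rule_source, reason FROM permission_denials "
        "WHERE agent_name = ? AND timestamp >= ? ORDER BY timestamp",
        (agent, cutoff),
    ).fetchall()


report = blocked_actions(conn, "data_cleaner", 86400)
for tool, source, reason in report:
    print(f"{tool}: blocked by {source} ({reason})")
```

Each returned row carries the tool, the attributed rule source, and the reason, so the answer needs no join against other systems.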
The invariant
The audit_trail_security_events flag is an INVARIANT: it cannot be disabled at runtime. Denials are always persisted, regardless of any other observability configuration. This is the contract with BFSI, healthcare, and EU AI Act auditors.
HITL approval gates
Some tools are sensitive enough to warrant an explicit human approval — deploy_serving, promote_challenger, export_raw_data, etc. Mark them with require_approval(gate_type):
@tool(schema=...)
@require_approval("deploy")  # first call raises ApprovalRequired until an operator approves
def deploy_serving(model_id: str, env: str) -> str:
    ...
The first call raises ApprovalRequired. The agent runtime serializes the pending call to the ApprovalStore. An operator approves via REST or dashboard. On the next agent turn, the call re-runs and succeeds.
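The raise, approve, re-run cycle can be imitated with a toy decorator. This is a simplified stand-in, not the real implementation: the in-memory ApprovalStore, its method names, and the JSON call key are assumptions, and here the decorator records the pending call itself, whereas in the description above the agent runtime does that.

```python
import functools
import json


class ApprovalRequired(Exception):
    def __init__(self, gate_type, call_key):
        super().__init__(f"{gate_type} approval required")
        self.gate_type = gate_type
        self.call_key = call_key


class ApprovalStore:
    """In-memory stand-in: maps a serialized call to 'pending' or 'approved'."""

    def __init__(self):
        self._status = {}

    def status(self, call_key):
        return self._status.get(call_key)

    def add_pending(self, call_key):
        self._status.setdefault(call_key, "pending")

    def approve(self, call_key):
        self._status[call_key] = "approved"


STORE = ApprovalStore()


def require_approval(gate_type):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(**kwargs):
            # Serialize the call so the same call can be recognized on the next turn.
            call_key = json.dumps({"tool": fn.__name__, "args": kwargs}, sort_keys=True)
            if STORE.status(call_key) != "approved":
                STORE.add_pending(call_key)
                raise ApprovalRequired(gate_type, call_key)
            return fn(**kwargs)
        return wrapper
    return decorator


@require_approval("deploy")
def deploy_serving(model_id: str, env: str) -> str:
    return f"deployed {model_id} to {env}"


try:
    deploy_serving(model_id="m-1", env="staging")  # first call is gated
except ApprovalRequired as exc:
    pending_key = exc.call_key

STORE.approve(pending_key)  # the operator approves out of band
result = deploy_serving(model_id="m-1", env="staging")  # re-run succeeds
print(result)  # deployed m-1 to staging
```

Keying the approval on the tool name plus its arguments means approving one deployment does not silently approve a different one.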
Feature flag registry
The system has three tiers:
- Invariant — cannot be disabled (e.g. audit_trail_security_events)
- Flag — production-facing, stable (e.g. cron_scheduler, batch_runner)
- Experiment — opt-in, can change or disappear (e.g. plugin_shell_hooks_enabled, hooks_enabled, evaluator_grading)
Resolution order: runtime override → env var → alias → declared default. Check at runtime:
For the full list with descriptions, run swarm features list or open the /transparency page.
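The resolution order and the invariant tier can be sketched as follows. Everything here is hypothetical rather than the registry's actual API: the FlagSpec class, the is_enabled() helper, the SWARM_FEATURE_ environment prefix, the accepted truthy strings, the scheduler alias, and the default values. An alias is interpreted as a legacy name for the same flag, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class FlagSpec:
    name: str
    default: bool
    tier: str = "flag"             # "invariant", "flag", or "experiment"
    aliases: Tuple[str, ...] = ()  # assumed: legacy names for the same flag


def _env_key(name: str) -> str:
    return "SWARM_FEATURE_" + name.upper()  # hypothetical naming scheme


def _parse(raw: str) -> bool:
    return raw.strip().lower() in {"1", "true", "on", "yes"}


def is_enabled(spec: FlagSpec, overrides: Mapping[str, bool], env: Mapping[str, str]) -> bool:
    """Resolve: runtime override, then env var, then alias, then declared default."""
    if spec.tier == "invariant":
        return True  # invariants ignore every override
    if spec.name in overrides:
        return overrides[spec.name]
    if _env_key(spec.name) in env:
        return _parse(env[_env_key(spec.name)])
    for alias in spec.aliases:
        if _env_key(alias) in env:
            return _parse(env[_env_key(alias)])
    return spec.default


audit = FlagSpec("audit_trail_security_events", default=True, tier="invariant")
cron = FlagSpec("cron_scheduler", default=True, aliases=("scheduler",))
env = {"SWARM_FEATURE_SCHEDULER": "off"}  # only the alias is set

print(is_enabled(audit, {"audit_trail_security_events": False}, env))  # True
print(is_enabled(cron, {}, env))                                       # False
print(is_enabled(cron, {"cron_scheduler": True}, env))                 # True
```

Passing the environment in as a mapping, instead of reading os.environ inside the function, keeps the resolver easy to exercise in tests.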
Next
- Hooks & ops — lifecycle callbacks that can mutate / block tool calls
- Compliance profiles — how a profile auto-loads YAML policies
- Reading the audit PDF