Retention#
How long each artefact type lives, and how the retention daemon enforces it.
Artefact classes + default TTLs#
| Artefact | Location | Default TTL | Env var |
|---|---|---|---|
| Conversation JSONL | pipeline_runs/<id>/conversations/*.jsonl | 90 days | SWARM_RETENTION_CONVERSATION_JSONL_DAYS |
| Run events | run_events table | 365 days | SWARM_RETENTION_RUN_EVENTS_DAYS |
| Permission denials | permission_denials table | never pruned (invariant) | — |
| Shadow predictions | shadow_predictions table | 30 days | SWARM_RETENTION_SHADOW_PREDICTIONS_DAYS |
| Audit PDFs | pipeline_runs/<id>/audit/*.pdf | 2555 days (7 years) | SWARM_RETENTION_AUDIT_PDF_DAYS |
| Model artefacts | pipeline_runs/<id>/models/ | 365 days | SWARM_RETENTION_MODEL_DAYS |
| Batch results | batch_runs/<id>/results.jsonl | 90 days | SWARM_RETENTION_BATCH_RESULTS_DAYS |
| Cron output logs | ~/.swarm/cron/output/*.log | 30 days | SWARM_RETENTION_CRON_OUTPUT_DAYS |
| Metrics (Prometheus) | external | 30-90 days (Prometheus config) | — |
| Traces (OTel) | external | tracing backend's retention | — |
| Logs | external | log store's retention | — |
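As a rough illustration of how the defaults and env vars in the table fit together, the sketch below resolves a TTL by checking the env var first and falling back to the default. The function name `resolve_ttl_days` and the artefact keys are hypothetical (the keys mirror those in the sweep event further down); this is not swarm's actual implementation.

```python
import os
from typing import Mapping, Optional

# (env var, default TTL in days), copied from the table above.
# The dictionary keys are illustrative names, not a documented API.
DEFAULT_TTL_DAYS = {
    "conversation_jsonl": ("SWARM_RETENTION_CONVERSATION_JSONL_DAYS", 90),
    "run_events": ("SWARM_RETENTION_RUN_EVENTS_DAYS", 365),
    "shadow_predictions": ("SWARM_RETENTION_SHADOW_PREDICTIONS_DAYS", 30),
    "audit_pdfs": ("SWARM_RETENTION_AUDIT_PDF_DAYS", 2555),
    "models": ("SWARM_RETENTION_MODEL_DAYS", 365),
    "batch_results": ("SWARM_RETENTION_BATCH_RESULTS_DAYS", 90),
    "cron_output": ("SWARM_RETENTION_CRON_OUTPUT_DAYS", 30),
}


def resolve_ttl_days(artefact: str, env: Optional[Mapping[str, str]] = None) -> int:
    """Return the TTL in days for an artefact class: env override, else default."""
    env = os.environ if env is None else env
    var, default = DEFAULT_TTL_DAYS[artefact]
    raw = env.get(var)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)
```

Passing `env` explicitly keeps the lookup easy to test without touching the process environment.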
Invariants#
Two artefact classes are never pruned regardless of TTL config:
- permission_denials table: audit integrity requirement. BFSI / HIPAA / EU AI Act all require this.
- run_events entries of kind permission_decision or approval_granted: preserved even if row-level retention prunes other events
Turning retention off (e.g. via SWARM_RETENTION_ENABLED=false) does not change this: both classes are still preserved. The audit_trail_security_events feature flag is invariant tier (cannot be disabled at runtime).
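One way to picture the invariant is as a guard that every prune path consults before deleting anything. The sketch below is an assumption about how such a check could look; `is_prunable` and the constant names are hypothetical, and only the table and event-kind names come from the text above.

```python
from typing import Optional

# Names taken from the invariants listed above.
INVARIANT_TABLES = frozenset({"permission_denials"})
INVARIANT_EVENT_KINDS = frozenset({"permission_decision", "approval_granted"})


def is_prunable(table: str, kind: Optional[str] = None) -> bool:
    """Return False for rows that must never be pruned, whatever the TTL config says."""
    if table in INVARIANT_TABLES:
        return False
    if table == "run_events" and kind in INVARIANT_EVENT_KINDS:
        return False
    return True
```

Keeping the check independent of any TTL or enable/disable setting is what makes it an invariant rather than a default.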
Per-jurisdiction recommended overrides#
The retention daemon#
A background process (part of the API lifespan) sweeps every 24 hours.
- First sweep: 30 seconds after API startup (to make sure DB is up)
- Subsequent sweeps: every SWARM_RETENTION_DAEMON_INTERVAL_HOURS (default 24)
- Per-artefact: TTL compared to created_at (conversation JSONL) or file mtime (PDFs); pruned if older
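The schedule and the per-artefact age check described above can be sketched as follows. This is a minimal illustration, assuming that each subsequent sweep is measured from the first one and that "older" means strictly older than the TTL; the function names are hypothetical.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import List


def file_reference_time(path: str) -> datetime:
    """Use the file's mtime (as for PDFs above) as the age reference."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def is_expired(reference_time: datetime, ttl_days: int, now: datetime) -> bool:
    """An artefact is pruned only if it is strictly older than its TTL."""
    return now - reference_time > timedelta(days=ttl_days)


def next_sweep_times(startup: datetime, count: int,
                     interval_hours: int = 24,
                     initial_delay_s: int = 30) -> List[datetime]:
    """First sweep 30 s after startup, then one every interval_hours."""
    first = startup + timedelta(seconds=initial_delay_s)
    return [first + timedelta(hours=interval_hours * i) for i in range(count)]
```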
What gets pruned#
For files: the file, any accompanying .sig manifest, and the parent directory if it is left empty.
For DB rows: rows are soft-deleted first (deleted_at is set), then hard-deleted after a 7-day grace period. Soft-deleted rows remain queryable with ?include_deleted=1.
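The two-phase row deletion can be illustrated with a small SQLite sketch. It assumes a table with `created_at` and `deleted_at` columns holding ISO-8601 UTC strings; the function name `sweep_rows` and the schema are hypothetical, and the real daemon may work differently.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Tuple

GRACE_DAYS = 7  # grace period between soft- and hard-delete, per the text above


def sweep_rows(conn: sqlite3.Connection, table: str,
               ttl_days: int, now: datetime) -> Tuple[int, int]:
    """Soft-delete expired rows and hard-delete rows past the grace period.

    `table` must be a trusted identifier: it is interpolated into the SQL.
    Returns (soft_deleted_count, hard_deleted_count).
    """
    ttl_cutoff = (now - timedelta(days=ttl_days)).isoformat()
    grace_cutoff = (now - timedelta(days=GRACE_DAYS)).isoformat()

    # Hard-delete rows whose soft-delete is older than the grace period.
    hard = conn.execute(
        f"DELETE FROM {table} WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (grace_cutoff,),
    ).rowcount

    # Soft-delete live rows that are older than the TTL.
    soft = conn.execute(
        f"UPDATE {table} SET deleted_at = ? "
        "WHERE deleted_at IS NULL AND created_at < ?",
        (now.isoformat(), ttl_cutoff),
    ).rowcount

    conn.commit()
    return soft, hard
```

A row soft-deleted in one sweep is only removed by a sweep at least seven days later, which is what leaves a window for recovery.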
What gets emitted#
{
  "event": "retention.sweep.complete",
  "duration_ms": 2318,
  "pruned": {
    "conversation_jsonl": 142,
    "run_events": 8491,
    "shadow_predictions": 12032,
    "audit_pdfs": 0,
    "models": 3,
    "batch_results": 17,
    "cron_output": 245
  },
  "errors": []
}
Prometheus metric: swarm_retention_pruned_total{artefact_kind}.
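A log consumer might condense the sweep event into a few signals worth alerting on. The sketch below parses an event of the shape shown above; `summarize_sweep` and the keys of its return value are hypothetical, not part of swarm.

```python
import json

# Sample event copied from the example above.
SAMPLE_EVENT = """
{
  "event": "retention.sweep.complete",
  "duration_ms": 2318,
  "pruned": {
    "conversation_jsonl": 142, "run_events": 8491, "shadow_predictions": 12032,
    "audit_pdfs": 0, "models": 3, "batch_results": 17, "cron_output": 245
  },
  "errors": []
}
"""


def summarize_sweep(raw: str) -> dict:
    """Reduce a retention.sweep.complete event to a few alert-friendly fields."""
    event = json.loads(raw)
    if event.get("event") != "retention.sweep.complete":
        raise ValueError("not a retention sweep event")
    pruned = event.get("pruned", {})
    return {
        "total_pruned": sum(pruned.values()),
        "audit_pdfs_pruned": pruned.get("audit_pdfs", 0) > 0,
        "had_errors": bool(event.get("errors")),
    }
```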
Opting out (individual artefact)#
Sometimes you want to freeze a specific pipeline run — e.g., during a regulator investigation:
Creates a frozen flag on the run. Retention daemon skips frozen runs. Unfreeze:
Both the freeze and the unfreeze are recorded in run_events for audit.
Disabling retention entirely#
With SWARM_RETENTION_ENABLED=false, the retention daemon doesn't start. Files and rows grow unbounded. Not recommended in production.
Even with this set to false, permission_denials and audit_trail_security_events-kind run_events remain retained (invariant).
Archive to long-term storage#
For artefacts past their operational TTL but retained for compliance, swarm supports archival to cold storage.
Object storage tier transitions#
With storage.backend=s3:
# values.yaml
storage:
  lifecycle:
    - prefix: "audit/"
      transitions:
        - days: 90
          class: GLACIER_IR    # instant-retrieval Glacier
        - days: 365
          class: DEEP_ARCHIVE  # Glacier Deep Archive
      expiration_days: 2555    # 7 years
    - prefix: "conversations/"
      expiration_days: 365     # simple delete; no archival tier
Equivalent for GCS / Azure Blob.
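To see what the audit/ rule above implies for an object at a given age, the sketch below walks the transitions in order. It is an approximation for reasoning about the config only: the initial class name "STANDARD" is an assumption, `storage_state` is a hypothetical helper, and the object store applies the real rules on its own schedule.

```python
from typing import Optional

# The "audit/" rule from the values.yaml example above, as a plain dict.
AUDIT_RULE = {
    "prefix": "audit/",
    "transitions": [
        {"days": 90, "class": "GLACIER_IR"},
        {"days": 365, "class": "DEEP_ARCHIVE"},
    ],
    "expiration_days": 2555,
}


def storage_state(age_days: int, rule: dict,
                  initial_class: str = "STANDARD") -> Optional[str]:
    """Return the storage class at a given object age, or None once expired."""
    if age_days >= rule["expiration_days"]:
        return None
    current = initial_class
    # Apply every transition whose threshold has been reached, oldest first.
    for transition in sorted(rule.get("transitions", []), key=lambda t: t["days"]):
        if age_days >= transition["days"]:
            current = transition["class"]
    return current
```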
Retrieval from cold storage#
# Request restore (may take hours for Deep Archive)
swarm audit restore --run-id 7f8e9a2b --priority standard
# → restore job queued. Check status:
swarm audit restore-status 7f8e9a2b
Backup-driven retention#
If you rely on DB backups for long-term retention (instead of in-DB retention), configure:
postgres:
  backup:
    enabled: true
    schedule: "0 2 * * *"  # 02:00 nightly
    retention:
      daily: 7
      weekly: 4
      monthly: 12
      yearly: 10           # 10 years of yearly backups
    destination: s3://acme-swarm-backups/postgres/
And set shorter in-DB retention:
Trade-off: faster DB, slower regulator-inspection workflow (must restore from backup).
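The daily/weekly/monthly/yearly counts above are commonly read as a grandfather-father-son scheme: keep the newest backup in each of the last N days, ISO weeks, months, and years. The sketch below implements that reading so the effect of the numbers can be checked; the backup tool's actual semantics may differ, and `backups_to_keep` is a hypothetical helper.

```python
from datetime import date, timedelta
from typing import Callable, Hashable, Iterable, List


def backups_to_keep(dates: Iterable[date], daily: int = 7, weekly: int = 4,
                    monthly: int = 12, yearly: int = 10) -> List[date]:
    """Return the backup dates retained under a GFS-style policy."""
    ordered = sorted(set(dates), reverse=True)  # newest first
    keep = set()

    def newest_per_bucket(key: Callable[[date], Hashable], limit: int) -> None:
        seen = set()
        for d in ordered:
            bucket = key(d)
            if bucket in seen:
                continue  # a newer backup already represents this bucket
            if len(seen) == limit:
                break  # enough buckets covered
            seen.add(bucket)
            keep.add(d)

    newest_per_bucket(lambda d: d, daily)
    newest_per_bucket(lambda d: d.isocalendar()[:2], weekly)  # (ISO year, ISO week)
    newest_per_bucket(lambda d: (d.year, d.month), monthly)
    newest_per_bucket(lambda d: d.year, yearly)
    return sorted(keep)
```

Under this reading a single backup can satisfy several tiers at once, so the number of retained backups is usually smaller than the sum of the four counts.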
GDPR right-to-be-forgotten#
Data subjects can request deletion. swarm's retention cannot override this — compliance wins.
This:
- Searches all run_events + conversation journals + artefact metadata for the principal
- Pseudonymizes records that must be preserved (e.g., audit trail)
- Fully deletes records that can be safely removed (conversation JSONL)
- Emits a compliance certificate to give to the requester
- Logs the deletion in run_events (which is itself preserved; the apparent paradox is handled by pseudonymization)
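Pseudonymization of records that must be preserved is often done with a keyed hash, so the same principal always maps to the same opaque token while the original identifier cannot be recovered without the key. The sketch below uses HMAC-SHA256 from the Python standard library; the function names, the `anon-` prefix, and the field names `principal` and `actor` are hypothetical, and this is not necessarily how swarm does it.

```python
import hashlib
import hmac
from typing import Iterable


def pseudonymize_principal(principal_id: str, secret_key: bytes) -> str:
    """Map a principal ID to a stable, non-reversible token."""
    digest = hmac.new(secret_key, principal_id.encode("utf-8"), hashlib.sha256)
    return "anon-" + digest.hexdigest()[:16]


def scrub_record(record: dict, principal_id: str, secret_key: bytes,
                 fields: Iterable[str] = ("principal", "actor")) -> dict:
    """Return a copy of the record with the principal replaced in the given fields."""
    token = pseudonymize_principal(principal_id, secret_key)
    field_set = set(fields)
    return {
        key: (token if key in field_set and value == principal_id else value)
        for key, value in record.items()
    }
```

Because the token is deterministic, preserved audit rows for the same subject stay linkable to each other without exposing who the subject was.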
Monitoring retention#
Alerts to consider:
- swarm_retention_pruned_total{artefact_kind="audit_pdfs"} > 0: something unexpectedly pruned audit PDFs
- Retention sweep failed: triggers on an error in the retention daemon
- Storage approaching full: cloud-provider-level alert
Next#
- Backup & restore — companion to retention for long-term preservation
- Compliance profiles — per-profile retention defaults
- Configuration — every SWARM_RETENTION_* env var