
Reading the audit PDF#

Section-by-section walkthrough. Every pipeline run with a compliance profile produces pipeline_runs/<run_id>/audit/audit_report.pdf. This page tells you what each section means and how to verify integrity.

Structure at a glance#

| Section | Pages | Who reads this? |
|---|---|---|
| 1. Cover | 1 | Everyone |
| 2. Executive summary | 1 | CRO, Model Risk Committee |
| 3. Model card | 2-3 | ML team, compliance |
| 4. Data governance | 1-2 | Data governance, compliance |
| 5. Fairness audit | 1-2 | Compliance, model risk |
| 6. Explainability | 2-4 | Model risk, product |
| 7. Drift baseline | 1 | MLOps, compliance |
| 8. Agent trail | 2-4 | Auditors |
| 9. Permission denials | 1-2 | Security, auditors |
| 10. Tamper-evident manifest | 1 | Auditors |

Total: 12-24 pages depending on model complexity and profile.

Section 1 — Cover#

Contains:

  • Run ID (UUID, human-readable short form)
  • Pipeline template + compliance profile used
  • Model name + version
  • Pipeline start + end timestamps (ISO-8601 with timezone)
  • Compliance profile + version
  • swarm version
  • SHA-256 of the PDF itself (pinned in the footer)
  • QR code linking to audit verification URL
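The PDF hash pinned in the footer is a plain SHA-256 digest of the document bytes. A minimal sketch of computing it (the function name is illustrative; the platform's internal helper may differ):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 so large PDFs never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same digest appears again in the Section 10 manifest, so the footer value can be cross-checked against the signed manifest.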

Section 2 — Executive summary#

One non-technical paragraph, readable by a CRO. Example:

This audit report covers the training and initial deployment of model fraud_classifier_v2 by the swarm platform on 2026-04-15. The model was trained on 50,000 synthetic transaction records using a logistic regression algorithm with elastic-net regularization, achieving 96.7% accuracy and 0.94 ROC-AUC on the held-out test set. Fairness metrics across the protected attribute cardholder_region showed a demographic-parity delta of 0.021, well within the 0.05 threshold. SHAP explainability analysis was generated. The model is recommended for shadow-traffic deployment pending CRO sign-off.


Section 3 — Model card#

From reports/model_card.md. Full sections:

  • Intended use — primary use case + foreseeable misuse
  • Model details — algorithm, framework, version
  • Training data — source, size, statistics, date, hash
  • Performance metrics — accuracy, precision/recall, F1, AUC, per-class
  • Evaluation data — how the held-out set was constructed; representativeness
  • Ethical considerations — protected attributes considered; known limitations
  • Caveats + recommendations — what this model should NOT be used for

For RBI / EU AI Act profiles: additional sections for Sutra 7 / Article 13 requirements.

Section 4 — Data governance#

  • Data source lineage — dataset_path, SHA-256 of the file at ingest, upstream source if applicable
  • Columns — name, type, cardinality, missing-ness
  • Protected attributes — flagged with legal basis for use
  • Data quality score — from data_validator agent
  • Transformations applied — every step from raw data to features, in order
  • PII/PHI handling — what was masked, hashed, or excluded (HIPAA profile)
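The column table (name, type, cardinality, missing-ness) is straightforward to derive from the ingested rows. A sketch of the cardinality and missing-ness computation, assuming rows arrive as dicts (the function name is illustrative, not the data_validator agent's actual API):

```python
def column_summary(rows, columns):
    """Per-column cardinality and missing fraction, as recorded in Section 4."""
    summary = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        non_null = [v for v in values if v is not None]
        summary[col] = {
            "cardinality": len(set(non_null)),
            "missing_frac": round(1 - len(non_null) / len(values), 4) if values else 0.0,
        }
    return summary
```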

Section 5 — Fairness audit#

From reports/fairness_audit.json. Rendered as:

  • Protected attribute table — per-group accuracy, precision, recall, F1
  • Metric deltas — demographic parity, equalized odds, equal opportunity; flagged green/yellow/red vs thresholds
  • Mitigation actions taken — if any (e.g., reweighting, threshold adjustment per group)
  • Residual disparity justification — if any metric is out of threshold, why it was accepted (written by the approver)

For RBI profile (Sutra 4): flags requiring Model Risk Committee sign-off.
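The demographic-parity delta in the executive summary (0.021 vs a 0.05 threshold) is the max-minus-min positive-prediction rate across protected groups. A sketch, with illustrative warn/fail thresholds (the real values come from the compliance profile):

```python
def demographic_parity_delta(positive_rates: dict) -> float:
    """Max minus min positive-prediction rate across protected groups."""
    rates = list(positive_rates.values())
    return max(rates) - min(rates)

def traffic_light(delta: float, warn: float = 0.04, fail: float = 0.05) -> str:
    """Green/yellow/red flag as rendered in the metric-deltas table."""
    if delta >= fail:
        return "red"
    if delta >= warn:
        return "yellow"
    return "green"
```

For example, group rates of 0.121 and 0.100 give a delta of 0.021, which flags green against the 0.05 threshold.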

Section 6 — Explainability#

From reports/shap_explanation.json. Includes:

  • Global feature importance chart — top-20 features by mean absolute SHAP value
  • Dependence plots — for the top-5 features
  • Per-class importance — if multi-class classification
  • Surrogate model — if the primary model is black-box (gradient boosting), a trained decision tree surrogate approximates it for regulator comprehension
  • Local explanation example — one prediction walked through (anonymized)
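The global importance chart ranks features by mean absolute SHAP value across the evaluation rows. A dependency-free sketch of that aggregation, assuming SHAP values arrive as a rows-by-features matrix (as in reports/shap_explanation.json; the exact JSON layout is not specified here):

```python
def global_importance(shap_values, feature_names, top_k=20):
    """Rank features by mean |SHAP| across rows, as in Section 6's global chart."""
    n = len(shap_values)
    means = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in shap_values) / n
        means.append((name, mean_abs))
    return sorted(means, key=lambda t: t[1], reverse=True)[:top_k]
```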

Section 7 — Drift baseline#

Pinned feature distributions for future monitoring:

  • Baseline window — date range used
  • Per-feature statistics — mean, std, quantiles, unique-count if categorical
  • Reference hash — baseline_run_id for downstream drift checks to compare against
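The per-feature statistics pinned for numeric features can be sketched with the standard library (quartile method and field names are illustrative, not the platform's exact schema):

```python
import statistics

def baseline_stats(values):
    """Pinned numeric-feature statistics for the Section 7 drift baseline."""
    ordered = sorted(values)
    q25, q50, q75 = statistics.quantiles(ordered, n=4)  # quartiles
    return {
        "mean": statistics.fmean(ordered),
        "std": statistics.pstdev(ordered),
        "q25": q25, "q50": q50, "q75": q75,
        "count": len(ordered),
    }
```

A future drift check recomputes the same statistics on fresh data and compares them against this pinned baseline.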

Section 8 — Agent trail#

Compressed narrative of the pipeline run:

  • Agents invoked — order + role
  • Major decisions — what each agent decided + why
  • Tool calls per agent — count + categorical breakdown
  • Approval gates — who approved, when, with comment
  • LLM call summary — provider, token counts, cost

Not the full conversation (that's in the evidence bundle conversations/*.jsonl). The agent trail is the readable summary.
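The per-agent tool-call counts are an aggregation over run_events.jsonl. A sketch, assuming each line is a JSON object with type and agent fields (the real event schema may differ):

```python
import json
from collections import Counter

def tool_calls_per_agent(jsonl_lines):
    """Count tool calls per agent from run_events.jsonl-style records."""
    counts = Counter()
    for line in jsonl_lines:
        event = json.loads(line)
        if event.get("type") == "tool_call":
            counts[event["agent"]] += 1
    return dict(counts)
```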

Section 9 — Permission denials#

Every DENY decision during the run. Columns:

| Tool | Agent | Rule source | Reason | Timestamp |
|---|---|---|---|---|
| export_raw_data | data_cleaner | policy:hipaa | HIPAA: raw PHI export requires de-id workflow | 2026-04-15 14:22:03 |
| deploy_serving | | rbac:operator | Operator role cannot deploy to prod | 2026-04-15 14:25:17 |

Usually short. A clean run has 0-2 entries.
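The table is a filter over the run's permission decisions. A sketch, with assumed field names (type, decision, tool, etc. — the real event schema may differ):

```python
import json

def denial_rows(jsonl_lines):
    """Extract DENY decisions for the Section 9 table."""
    rows = []
    for line in jsonl_lines:
        e = json.loads(line)
        if e.get("type") == "permission_decision" and e.get("decision") == "DENY":
            rows.append((e["tool"], e.get("agent", ""), e["rule_source"], e["reason"], e["ts"]))
    return rows
```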

Section 10 — Tamper-evident manifest#

The most important page for auditors:

  • PDF SHA-256 — hash of this PDF document
  • Manifest SHA-256 — hash of the bundled audit_report.sig file
  • Artefact manifest — per-file SHA-256:
  • model.joblib
  • model_card.md
  • fairness_audit.json
  • shap_explanation.json
  • drift_baseline.json
  • run_events.jsonl
  • conversations/*.jsonl
  • swarm version + commit SHA — the code that produced this
  • Certificate chain — platform signing key → fingerprint

Auditor verification:

swarm audit verify audit_report.pdf

Output:

Verifying audit_report.pdf...
  Manifest found: audit_report.sig
  PDF SHA-256:   ok (3f8a2e...)
  model.joblib:  ok (a12b...)
  model_card.md: ok (c45d...)
  fairness_audit.json: ok (e67f...)
  shap_explanation.json: ok (a89b...)
  run_events.jsonl: ok (b12c...)
  conversations: 6 files, all ok

Verdict: UNTAMPERED.
Audited at 2026-04-15T14:35:02Z by run_id=7f8e9a2b, swarm=v0.11.0.

If any file has been modified since the audit, verification fails and reports the specific mismatch. That's what regulators look for.
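The per-file check that swarm audit verify performs can be sketched in a few lines (signature verification of the manifest itself is omitted; function names are illustrative):

```python
import hashlib
import os

def build_manifest(paths):
    """Per-file SHA-256 map, as recorded in the artefact manifest."""
    manifest = {}
    for p in paths:
        with open(p, "rb") as fh:
            manifest[os.path.basename(p)] = hashlib.sha256(fh.read()).hexdigest()
    return manifest

def verify_manifest(manifest, paths):
    """Return the names whose current hash no longer matches the manifest."""
    current = build_manifest(paths)
    return sorted(name for name, h in manifest.items() if current.get(name) != h)
```

An empty list from verify_manifest corresponds to the UNTAMPERED verdict; any returned name is the specific mismatch the CLI reports.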

Profile-specific sections#

RBI FREE-AI#

Adds Section 3.5 — Sutra compliance matrix — a 7-row table mapping each Sutra to the evidence artefact for this pipeline run.

HIPAA#

Adds Section 4.5 — De-identification audit — § 164.514 Safe Harbor or Expert Determination evidence.

EU AI Act#

Replaces Section 3 with the full Annex IV technical documentation (Article 11):

  1. General description
  2. Detailed description
  3. Monitoring + testing results
  4. Risk management system
  5. Post-market monitoring plan

Customizing the template#

The audit PDF template is a YAML file + a ReportLab renderer. To customize for internal governance:

cp ml_team/tools/audit_templates/rbi_free_ai.yaml \
   ml_team/tools/audit_templates/acme_bank_internal.yaml
# Edit — add branding, internal sections, different section ordering

swarm audit generate \
  --run-id 7f8e9a2b \
  --template acme_bank_internal \
  --output acme_internal_audit.pdf

Templates use a simple DSL — sections, fields, Markdown + Jinja2. See ml_team/tools/audit_pdf.py for the schema.
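A hypothetical template fragment, to give a feel for the DSL (field names here are illustrative — consult ml_team/tools/audit_pdf.py for the authoritative schema):

```yaml
title: "ACME Bank internal model audit"
branding:
  logo: assets/acme_logo.png
sections:
  - id: cover
  - id: executive_summary
    template: |            # Markdown + Jinja2 body
      This audit covers **{{ model.name }}** v{{ model.version }},
      trained on {{ run.started_at }}.
  - id: fairness_audit
    source: reports/fairness_audit.json
```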

When something's off#

If the auditor flags an issue:

  1. Tampering suspected — swarm audit verify will flag it. Preserve the file and contact us at security@theaisingularity.org
  2. Missing artefact — pipeline didn't produce a required output. Re-run with the correct profile. See How-to: Generate an audit PDF
  3. Content question — "why did algorithm_selector pick logistic regression?" is answered in the agent trail (Section 8) pointing to the reasoning in conversations/algorithm_selector.jsonl

Evidence bundle vs PDF#

The PDF is the narrative. The evidence bundle (swarm audit bundle --run-id ...) is the raw artefacts:

evidence_bundle.tar.gz
├── audit_report.pdf        # the narrative
├── audit_report.sig        # the manifest
├── MANIFEST.txt            # human-readable summary
├── README.md               # how to use this bundle
├── model.joblib
├── reports/
│   ├── model_card.md
│   ├── fairness_audit.json
│   └── shap_explanation.json
├── run_events.jsonl
└── conversations/
    └── (all agent journals)

Regulators who want everything get the bundle. Those who want a readable summary get the PDF.

Retention#

Audit PDFs are retained according to jurisdiction defaults:

  • RBI: 7 years
  • HIPAA: 6 years (per § 164.316)
  • EU AI Act: 10 years (Art. 12)
  • SOC 2: 1 year minimum; 3 years recommended

Override via SWARM_RETENTION_AUDIT_PDF_DAYS.
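The override semantics can be sketched as follows (profile keys and the function name are illustrative; only the environment variable name comes from the platform):

```python
import os

def audit_pdf_retention_days(profile: str) -> int:
    """Jurisdiction defaults in days; the env override, if set, wins."""
    defaults = {
        "rbi": 7 * 365,
        "hipaa": 6 * 365,
        "eu_ai_act": 10 * 365,
        "soc2": 3 * 365,  # recommended value; 1 year is the minimum
    }
    override = os.environ.get("SWARM_RETENTION_AUDIT_PDF_DAYS")
    return int(override) if override is not None else defaults[profile]
```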

Next#