Tutorial 1 — End-to-end fraud classifier (BFSI)¶
A realistic BFSI scenario from scratch. We'll generate a synthetic credit-card transactions dataset, run it through the default_ml_pipeline with the rbi_free_ai compliance profile, shadow-test against a champion, promote, and inspect the regulator-format audit PDF.
Time: ~30 minutes interactive (30 seconds if you just read the rendered output).
Assumes: swarm running at http://localhost:8000, ANTHROPIC_API_KEY or OPENAI_API_KEY set. See Quickstart to get there.
0. Setup¶
We'll use the swarm CLI + REST API from inside Python. The swarm CLI is installed via pip install -e ml_team/.
import os
import time
from pathlib import Path
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
SWARM_API = os.environ.get('SWARM_API', 'http://localhost:8000')
SWARM_TOKEN = os.environ['SWARM_TOKEN'] # see: swarm login
print(f'Using swarm at {SWARM_API}')
1. Generate a realistic-looking fraud dataset¶
Rather than hand-waving with sklearn defaults, we'll build something that looks like real credit-card data — skewed amounts, time-of-day patterns, merchant category fields, and a ~2% fraud prior (realistic BFSI rate).
This is synthetic and will not leave your machine.
np.random.seed(42)
n_samples = 50_000
# Base numerical features via sklearn — gives us signal without leaking label perfectly
X, y = make_classification(
    n_samples=n_samples,
    n_features=15,
    n_informative=8,
    n_redundant=3,
    weights=[0.98, 0.02],  # ~2% fraud prior
    class_sep=0.8,
    flip_y=0.01,  # a little label noise
    random_state=42,
)
# Dress up the DataFrame to look like real BFSI data
df = pd.DataFrame(X, columns=[
    'amount_zscore', 'velocity_zscore', 'merchant_risk',
    'hour_of_day', 'day_of_week', 'card_age_months',
    'prev_7d_txn_count', 'prev_7d_amount_sum',
    'foreign_txn_flag', 'cnp_flag',  # card-not-present
    'geo_risk_score', 'device_risk_score',
    'recipient_new_flag', 'pin_tries', 'chargeback_prev_90d',
])
df['is_fraud'] = y
# Add a protected attribute to demonstrate the fairness audit (synthetic!)
df['cardholder_region'] = np.random.choice(
    ['north', 'south', 'east', 'west', 'metro'],
    size=n_samples,
    p=[0.22, 0.22, 0.15, 0.15, 0.26],
)
print(f'Dataset: {len(df):,} rows, fraud rate: {df.is_fraud.mean():.2%}')
df.head()
# Save to the swarm data directory
data_path = Path('ml_team/data/fraud_synthetic.csv')
data_path.parent.mkdir(exist_ok=True, parents=True)
df.to_csv(data_path, index=False)
print(f'Saved to {data_path}')
2. Kick off the pipeline with RBI FREE-AI profile¶
We request rbi_free_ai at pipeline start so the fairness auditor + SHAP explainer + audit PDF template activate.
import httpx
resp = httpx.post(
    f'{SWARM_API}/api/v1/pipelines',
    headers={'Authorization': f'Bearer {SWARM_TOKEN}'},
    json={
        'problem_statement': (
            'Classify credit-card transactions as fraud (1) or legitimate (0). '
            'Target column is is_fraud. Protected attribute is cardholder_region. '
            'Optimise for PR-AUC; we care about catching fraud without frustrating legit users.'
        ),
        'dataset_path': 'fraud_synthetic.csv',
        'template': 'default_ml_pipeline',
        'compliance_profile': 'rbi_free_ai',
        'name': 'fraud-v1-tutorial',
    },
    timeout=30,
)
resp.raise_for_status()
run_id = resp.json()['run_id']
print(f'Pipeline started: run_id={run_id}')
3. Watch it run¶
Poll for status. Expect ~15-30 minutes on the default_ml_pipeline — it does hyperparameter tuning, cross-validation, the fairness audit, SHAP, and model-card generation.
while True:
    status = httpx.get(
        f'{SWARM_API}/api/v1/pipelines/{run_id}',
        headers={'Authorization': f'Bearer {SWARM_TOKEN}'},
    ).json()
    phase = status.get('phase', '?')
    agent = status.get('active_agent', '?')
    print(f'{time.strftime("%H:%M:%S")} phase={phase} agent={agent}')
    if status.get('state') in {'completed', 'failed', 'cancelled'}:
        break
    time.sleep(30)
print(f'\nFinal state: {status["state"]}')
4. Inspect the outputs¶
The run_dir contains model artefacts, per-agent JSONL journals, compliance reports, and — because we enabled rbi_free_ai — an audit PDF.
run_dir = Path(f'pipeline_runs/{run_id}')
for p in sorted(run_dir.rglob('*')):
    if p.is_file():
        rel = p.relative_to(run_dir)
        size_kb = p.stat().st_size / 1024
        print(f'{size_kb:>8.1f} KB {rel}')
# Read the model card
print((run_dir / 'reports' / 'model_card.md').read_text()[:3000])
# Check the fairness audit
import json
fairness = json.loads((run_dir / 'reports' / 'fairness_audit.json').read_text())
print(json.dumps(fairness, indent=2)[:2000])
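If you want a quick pass/fail signal from the audit rather than reading raw JSON, the demographic parity ratio is easy to compute yourself. The per-group selection rates below are illustrative values, and the exact key layout of fairness_audit.json may differ — adapt the dictionary construction to the real schema you printed above:

```python
def parity_ratio(selection_rates):
    """Demographic parity ratio: lowest group selection rate divided by the
    highest. Near 1.0 means similar flag rates across groups; a common rule
    of thumb treats anything below 0.8 as worth investigating."""
    rates = list(selection_rates.values())
    return min(rates) / max(rates)

# Hypothetical per-region selection rates — pull the real ones out of the
# fairness_audit.json structure you inspected above:
rates = {'north': 0.021, 'south': 0.019, 'east': 0.023,
         'west': 0.020, 'metro': 0.022}
ratio = parity_ratio(rates)
print(f'demographic parity ratio: {ratio:.2f}')
```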
5. Inspect the audit PDF¶
Regulator-format PDF in audit/audit_report.pdf.
# Open in the system PDF viewer ('open' is macOS; Linux desktops use 'xdg-open')
import subprocess
import sys
opener = 'open' if sys.platform == 'darwin' else 'xdg-open'
subprocess.run([opener, str(run_dir / 'audit' / 'audit_report.pdf')])
See Reading the audit PDF for a section-by-section walkthrough.
6. Deploy — shadow traffic + champion-challenger¶
Package the model into a container, register as a challenger (there's no champion yet; first deploy becomes champion automatically):
package_resp = httpx.post(
    f'{SWARM_API}/api/v1/deployments/package',
    headers={'Authorization': f'Bearer {SWARM_TOKEN}'},
    json={'run_id': run_id, 'name': 'fraud_classifier', 'version': 'v1'},
    timeout=180,  # Docker builds take a bit
)
package_resp.raise_for_status()
print(package_resp.json())
7. Iterate — train a challenger and compare¶
Train a second model with a different template (e.g. fast_prototype with CatBoost instead of XGBoost). Package as v2. Shadow against v1 for 24 hours. Promote if it wins.
# Pipeline v2
swarm pipelines run \
    --problem "...(same problem statement)..." \
    --dataset fraud_synthetic.csv \
    --template fast_prototype \
    --compliance rbi_free_ai \
    --name fraud-v2

# Package v2
swarm deployments package --run-id <v2_run_id> --name fraud_classifier --version v2

# Shadow v2 against champion v1
swarm deployments shadow-start \
    --model fraud_classifier --challenger v2 --champion v1 \
    --sample-rate 0.1 --duration 24h

# After 24h: compare
swarm deployments compare --model fraud_classifier --champion v1 --challenger v2

# If challenger wins, promote (HITL gate)
swarm deployments promote --model fraud_classifier --challenger v2
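The compare step runs server-side, but if you export the shadow logs and want to sanity-check the statistics yourself, a bootstrap confidence interval on the PR-AUC delta is a reasonable way to do it. This sketch is illustrative, not swarm's actual comparison logic, and it assumes you have ground-truth labels plus both models' scores for the same shadowed transactions:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def bootstrap_pr_auc_delta(y_true, p_champion, p_challenger,
                           n_boot=1000, seed=0):
    """95% bootstrap CI for PR-AUC(challenger) - PR-AUC(champion) on the
    same shadow traffic. A lower bound above 0 means the challenger is
    reliably better on this sample."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    p_champion = np.asarray(p_champion)
    p_challenger = np.asarray(p_challenger)
    n = len(y_true)
    deltas = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if y_true[idx].sum() == 0:  # skip resamples with no fraud cases
            continue
        deltas.append(average_precision_score(y_true[idx], p_challenger[idx])
                      - average_precision_score(y_true[idx], p_champion[idx]))
    lo, hi = np.percentile(deltas, [2.5, 97.5])
    return lo, hi
```

With a ~2% fraud rate, resampling whole transactions (not just frauds) keeps the class imbalance realistic in every bootstrap replicate.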
8. Summary — what you just got¶
- A trained fraud classifier — joblib model in pipeline_runs/<id>/models/
- A model card — human-readable, fit for technical handoff
- A fairness audit — demographic parity + equalized odds on cardholder_region
- SHAP explanations — global + per-prediction feature importances
- A drift baseline — pinned for future monitoring
- A regulator-format audit PDF — tamper-evident, ready for RBI / internal audit
- Per-agent conversation logs — full trail of who did what
- A deployed container — ready for shadow traffic
All in one pipeline run. The alternative (by-hand):
- Week 1: data profiling + algorithm selection spreadsheet
- Week 2: training loop + hyperparam search + evaluation
- Week 3: fairness audit + SHAP + document writing
- Week 4: compliance review with CRO's team
- Week 5: deploy + shadow setup
That's the pitch.
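A word on the drift baseline, since the nightly check will eventually lean on it: the standard BFSI drift metric is the Population Stability Index (PSI), which compares the current distribution of a feature against the pinned baseline, bin by bin. This sketch shows the idea — it is illustrative, not swarm's implementation:

```python
import numpy as np

def psi(baseline, current, n_bins=10):
    """Population Stability Index between two samples of one feature.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range current values
    b = np.histogram(baseline, bins=edges)[0] / len(baseline)
    c = np.histogram(current, bins=edges)[0] / len(current)
    b, c = np.clip(b, 1e-6, None), np.clip(c, 1e-6, None)  # avoid log(0)
    return float(np.sum((c - b) * np.log(c / b)))
```

Binning by baseline quantiles (rather than fixed-width bins) keeps each bin equally populated at baseline, which makes the index comparable across features with very different scales.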
Next¶
- Drift investigation tutorial — what to do when this model's nightly drift check fires in 3 months
- Plugin authoring tutorial — extend swarm with your own surfaces
- Compliance: RBI FREE-AI — the regulatory context
