
Schedule a drift check#

Goal: create a cron job that runs the drift_check task nightly against your deployed model, alerts when drift exceeds a threshold, and rolls up results into the audit PDF. Time: 10 minutes.

Why drift checks matter#

Feature distributions shift over time: customer behaviour changes, upstream data pipelines break, and concept drift sets in (the relationship between features and outcomes itself changes). A model trained on 2025 data may silently degrade on April 2026 traffic.

For BFSI deployments, RBI's FREE-AI Sutra 3 (Safety & robustness) mandates drift monitoring.

1. Establish a baseline#

A drift check compares against a baseline — typically the training-time feature distribution. Set the baseline on an existing run:

swarm pipelines run \
  --problem "Establish drift baseline for fraud_v2" \
  --dataset fraud_train_q1_2026.csv \
  --template fast_prototype \
  --name "drift-baseline-q1-2026"

Pin the baseline:

swarm deployments set-baseline \
  --model fraud \
  --version v2 \
  --run-id <that_run_id>

Baseline metadata:

baseline_run_id: 3f2e1a8b
baseline_date: 2026-04-15
baseline_features:
  - amount: mean=450.21 std=1280.3 min=0.5 max=50000
  - merchant_category: 12 unique values
  - time_of_day: uniform-ish
  ...
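The numeric summaries in the baseline metadata can be reproduced offline before pinning. A minimal sketch with pandas, using a few hypothetical sample rows in place of fraud_train_q1_2026.csv:

```python
import pandas as pd

# Stand-in for fraud_train_q1_2026.csv (hypothetical sample rows)
df = pd.DataFrame({
    "amount": [12.5, 450.0, 99.9, 5000.0],
    "merchant_category": ["grocery", "travel", "grocery", "electronics"],
})

# Per-feature summary in the same shape as the baseline metadata
baseline = {
    "amount": {
        "mean": df["amount"].mean(),
        "std": df["amount"].std(),
        "min": df["amount"].min(),
        "max": df["amount"].max(),
    },
    "merchant_category_unique": df["merchant_category"].nunique(),
}
print(baseline)
```

Sanity-checking your own numbers against what set-baseline records is a quick way to confirm the right run was pinned.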

2. Create the drift-check cron job#

swarm cron create \
  --name "fraud_drift_nightly" \
  --schedule "0 3 * * *" \
  --task drift_check \
  --config '{
    "model": "fraud",
    "version": "v2",
    "window_hours": 24,
    "alert_threshold": 0.15,
    "alert_email": "ml-team@yourorg.com"
  }'

Breakdown:

  - schedule: "0 3 * * *" — run at 03:00 every day
  - window_hours: 24 — use the last 24 hours of prediction logs
  - alert_threshold: 0.15 — alert if any feature's KS statistic vs the baseline exceeds 0.15
  - alert_email — where to send the alert
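The alert logic amounts to a two-sample Kolmogorov-Smirnov test per feature. A sketch of that comparison with scipy on synthetic data (illustrative only, not the tool's internal implementation):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.lognormal(5.0, 1.0, 5000)  # training-time amounts
observed = rng.lognormal(5.6, 1.0, 5000)  # last-24h amounts, shifted upward

ALERT_THRESHOLD = 0.15
stat, p_value = ks_2samp(baseline, observed)
alert = stat > ALERT_THRESHOLD
print(f"KS={stat:.3f} p={p_value:.2g} alert={alert}")
```

The KS statistic is the maximum gap between the two empirical CDFs, so it ranges from 0 (identical) to 1 (disjoint); 0.15 is a middle-of-the-road default you should tune per feature.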

3. Inspect scheduled jobs#

swarm cron list
JOB_ID        NAME                  SCHEDULE      LAST_RUN            STATUS     NEXT_RUN
cj_abc123     fraud_drift_nightly   0 3 * * *     (never)             pending    2026-04-16 03:00 IST

4. Run now (don't wait for the schedule)#

swarm cron run cj_abc123

Completes in 30-90 seconds. Inspect the output:

swarm cron runs cj_abc123 --limit 1
RUN_ID       STATUS     STARTED              DURATION    ALERTS
cr_xyz789    succeeded  2026-04-15 14:30     47s         1 (amount feature)

Pull the detailed report:

cat ~/.swarm/cron/output/cj_abc123_cr_xyz789.log
{
  "model": "fraud", "version": "v2",
  "baseline_run": "3f2e1a8b",
  "window": { "from": "2026-04-14T03:00Z", "to": "2026-04-15T03:00Z" },
  "features": {
    "amount": {
      "ks_statistic": 0.21,   "p_value": 0.001,
      "baseline": { "mean": 450.21, "std": 1280.3 },
      "observed": { "mean": 612.55, "std": 1455.7 },
      "alert": true,
      "interpretation": "Mean transaction amount has drifted up ~36%. Investigate upstream pipeline."
    },
    "merchant_category": { "ks_statistic": 0.04, "p_value": 0.48, "alert": false },
    ...
  },
  "alerts": ["amount"],
  "overall_verdict": "investigate"
}
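The report is plain JSON, so it is easy to post-process in your own tooling. A minimal sketch that extracts the alerting features (the structure is taken from the example above, abbreviated):

```python
import json

# Stand-in for the contents of the cron run's .log file
report = json.loads("""
{
  "model": "fraud", "version": "v2",
  "features": {
    "amount": {"ks_statistic": 0.21, "alert": true},
    "merchant_category": {"ks_statistic": 0.04, "alert": false}
  },
  "alerts": ["amount"],
  "overall_verdict": "investigate"
}
""")

# Recompute the alert list from the per-feature flags
alerting = [name for name, f in report["features"].items() if f["alert"]]
print(alerting, report["overall_verdict"])
```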

5. What happens on alert#

When drift exceeds threshold:

  1. Email fires to alert_email with the above summary
  2. run_events gets an entry of kind: drift_alert with the full payload
  3. Dashboard banner on /deployments/fraud/v2 shows "Drift detected"
  4. If compliance profile rbi_free_ai is enabled on this deployment: the alert is included in the next monthly audit PDF rollup

6. Investigation flow#

When drift hits, typical triage:

# Kick off an investigation pipeline
swarm pipelines run \
  --problem "Investigate drift on fraud_v2 — amount feature KS=0.21" \
  --dataset "last_24h_predictions" \
  --template default_ml_pipeline \
  --compliance rbi_free_ai

The error_analyzer and model_evaluator agents will:

  - Slice drift by time of day, day of week, and merchant segment
  - Compare against historical drift patterns
  - Flag whether this is concept drift (performance degraded) or feature drift (distribution shifted but performance intact)
  - Recommend a next step: retrain, investigate upstream, or adjust the threshold
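Slicing drift by segment is a per-group version of the same KS test. A sketch with hypothetical prediction logs, where only one merchant segment has drifted:

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Hypothetical last-24h prediction logs: only the travel segment drifted
obs = pd.DataFrame({
    "segment": ["grocery"] * 2000 + ["travel"] * 2000,
    "amount": np.concatenate([
        rng.lognormal(5.0, 1.0, 2000),  # grocery: matches the baseline
        rng.lognormal(5.8, 1.0, 2000),  # travel: shifted upward
    ]),
})
baseline = rng.lognormal(5.0, 1.0, 5000)  # training-time amounts

per_segment = {
    seg: ks_2samp(baseline, grp["amount"]).statistic
    for seg, grp in obs.groupby("segment")
}
for seg, stat in per_segment.items():
    print(f"{seg}: KS={stat:.3f} alert={stat > 0.15}")
```

A global KS statistic can hide this: a large drift in a small segment gets diluted when all traffic is pooled, which is why per-slice triage matters.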

7. Schedule retraining + drift check together#

A common pattern: retrain weekly AND check drift nightly:

# Retrain weekly on Sunday
swarm cron create \
  --name "fraud_retrain_weekly" \
  --schedule "0 2 * * 0" \
  --task retrain \
  --config '{"problem": "Fraud detection", "dataset_glob": "fraud_train_*.csv", "template": "default_ml_pipeline"}'

# Drift check on the deployed model nightly
swarm cron create \
  --name "fraud_drift_nightly" \
  --schedule "0 3 * * *" \
  --task drift_check \
  --config '{"model": "fraud", "version": "current", "window_hours": 24}'

When a retrain produces a new version that meets the promotion criteria, you'll get a human-in-the-loop (HITL) gate before promote_challenger runs. See Deploy a model.

8. Edit / disable / delete#

# Change schedule
swarm cron update cj_abc123 --schedule "0 4 * * *"

# Temporarily disable
swarm cron update cj_abc123 --enabled false

# Delete
swarm cron delete cj_abc123

Advanced: custom drift task#

If the built-in drift_check doesn't fit your needs (e.g. you want a Wasserstein distance instead of KS), drop a Python callable and reference it:

swarm cron create \
  --name "fraud_wasserstein_nightly" \
  --schedule "0 3 * * *" \
  --task custom \
  --config '{"callable": "my_org.drift:wasserstein_check", "args": {"model": "fraud"}}'

Your module must be importable from the swarm API process (either shipped as a plugin or added to the Python path).

Next#