Train your first classifier#
Goal: train an iris-species classifier using the fast_prototype pipeline, inspect the model card, and understand each agent's contribution. Time: ~5 minutes after quickstart.
Prerequisites#
- Quickstart complete (stack running on localhost)
- You've seen `iris.csv` in the dataset list
- `ANTHROPIC_API_KEY` or `OPENAI_API_KEY` set in `.env`
1. Kick off the run#
curl -X POST http://localhost:8000/api/v1/pipelines \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"problem_statement": "Classify iris flowers into species using sepal and petal measurements",
"dataset_path": "iris.csv",
"template": "fast_prototype",
"name": "my-first-run"
}'
Or use the UI instead: Pipelines → New → fill in the form → Run.
You get back a `run_id` like `7f8e9a2b`.
2. Watch it run#
You'll see:
- `ml_director` reads the problem and plans the pipeline
- `data_profiler` loads `iris.csv`, reports column types and cardinality
- `algorithm_selector` reads the profile and picks logistic regression (simple, appropriate for a 3-class, 4-feature problem)
- `trainer` fits the model on an 80/20 split
- `model_evaluator` reports accuracy, per-class precision/recall, and a confusion matrix
Typical runtime: 90-180 seconds.
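If you prefer the terminal to the UI, you can follow the run by reading `run_events.jsonl` as it grows. Below is a minimal sketch; the `agent` and `event` field names are an assumption — inspect one line of your own log first, since the exact schema may differ. The demo runs against a synthetic two-line log so it works anywhere:

```python
import json

def handoffs(path):
    """Yield 'agent: event' summaries from a run_events.jsonl file.
    The 'agent' and 'event' field names are assumptions -- check one
    line of your own log before relying on them."""
    with open(path) as f:
        for line in f:
            entry = json.loads(line)
            yield f"{entry.get('agent', '?')}: {entry.get('event', '?')}"

# Demo against a synthetic log (the real file sits at the run root).
with open("run_events_sample.jsonl", "w") as f:
    f.write('{"agent": "ml_director", "event": "plan_created"}\n')
    f.write('{"agent": "data_profiler", "event": "profile_complete"}\n')

for summary in handoffs("run_events_sample.jsonl"):
    print(summary)
```

Point `handoffs` at the real `run_events.jsonl` while the pipeline is running to see agents hand off work in order.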
3. Inspect the outputs#
conversations/
├── ml_director.jsonl
├── data_profiler.jsonl
├── algorithm_selector.jsonl
├── trainer.jsonl
└── model_evaluator.jsonl
models/
└── model.joblib
reports/
├── model_card.md
├── evaluation.json
└── confusion_matrix.png
run_events.jsonl
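The reports are plain Markdown and JSON, so you can pull the headline numbers straight out of `reports/evaluation.json`. The key names below (`accuracy`, `macro_f1`) mirror the model card but are an assumption — open the real file to confirm them. The demo writes a small stand-in file so the snippet runs anywhere:

```python
import json

def headline_metrics(path):
    """Return (accuracy, macro_f1) from an evaluation report.
    Key names are assumptions -- confirm against your own file."""
    with open(path) as f:
        report = json.load(f)
    return report["accuracy"], report["macro_f1"]

# Stand-in for reports/evaluation.json so the snippet is self-contained.
with open("evaluation_sample.json", "w") as f:
    json.dump({"accuracy": 0.9667, "macro_f1": 0.9674}, f)

acc, f1 = headline_metrics("evaluation_sample.json")
print(f"accuracy={acc:.4f}  macro_f1={f1:.4f}")
```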
The model card#
`reports/model_card.md` reads something like this:
# Iris species classifier (v0)
## Model
- Algorithm: Logistic Regression (scikit-learn 1.8)
- Parameters: solver=lbfgs, C=1.0 (multinomial by default)
- Trained: 2026-04-15 14:22 IST
- Training size: 120 samples; test size: 30 samples
## Performance
- Accuracy: 0.9667 ± 0.0152
- Macro F1: 0.9674
- Confusion matrix: reports/confusion_matrix.png
## Data
- Source: iris.csv (seeded)
- Features: sepal_length, sepal_width, petal_length, petal_width
- Target: species (3 classes)
- Missing values: 0
- Class balance: 50/50/50
## Limitations
- 150 samples — very small. Production use requires more data.
- No fairness audit attached (protected attribute not relevant for botany).
## Version
- Pipeline: fast_prototype
- swarm: 0.11.0
- Run ID: 7f8e9a2b
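If you want to sanity-check the card's numbers outside the pipeline, the trainer's recipe is easy to reproduce with scikit-learn's bundled copy of the iris data. This is a sketch of the same setup (logistic regression, 80/20 split) — not the pipeline's actual code — and your accuracy will wobble a little with the split seed:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same recipe as the model card: 80/20 stratified split, logistic regression.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.4f}")
```

A result in the mid-0.9s is expected; anything near 0.33 means the labels and features got disconnected somewhere.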
Per-agent conversations#
Each agent's JSONL journal has a full record of every LLM call, tool call, and intermediate reasoning step. Useful for:
- Debugging ("why did algorithm_selector pick logistic regression?")
- Auditing ("show me every tool call made during this run")
- Feedback / fine-tuning data later
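For the auditing case, a few lines of Python will pull every tool call out of a journal. The entry schema assumed here (`type`, `tool`, `arguments` fields) is a guess — inspect one line of your own journal first. The demo runs against a two-line synthetic journal:

```python
import json

def tool_calls(journal_path):
    """Yield (tool_name, arguments) for each tool-call entry in a journal.
    Field names ('type', 'tool', 'arguments') are assumptions."""
    with open(journal_path) as f:
        for line in f:
            entry = json.loads(line)
            if entry.get("type") == "tool_call":
                yield entry["tool"], entry.get("arguments", {})

# Synthetic stand-in for conversations/algorithm_selector.jsonl.
with open("journal_sample.jsonl", "w") as f:
    f.write('{"type": "llm_call", "prompt": "pick an algorithm"}\n')
    f.write('{"type": "tool_call", "tool": "read_profile", '
            '"arguments": {"dataset": "iris.csv"}}\n')

for tool, args in tool_calls("journal_sample.jsonl"):
    print(tool, args)
```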
4. Try a different template#
swarm pipelines run \
--problem "Classify iris flowers into species" \
--dataset iris.csv \
--template default_ml_pipeline # 22 agents instead of 5
You'll see:
- hyperparam_tuner runs a grid search over multiple algorithms
- model_comparator picks the best
- fairness_auditor attaches a (trivially-passing) fairness report
- documentation_agent expands the model card
- reproducibility_agent pins seeds + versions
Takes longer (~15-30 min) but produces a production-quality run.
5. Next#
- Deploy a model — take this trained model to shadow traffic and promotion
- Generate an audit PDF — the regulator-format artefact
- Write a custom agent — add your own specialist
- End-to-end fraud classifier tutorial — a realistic BFSI flow
Troubleshooting#
Pipeline stalls at algorithm_selector
Check the logs: docker compose logs api | grep algorithm_selector. This usually means the LLM call timed out (missing API key or rate limiting). Verify .env and retry.
Training completes but model accuracy is 0.33
That's random-guess accuracy for 3 balanced classes, so something is wrong. Inspect data_profiler.jsonl; most likely the target column wasn't identified. Rerun with an explicit target: --target species.
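Before rerunning, you can confirm the target column looks sane using nothing but the standard library. A quick sketch (the demo writes a tiny stand-in CSV so it runs anywhere; point `class_counts` at your real `iris.csv` instead):

```python
import csv
from collections import Counter

def class_counts(csv_path, target):
    """Count label frequencies in the target column of a CSV file."""
    with open(csv_path, newline="") as f:
        return Counter(row[target] for row in csv.DictReader(f))

# Tiny stand-in for iris.csv so the snippet is self-contained.
with open("iris_sample.csv", "w") as f:
    f.write("sepal_length,species\n"
            "5.1,setosa\n6.0,versicolor\n5.9,virginica\n4.9,setosa\n")

print(class_counts("iris_sample.csv", "species"))
```

A healthy iris.csv shows three classes at 50 rows each; one giant class, or a column full of numeric measurements, means the wrong column was treated as the target.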