Pipeline YAML schema#

Pipeline templates live at ml_team/config/pipelines/<name>.yaml. Three templates ship with swarm: fast_prototype, default_ml_pipeline, and parallel_research.

Schema (Pydantic-validated)#

# REQUIRED
name: string                          # unique ID
description: string
teams:                                # which teams this pipeline uses
  - algorithm | data | training | evaluation | deployment | management | quality
agents:                               # list of agents to activate
  - string                            # must exist in AGENT_DEFS or plugin registry

# OPTIONAL
flow: sequential | graph              # default: sequential
max_iterations: integer               # default: 5; ReAct loop cap per agent
evaluator_grading: boolean            # default: false; clean-context grader on agent output
compliance_profile: none | rbi_free_ai | hipaa | eu_ai_act_high_risk  # default: none

# For flow=graph only
dependencies:
  <agent_name>:
    - <other_agent>   # this agent runs after the listed ones

# Budgets (optional)
budget:
  max_cost_usd: float                 # hard cap per run
  max_llm_calls: integer              # max LLM calls across all agents
  max_tool_calls: integer             # max tool invocations
  max_wall_clock_minutes: integer     # timeout

# Retry policy (optional)
retry:
  max_attempts: integer               # default: 3
  retry_on: [llm_timeout, tool_failure, ...]
  backoff_seconds: float
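
To make the required/optional split concrete, here is a minimal stdlib-only sketch of the checks the schema implies. It is not swarm's Pydantic model: `check_pipeline`, `ALLOWED_TEAMS`, `ALLOWED_FLOWS`, and `DEFAULTS` are hypothetical names, and the rule that every name under `dependencies` must also be listed under `agents` is an assumption, not something the schema above states.

```python
# Stdlib-only illustration of the documented rules; swarm itself validates with Pydantic.
ALLOWED_TEAMS = {"algorithm", "data", "training", "evaluation",
                 "deployment", "management", "quality"}
ALLOWED_FLOWS = {"sequential", "graph"}
# Defaults as listed in the OPTIONAL part of the schema.
DEFAULTS = {"flow": "sequential", "max_iterations": 5,
            "evaluator_grading": False, "compliance_profile": "none"}


def check_pipeline(cfg: dict) -> tuple[dict, list[str]]:
    """Return (config with defaults applied, list of human-readable errors)."""
    errors = []
    for key in ("name", "description", "teams", "agents"):
        if key not in cfg:
            errors.append(f"missing required field: {key}")
    merged = {**DEFAULTS, **cfg}
    unknown = set(merged.get("teams", [])) - ALLOWED_TEAMS
    if unknown:
        errors.append(f"unknown teams: {sorted(unknown)}")
    if merged["flow"] not in ALLOWED_FLOWS:
        errors.append(f"flow must be one of {sorted(ALLOWED_FLOWS)}")
    # Assumption: every agent named in dependencies must also appear under agents.
    agents = set(merged.get("agents", []))
    for agent, upstream in merged.get("dependencies", {}).items():
        for name in [agent, *upstream]:
            if name not in agents:
                errors.append(f"dependency references unlisted agent: {name}")
    return merged, errors
```

Calling `check_pipeline` on a dict loaded from a template returns the merged config plus any errors, so a caller can report all problems at once rather than stopping at the first.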

fast_prototype (shipped)#

name: fast_prototype
description: Rapid iteration on small datasets + laptop-scale models.
teams: [management, data, algorithm, training, evaluation]
agents:
  - ml_director
  - data_profiler
  - algorithm_selector
  - trainer
  - model_evaluator
flow: sequential
max_iterations: 3
evaluator_grading: true

Runtime: 3-8 min. Use for initial exploration.

default_ml_pipeline (shipped)#

name: default_ml_pipeline
description: Production-quality ML pipeline with compliance + quality gates.
teams: [management, data, algorithm, training, evaluation, deployment, quality]
agents:
  - ml_director
  - pipeline_planner
  - data_request_agent
  - data_profiler
  - data_cleaner
  - data_validator
  - feature_engineer
  - problem_classifier
  - algorithm_selector
  - model_recommender
  - repo_fetcher
  - compute_estimator
  - hyperparam_tuner
  - trainer
  - smoke_tester
  - experiment_tracker
  - model_evaluator
  - error_analyzer
  - convergence_monitor
  - model_comparator
  - code_reviewer
  - documentation_agent
  - reproducibility_agent
  - audit_logger
flow: sequential
max_iterations: 5
evaluator_grading: true

Runtime: 15-45 min. Use for production-candidate models.

parallel_research (shipped)#

name: parallel_research
description: Parallel algorithm bake-off for hyperparam / architecture sweep.
teams: [management, data, algorithm, training, evaluation]
agents:
  - ml_director
  - data_profiler
  - algorithm_selector       # picks N algorithms to test in parallel
  - trainer__xgboost
  - trainer__lightgbm
  - trainer__elastic_net
  - trainer__catboost
  - model_comparator          # fans in after all trainers complete
flow: graph
dependencies:
  trainer__xgboost: [algorithm_selector]
  trainer__lightgbm: [algorithm_selector]
  trainer__elastic_net: [algorithm_selector]
  trainer__catboost: [algorithm_selector]
  model_comparator: [trainer__xgboost, trainer__lightgbm, trainer__elastic_net, trainer__catboost]
max_iterations: 5

The trainer__{algo} entries are synthetic pipeline-scoped agent instances generated from the base trainer agent with a pinned algorithm.
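
The dependencies block of parallel_research can be read as a DAG. The sketch below uses Python's standard `graphlib` to group its agents into waves that could run concurrently. This is only an illustration of the fan-out/fan-in shape, not swarm's scheduler: `execution_waves` is a hypothetical helper, and agents without an entry under dependencies (ml_director, data_profiler) are left out because the template does not say how they are ordered.

```python
from graphlib import TopologicalSorter

# Copied from parallel_research: agent -> agents it must wait for.
dependencies = {
    "trainer__xgboost": ["algorithm_selector"],
    "trainer__lightgbm": ["algorithm_selector"],
    "trainer__elastic_net": ["algorithm_selector"],
    "trainer__catboost": ["algorithm_selector"],
    "model_comparator": ["trainer__xgboost", "trainer__lightgbm",
                         "trainer__elastic_net", "trainer__catboost"],
}


def execution_waves(deps: dict) -> list[list[str]]:
    """Group agents into waves; every agent in a wave has all its dependencies done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
for i, wave in enumerate(waves):
    print(i, wave)
```

The result has three waves: algorithm_selector alone, then the four trainers together, then model_comparator once all of them have finished.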

Writing your own#

  1. Copy fast_prototype.yaml as a starting point
  2. Edit teams, agents, flow for your use case
  3. Validate: swarm config validate
  4. Invoke: swarm pipelines run --template <your_name> ...
  5. Update ml_team/config/IMPLEMENTATION_README.md; the docs-drift CI check requires it
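
Steps 1 and 2 can be scripted. The following is a rough sketch, assuming the template keeps `name:` as a top-level key on its own line; `clone_template` is a hypothetical helper and is not part of the swarm CLI.

```python
import re
from pathlib import Path


def clone_template(pipelines_dir: Path, new_name: str,
                   source: str = "fast_prototype") -> Path:
    """Copy <source>.yaml to <new_name>.yaml and rewrite its top-level name field."""
    src = pipelines_dir / f"{source}.yaml"
    dst = pipelines_dir / f"{new_name}.yaml"
    if dst.exists():
        raise FileExistsError(dst)  # never overwrite an existing template
    text = src.read_text(encoding="utf-8")
    # Replace only the first top-level `name:` line; teams, agents, and flow
    # are left for manual editing.
    text = re.sub(r"^name:.*$", lambda _m: f"name: {new_name}",
                  text, count=1, flags=re.MULTILINE)
    dst.write_text(text, encoding="utf-8")
    return dst
```

After cloning, edit the new file by hand and continue with `swarm config validate` as in step 3.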

Defaults + fallbacks#

If a referenced agent doesn't exist at runtime:

  - Built-in agents: error at load time
  - Plugin agents (namespaced plugin-X::agent): warning at load time; pipeline continues with the agent omitted
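
The two fallback behaviours can be sketched as a small resolver. This is an illustration only: `resolve_agents` and its arguments are hypothetical, and treating a `::` separator as the marker of a plugin-namespaced agent is an assumption drawn from the plugin-X::agent example above.

```python
import warnings


def resolve_agents(requested: list[str], builtin: set[str],
                   plugins: set[str]) -> list[str]:
    """Missing built-in agents raise; missing plugin agents warn and are dropped."""
    active = []
    for name in requested:
        if name in builtin or name in plugins:
            active.append(name)
        elif "::" in name:
            # Assumption: `::` marks a plugin-namespaced agent.
            warnings.warn(f"plugin agent {name!r} not found; omitting it from the pipeline")
        else:
            raise ValueError(f"unknown built-in agent: {name!r}")
    return active
```

The asymmetry means a typo in a built-in agent name stops the load immediately, while an uninstalled plugin only shrinks the pipeline, so the load-time warnings are worth reading before a long run.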

Next#