Skip to content

Agent definitions schema#

Two parts to defining an agent:

  1. Python dataclass in ml_team/config/agent_defs.py::AGENT_DEFS — identity + role + tools + model tier
  2. Operational rules YAML in ml_team/config/agent_rules/<agent_name>.yaml — runtime behaviour rules

Built-in agents live in the Python file for type-safety. Plugin-contributed agents live as agents/<name>.md in the plugin directory (see Install a CC plugin).

AgentConfig Python dataclass#

from dataclasses import dataclass, field
from typing import Literal

@dataclass
class AgentConfig:
    # REQUIRED
    name: str                           # unique identifier, snake_case
    team: Literal[
        'management', 'data', 'algorithm', 'training',
        'evaluation', 'deployment', 'quality'
    ]
    role: str                           # 1-3 sentence system-prompt framing
    tools: list[str]                    # tool allowlist (enforced by permission engine)

    # OPTIONAL
    model_tier: Literal['fast', 'standard', 'deep'] = 'standard'
    model_override: str | None = None    # e.g. "anthropic/claude-opus-4-1-20260315"
    rules_path: str | None = None        # relative path to operational-rules YAML
    system_prompt_override: str | None = None
    retry_on_error: bool = True
    max_iterations: int = 5              # ReAct loop cap
    timeout_seconds: int = 600           # wall-clock cap
    temperature: float = 0.3
    feature_flags: list[str] = field(default_factory=list)  # flags gating this agent

Example — how algorithm_selector is defined:

AgentConfig(
    name='algorithm_selector',
    team='algorithm',
    role=(
        'Given a problem statement + data profile, you pick the best-fit '
        'algorithm family from the registry. You consider: problem type '
        '(classification / regression / ranking), data size, feature '
        'dimensionality, interpretability requirements, and compute '
        'constraints. You explain your choice.'
    ),
    tools=['query_algorithm_registry', 'lookup_algorithm', 'web_search'],
    rules_path='config/agent_rules/algorithm_selector.yaml',
    model_tier='standard',
    max_iterations=3,
)

Operational rules YAML#

version: "1.0"                         # schema version
agent: algorithm_selector              # must match AgentConfig.name
rules:
  - id: R001                           # short identifier
    name: "Human-readable rule name"   # optional
    rule: "The imperative rule text that gets appended to the agent's system prompt."
    category: correctness | safety | ops | style   # optional taxonomy
    priority: integer                  # optional ordering (higher first)

Example — config/agent_rules/algorithm_selector.yaml:

version: "1.0"
agent: algorithm_selector
rules:
  - id: R001
    name: "Prefer interpretable models for BFSI"
    rule: "When the compliance profile is rbi_free_ai or bfsi, prefer logistic/linear/tree-based algorithms over black-box ensembles unless the data demands it. Explain your choice."
    category: correctness
  - id: R002
    name: "Use the registry"
    rule: "Always call `query_algorithm_registry` before recommending. Do not invent algorithms that aren't in the registry."
    category: correctness
  - id: R003
    name: "Single-family recommendation"
    rule: "Recommend exactly one algorithm family. If unsure between two, call `lookup_algorithm` on both and pick based on data shape, not preference."
    category: style
  - id: R004
    name: "Escalate on ambiguity"
    rule: "If problem statement is ambiguous (unclear target column or problem type), ask `problem_classifier` before proceeding. Do not guess."
    category: safety

Rule composition#

Rules get concatenated into the agent's system prompt at runtime:

<base role description>

Operational rules:
R001 [correctness]: When the compliance profile is rbi_free_ai...
R002 [correctness]: Always call query_algorithm_registry...
R003 [style]: Recommend exactly one algorithm family...
...

Writing effective rules#

  • Imperative, not descriptive — "Call X before Y" beats "X is called before Y"
  • Short — under 200 chars each. Longer rules get ignored by the LLM.
  • Concrete — "use KS threshold 0.15" beats "use appropriate threshold"
  • Justified — implicit why. If it's not obvious, spell it out.
  • Testable — write a regression test that asserts the rule fires when it should

Plugin-contributed agents#

Plugin agents use Markdown with YAML frontmatter, not separate files:

---
name: kyc-validator
description: Validates Indian KYC payloads.
model: fast
tools:
  - execute_python
---

You are a KYC validator. Validate PAN format, Aadhaar length, address completeness. Do not log PII.

The plugin loader auto-translates to a runtime PluginAgent and namespaces it as plugin-<plugin>::<name>.

Next#