Write a custom agent#

Goal: add your own specialist to the 40-agent catalogue — say, a credit_bureau_lookup agent for Indian BFSI. Time: 30 minutes.

When to write a new agent vs use an existing one#

Write one if:

  • You have a domain-specific role that recurs across pipelines ("KYC validator", "credit bureau lookup")
  • You want per-agent tool allowlisting, a specific system prompt, or custom rules
  • You plan to call it from multiple templates

Don't write one if:

  • It's a one-off transformation (use a tool instead)
  • It's glue logic (use a pipeline-level step, not an agent)

1. Decide the shape#

Answer these 4 questions before writing:

  1. Role: what's the 1-sentence description?
  2. Team: which of the 7 teams does it belong to? (management / data / algorithm / training / evaluation / quality / deployment)
  3. Tools: which existing tools should it call? (3-8 is typical)
  4. Model tier: fast (Haiku) / standard (Sonnet) / deep (Opus)?

Example worked answer:

  • Role: "Looks up the credit bureau score for a PAN/Aadhaar and annotates the record"
  • Team: data
  • Tools: execute_python, web_search
  • Model tier: fast (it's a lookup; no reasoning needed)

2. Define the agent#

Open ml_team/config/agent_defs.py:

# Append to AGENT_DEFS
AgentConfig(
    name="credit_bureau_lookup",
    team="data",
    role=(
        "You look up credit bureau scores (CIBIL, Experian, Equifax, CRIF) "
        "for records with valid PAN or Aadhaar. You annotate each record "
        "with the score, the bureau source, and the retrieval timestamp. "
        "You never log or persist the PAN/Aadhaar in cleartext; you pass "
        "through only to the bureau API."
    ),
    tools=["execute_python", "web_search"],
    rules_path="config/agent_rules/credit_bureau_lookup.yaml",
    model_tier="fast",
    retry_on_error=True,
    max_iterations=5,
),
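
For orientation, the fields used above could be modelled as a small dataclass. This is a minimal sketch, not the project's actual `AgentConfig`: the class name `AgentConfigSketch`, the defaults, and the `validate` helper are assumptions made for illustration. Only the field names, the seven team names, and the three tiers come from this page.

```python
from dataclasses import dataclass, field
from typing import Optional

# Team names and tiers as listed in "Decide the shape".
VALID_TEAMS = {"management", "data", "algorithm", "training",
               "evaluation", "quality", "deployment"}
VALID_TIERS = {"fast", "standard", "deep"}


@dataclass(frozen=True)
class AgentConfigSketch:
    """Hypothetical stand-in for AgentConfig; the real class may differ."""
    name: str
    team: str
    role: str
    tools: list = field(default_factory=list)
    rules_path: Optional[str] = None
    model_tier: str = "standard"
    retry_on_error: bool = False
    max_iterations: int = 5

    def validate(self) -> list:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.team not in VALID_TEAMS:
            problems.append(f"unknown team: {self.team!r}")
        if self.model_tier not in VALID_TIERS:
            problems.append(f"unknown model tier: {self.model_tier!r}")
        if len(set(self.tools)) != len(self.tools):
            problems.append("duplicate tool in allowlist")
        if self.max_iterations < 1:
            problems.append("max_iterations must be >= 1")
        return problems
```

A check like this catches a mistyped team or tier when the definition is loaded rather than at the first agent turn.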

3. Write operational rules#

Rules govern behaviour that the LLM keeps forgetting without explicit reminders. Create ml_team/config/agent_rules/credit_bureau_lookup.yaml:

# Operational rules for credit_bureau_lookup agent
version: "1.0"
rules:
  - id: R001
    name: "PAN is always masked in logs"
    rule: "When logging or writing intermediate state, mask PAN as 'ABCDE1234F' → 'ABC***234F'."
  - id: R002
    name: "Rate limit bureau calls"
    rule: "Max 100 bureau API calls per minute. If exceeded, sleep 5s and retry."
  - id: R003
    name: "Prefer CIBIL for Indian records"
    rule: "For Indian residents, use CIBIL first. Fall back to CRIF if CIBIL times out."
  - id: R004
    name: "Never persist bureau response verbatim"
    rule: "Extract score + band only; do not persist the full bureau JSON response."
  - id: R005
    name: "Escalate on bureau outage"
    rule: "If all bureau APIs time out, fail the agent turn with 'BUREAU_OUTAGE'. Do not proceed."
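
Rule R001 is easier to enforce when a helper does the masking. The function below is a minimal sketch, assuming PANs follow the standard ten-character format (five letters, four digits, one letter) and that the mask keeps the first three and last four characters, as the R001 example suggests. `mask_pan` is a hypothetical helper, not an existing project function.

```python
import re

# PAN format: five letters, four digits, one letter (e.g. ABCDE1234F).
# Group 1 keeps the first three letters; group 2 keeps the last four characters.
PAN_PATTERN = re.compile(r"\b([A-Z]{3})[A-Z]{2}\d(\d{3}[A-Z])\b")


def mask_pan(text: str) -> str:
    """Replace every PAN found in `text` with its masked form."""
    return PAN_PATTERN.sub(r"\1***\2", text)
```

Calling `mask_pan("lookup for ABCDE1234F")` returns `"lookup for ABC***234F"`; text without a PAN is returned unchanged.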

Rules are injected into the agent's system prompt at runtime. Keep them:

  • Operational, not conceptual: a rule tells the agent what to do in a specific situation
  • Short: one or two sentences each
  • Enforceable: ideally paired with a tool or permission policy
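
To make the injection step concrete, here is a minimal sketch of how parsed rules might be rendered into prompt text. The function names and the output layout are assumptions; the project's actual template may format rules differently. The rules are shown as already-parsed dicts so the example has no YAML dependency.

```python
def render_rules(rules: list) -> str:
    """Format rule dicts (as parsed from the YAML file) into prompt text."""
    lines = ["Operational rules:"]
    for rule in rules:
        lines.append(f"- [{rule['id']}] {rule['name']}: {rule['rule']}")
    return "\n".join(lines)


def build_system_prompt(role: str, rules: list) -> str:
    """Append the rendered rules to the role text (assumed layout)."""
    return f"{role}\n\n{render_rules(rules)}"


# Two of the rules from the YAML file above, in parsed form.
rules = [
    {"id": "R001", "name": "PAN is always masked in logs",
     "rule": "When logging or writing intermediate state, mask PAN."},
    {"id": "R003", "name": "Prefer CIBIL for Indian records",
     "rule": "For Indian residents, use CIBIL first. Fall back to CRIF if CIBIL times out."},
]
prompt = build_system_prompt("You look up credit bureau scores.", rules)
```

Because both the rule name and the rule body end up in the prompt, short rules keep the system prompt compact.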

4. Write a test#

Create ml_team/tests/test_credit_bureau_lookup.py:

from ml_team.core.team_factory import build_agent
from ml_team.core.feature_flags import set_runtime, reset_runtime

def test_agent_has_correct_allowlist():
    agent = build_agent("credit_bureau_lookup")
    assert set(agent.tools) == {"execute_python", "web_search"}

def test_agent_team():
    agent = build_agent("credit_bureau_lookup")
    assert agent.team == "data"

def test_agent_loads_operational_rules():
    agent = build_agent("credit_bureau_lookup")
    assert "PAN is always masked" in agent.system_prompt
    assert "CIBIL first" in agent.system_prompt

def test_agent_cannot_call_non_allowlisted_tool(monkeypatch):
    # Permission engine test: attempt to call `deploy_serving` should be denied
    from ml_team.core.tool_executor import ToolExecutor
    from ml_team.core.types import ToolCall

    agent = build_agent("credit_bureau_lookup")
    executor = ToolExecutor(agent=agent)
    result = executor.execute(ToolCall(
        id="t1",
        name="deploy_serving",
        arguments={"model_id": "x", "env": "prod"},
    ))
    assert result.error == "denied"
    assert "agent_allowlist" in result.denial_reason

Run:

.venv/bin/pytest ml_team/tests/test_credit_bureau_lookup.py -v
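
The last test relies on the executor refusing tools outside the agent's allowlist. As a rough illustration of that check, here is a minimal sketch. `ToolResultSketch` and `check_allowlist` are hypothetical names, and the real `ToolExecutor` may apply further policy layers. Only the `error` and `denial_reason` fields mirror what the test asserts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResultSketch:
    """Hypothetical result type with the fields the test inspects."""
    output: Optional[str] = None
    error: Optional[str] = None
    denial_reason: Optional[str] = None


def check_allowlist(agent_tools: set, tool_name: str) -> ToolResultSketch:
    """Deny any tool call whose name is not in the agent's allowlist."""
    if tool_name not in agent_tools:
        return ToolResultSketch(
            error="denied",
            denial_reason=f"agent_allowlist: {tool_name!r} is not permitted",
        )
    return ToolResultSketch(output="allowed")


allowlist = {"execute_python", "web_search"}
denied = check_allowlist(allowlist, "deploy_serving")
permitted = check_allowlist(allowlist, "web_search")
```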

5. Reference the agent in a pipeline#

Either add it to an existing pipeline template:

# config/pipelines/kyc_pipeline.yaml
name: kyc_pipeline
description: KYC + credit-bureau enrichment + fraud scoring
teams:
  - data
  - algorithm
  - training
  - evaluation
agents:
  - ml_director
  - data_profiler
  - credit_bureau_lookup     # <-- your new agent
  - algorithm_selector
  - trainer
  - model_evaluator
flow: sequential

Or invoke it ad-hoc:

swarm agents invoke credit_bureau_lookup \
  --input '{"pan": "ABCDE1234F"}'
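
A typo in the `agents:` list is easy to miss until the pipeline runs. The check below is a minimal sketch, assuming the template has already been parsed into a dict and that you can obtain the set of registered agent names. `find_unknown_agents` is a hypothetical helper, not part of the project.

```python
def find_unknown_agents(pipeline: dict, known_agents: set) -> list:
    """Return agent names the pipeline references that are not registered."""
    return [name for name in pipeline.get("agents", []) if name not in known_agents]


# Parsed form of the kyc_pipeline template shown above.
pipeline = {
    "name": "kyc_pipeline",
    "agents": ["ml_director", "data_profiler", "credit_bureau_lookup",
               "algorithm_selector", "trainer", "model_evaluator"],
    "flow": "sequential",
}

# Assumed registry contents, with the new agent not yet registered.
known = {"ml_director", "data_profiler", "algorithm_selector",
         "trainer", "model_evaluator"}
missing = find_unknown_agents(pipeline, known)
```

Here `missing` contains only `credit_bureau_lookup`, which signals that the agent definition has not been added yet.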

6. Attach compliance-profile guardrails#

If this agent handles sensitive data (PII per DPDPA, PHI per HIPAA), ensure the compliance profile activates the right controls:

# config/permission_policies_hipaa.yaml
rules:
  - tool: "web_search"
    when:
      agent: "credit_bureau_lookup"
    behaviour: "deny"
    reason: "HIPAA: no external web search from PHI-handling agents"

Under the HIPAA profile, the agent's web_search call is denied; under the BFSI profile, it is allowed.
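
The profile-dependent outcome can be sketched as a small rule matcher. This is an illustration under stated assumptions, not the project's permission engine: it assumes the first matching rule wins, that a call matching no rule is allowed, and that the BFSI profile has no rule for this agent. `decide` is a hypothetical name.

```python
from typing import Optional, Tuple

# Parsed form of the HIPAA rule shown above.
HIPAA_RULES = [{
    "tool": "web_search",
    "when": {"agent": "credit_bureau_lookup"},
    "behaviour": "deny",
    "reason": "HIPAA: no external web search from PHI-handling agents",
}]
# Assumption: the BFSI profile defines no rule for this agent and tool.
BFSI_RULES: list = []


def decide(rules: list, agent: str, tool: str) -> Tuple[str, Optional[str]]:
    """Return (behaviour, reason) for a tool call by a given agent."""
    for rule in rules:
        if rule["tool"] != tool:
            continue
        required_agent = rule.get("when", {}).get("agent")
        if required_agent is not None and required_agent != agent:
            continue
        return rule["behaviour"], rule.get("reason")
    return "allow", None  # no rule matched


hipaa_decision = decide(HIPAA_RULES, "credit_bureau_lookup", "web_search")
bfsi_decision = decide(BFSI_RULES, "credit_bureau_lookup", "web_search")
```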

7. Document it#

Update ml_team/config/LEARNING_README.md to add your agent to the catalogue. CI's doc-drift.yml workflow will remind you if you forget.

Advanced patterns#

Multi-stage agent#

If your agent needs to do lookup → score → annotate → validate, consider splitting into 2-3 agents instead. Smaller agents compose better and fail more predictably.
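
The benefit of splitting shows up in failure reporting. The sketch below models each stage as a plain function rather than a full agent, purely to illustrate the idea; the stage names, the stub score, and `run_stages` are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Stage = Callable[[dict], dict]


def lookup(record: dict) -> dict:
    # Stub: a real stage would call the bureau; the score is a placeholder.
    return {**record, "score": 742, "bureau": "CIBIL"}


def validate(record: dict) -> dict:
    # CIBIL scores range from 300 to 900.
    if not 300 <= record["score"] <= 900:
        raise ValueError(f"score out of range: {record['score']}")
    return record


def run_stages(record: dict, stages: List[Stage]) -> Tuple[dict, Optional[str]]:
    """Run stages in order; return the record and the failing stage name, if any."""
    for stage in stages:
        try:
            record = stage(record)
        except Exception:
            return record, stage.__name__
    return record, None


result, failed_stage = run_stages({"id": 1}, [lookup, validate])
```

When a stage raises, the runner names it, so an out-of-range score is attributed to `validate` rather than to one large agent turn.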

Agent with a custom LLM prompt#

If the default agent prompt template doesn't fit, you can override it:

AgentConfig(
    name="credit_bureau_lookup",
    ...
    system_prompt_override="custom/credit_bureau_prompt.md",  # loaded from config/prompts/
)

But usually you don't need this — the role + rules combo is enough.

Plugin-contributed agents#

If you're writing a plugin for external distribution, your agent lives in agents/credit_bureau_lookup.md, NOT in AGENT_DEFS. See Install a CC plugin. Namespace is auto-prefixed as plugin-<plugin>::credit_bureau_lookup.
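
The prefixing scheme can be illustrated with two small helpers. This is a sketch of the naming convention only: the helper names and the example plugin name `bfsi_tools` are hypothetical, and the actual prefixing is done by the plugin loader.

```python
from typing import Optional, Tuple

PREFIX = "plugin-"
SEPARATOR = "::"


def qualified_name(plugin: str, agent: str) -> str:
    """Build the namespaced agent name, e.g. plugin-<plugin>::<agent>."""
    return f"{PREFIX}{plugin}{SEPARATOR}{agent}"


def split_qualified(name: str) -> Tuple[Optional[str], str]:
    """Split a name into (plugin, agent); plugin is None for built-in agents."""
    if name.startswith(PREFIX) and SEPARATOR in name:
        prefix, agent = name.split(SEPARATOR, 1)
        return prefix[len(PREFIX):], agent
    return None, name


full = qualified_name("bfsi_tools", "credit_bureau_lookup")
```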

Next#