Agent Runtime Toolsets

Agent runtime toolsets are governed capability bundles exposed through AgentToolRegistry. They are the heavier extension surface for tools that need chat runtime integration, permission handling, plan-mode rules, tool-catalog metadata, or durable session state.

For simple examples and public builder code, prefer the lightweight pattern tool path in Add A Tool. Use runtime toolsets when the tool must participate in the same safety and observability path as sb chat, workforces, and background sessions.

Registration Flow

Runtime tools are registered from brain/agent/tools.py.

  1. Add the toolset name to AgentToolRegistry.AVAILABLE_TOOL_SETS.
  2. Create brain/agent/toolsets/<name>.py.
  3. Export a register_<name>_tools(registry) function.
  4. Add tool specs with descriptions, JSON-schema parameters, metadata, and implementation functions.
  5. Mark side-effecting tools with registry._write_tools.add("<tool_name>").
  6. Add capability/catalog metadata when the toolset should appear in audit or discovery surfaces.
  7. Add focused tests under tests/tools/ or the owning component test folder.
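
The steps above can be sketched as follows. This is a minimal illustration, not the project's actual registry API: only the register_<name>_tools(registry) naming and the registry._write_tools.add(...) call come from this page. StubRegistry, ToolSpec, the add method, the metadata keys, and the notes toolset are hypothetical stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolSpec:
    """Hypothetical spec shape; the real registry's spec type may differ."""

    name: str
    description: str
    parameters: dict[str, Any]  # JSON-schema object describing the arguments
    func: Callable[..., Any]
    metadata: dict[str, Any] = field(default_factory=dict)


class StubRegistry:
    """Stand-in for AgentToolRegistry with just enough state for this sketch."""

    def __init__(self) -> None:
        self.tools: dict[str, ToolSpec] = {}
        self._write_tools: set[str] = set()

    def add(self, spec: ToolSpec) -> None:  # hypothetical method name
        self.tools[spec.name] = spec


def notes_read(path: str) -> dict[str, Any]:
    # Placeholder implementation: a real tool would read from a backend.
    return {"path": path, "content": ""}


def notes_write(path: str, content: str) -> dict[str, Any]:
    # Placeholder implementation: a real tool would persist the content.
    return {"path": path, "bytes_written": len(content.encode("utf-8"))}


def register_notes_tools(registry: StubRegistry) -> None:
    """Step 3: the exported registration function for a 'notes' toolset."""
    path_schema = {"type": "string", "description": "Virtual path of the note."}
    # Step 4: specs with description, JSON-schema parameters, metadata, function.
    registry.add(ToolSpec(
        name="notes_read",
        description="Read a note.",
        parameters={"type": "object", "properties": {"path": path_schema},
                    "required": ["path"]},
        func=notes_read,
        metadata={"family": "notes", "aliases": ["read note"]},
    ))
    registry.add(ToolSpec(
        name="notes_write",
        description="Write a note.",
        parameters={"type": "object",
                    "properties": {"path": path_schema,
                                   "content": {"type": "string"}},
                    "required": ["path", "content"]},
        func=notes_write,
        metadata={"family": "notes", "aliases": ["save note"]},
    ))
    # Step 5: mark the side-effecting tool as write-capable.
    registry._write_tools.add("notes_write")


registry = StubRegistry()
register_notes_tools(registry)
```

Keeping the write marker next to the spec it describes makes it harder to register a side-effecting tool and forget to classify it.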

Read-only tools can run in plan mode only when explicitly allowed by brain/agent/plan_mode.py. Write-capable tools remain governed by the active permission mode and approval policy.

Built-In Agent Coordination Toolsets

The current coordination toolsets are workspace, checklist, and workers. They are intentionally local-first and reuse existing runtime primitives.

workspace

Implementation:

  • brain/runtime/workspace_backend.py
  • brain/runtime/workspace_composite.py
  • brain/agent/toolsets/workspace.py

Purpose:

  • give agents stable virtual paths instead of raw host paths
  • expose the repository as /workspace
  • expose run-local scratch state as /scratch
  • support bounded list, read, glob, search, write, and exact-edit operations

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| workspace_list | no | Lists direct children under a virtual path. |
| workspace_read | no | Reads a numbered line window and advances a cursor on repeated reads. |
| workspace_search | no | Regex search with optional glob, context lines, case folding, and result caps. |
| workspace_glob | no | Finds files or directories by glob pattern. |
| workspace_write | yes | Writes in overwrite, append, or create mode. |
| workspace_edit | yes | Performs exact replacement or inserts before a 1-based line. |

Path handling is virtual-root based. /workspace routes to the configured repo root, /scratch routes to state-backed scratch storage, and relative paths default to /workspace. Paths that escape a route are rejected before reaching the filesystem.
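
The routing rule can be illustrated with a small resolver. This is a sketch under stated assumptions, not the code in brain/runtime/workspace_backend.py: the host roots, the PathEscapeError type, and the resolve_virtual function are hypothetical, and the check is purely lexical, so a real backend would also need to consider symlinks.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical host locations; the real backend takes these from configuration.
ROUTES = {
    "/workspace": PurePosixPath("/srv/repo"),
    "/scratch": PurePosixPath("/srv/state/scratch"),
}


class PathEscapeError(ValueError):
    """Raised when a virtual path has no route or climbs out of its route."""


def resolve_virtual(path: str) -> tuple[str, PurePosixPath]:
    """Map a virtual path to (route, host path) without touching the filesystem."""
    # Relative paths default to /workspace.
    if not path.startswith("/"):
        path = posixpath.join("/workspace", path)
    for route, host_root in ROUTES.items():
        if path == route or path.startswith(route + "/"):
            # Normalize the part below the route; an empty remainder becomes ".".
            rel = posixpath.normpath(path[len(route):].lstrip("/"))
            # After normalization, a leading ".." means the path left the route.
            if rel == ".." or rel.startswith("../"):
                raise PathEscapeError(f"{path!r} escapes {route}")
            return route, host_root / rel
    raise PathEscapeError(f"no route for {path!r}")
```

Under these assumptions, resolve_virtual("src/main.py") maps into the /workspace route, while "/workspace/../etc/passwd" and "/etc/passwd" both raise before any filesystem call is made.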

checklist

Implementation:

  • brain/agent/toolsets/checklist.py

Purpose:

  • keep a structured task list attached to the current run or durable session
  • make progress state inspectable after long turns or worker handoffs
  • persist updates as session events when a session id is available

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| checklist_read | no | Returns items, counts, open-item total, source, and last update metadata. |
| checklist_update | yes | Replaces or merges items and stores a checklist_updated event. |

Checklist statuses are pending, in_progress, completed, blocked, and cancelled. While work is active, callers should keep at most one item in in_progress and reconcile open items before a final response.
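
A caller-side check for these conventions might look like the sketch below. The item shape (a dict with a status key), the summarize helper, and the choice to count blocked items as open are assumptions made for illustration; the toolset's own output fields may be named differently.

```python
STATUSES = ("pending", "in_progress", "completed", "blocked", "cancelled")
# Assumption: an item is "open" until it is completed or cancelled.
OPEN_STATUSES = ("pending", "in_progress", "blocked")


def summarize(items: list[dict[str, str]]) -> dict[str, object]:
    """Count items per status and flag violations of the one-in-progress rule."""
    counts = {status: 0 for status in STATUSES}
    for item in items:
        status = item["status"]
        if status not in counts:
            raise ValueError(f"unknown status: {status!r}")
        counts[status] += 1

    warnings = []
    if counts["in_progress"] > 1:
        warnings.append("more than one item is in_progress")

    open_items = sum(counts[status] for status in OPEN_STATUSES)
    return {"counts": counts, "open_items": open_items, "warnings": warnings}


summary = summarize([
    {"id": "1", "status": "completed"},
    {"id": "2", "status": "in_progress"},
    {"id": "3", "status": "pending"},
])
```

A non-zero open_items value before a final response signals that the checklist still needs to be reconciled.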

workers

Implementation:

  • brain/agent/toolsets/workers.py
  • brain/background_sessions/
  • brain/chat/session_store.py

Purpose:

  • let a chat turn start long-running local work without blocking the current response
  • expose durable status, checkpoint, artifact, event, update, and cancellation operations through the normal tool registry

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| worker_start | yes | Starts a detached background session and returns its session id. |
| worker_check | no | Reads lifecycle status, checkpoints, artifacts, and optional events. |
| worker_list | no | Lists sessions by status or active-only filters. |
| worker_update | yes | Stores an operator note in metadata, checkpoints, and events. |
| worker_cancel | yes | Cancels a session and records the cancellation checkpoint. |

worker_start carries provider, model, working directory, toolset selection, permission mode, approval mode, retrieval depth, retry, expiry, and metadata options into BackgroundSessionStartRequest. The tool always requests detached execution so the parent turn can continue and check back later.
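
The "always detached" rule can be pictured as the tool overriding whatever the caller passes. The sketch below is illustrative only: StartRequestSketch is a hypothetical stand-in for BackgroundSessionStartRequest, and its field names, defaults, and the build_start_request helper are assumptions rather than the real signature.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class StartRequestSketch:
    """Hypothetical subset of the options carried into the start request."""

    prompt: str
    provider: Optional[str] = None
    model: Optional[str] = None
    working_directory: Optional[str] = None
    tool_sets: tuple[str, ...] = ()
    permission_mode: str = "ask"
    detached: bool = True
    metadata: dict[str, Any] = field(default_factory=dict)


def build_start_request(args: dict[str, Any]) -> StartRequestSketch:
    """Copy tool arguments into a request, forcing detached execution."""
    options = dict(args)
    # Ignore any caller-supplied value: the parent turn must never block.
    options.pop("detached", None)
    return StartRequestSketch(detached=True, **options)


request = build_start_request(
    {"prompt": "reindex docs", "tool_sets": ("workspace",), "detached": False}
)
```

Forcing the flag inside the tool, rather than trusting the argument, keeps the parent turn free to continue and poll with worker_check later.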

Operator Checks

Use the tool catalog command to inspect how a toolset is exposed:

sb tools --tool-set workspace --permission-mode ask --json

Use the runtime guard commands when you need a persisted approval baseline instead of the startup-only comparison:

sb runtime-guard tools --surface runtime --web off --json
sb runtime-guard snapshot --surface runtime --web off --approved-by local-operator
sb runtime-guard mismatches --surface runtime --web off --json
sb runtime-guard approve-contract runtime local_search
sb runtime-guard topology --scope subagent --json
sb runtime-guard approve-edge <edge-id>
sb runtime-guard lock subagent

Expected behavior:

  • workspace read operations and checklist reads are enabled in plan mode
  • workspace writes, checklist updates, worker starts, worker updates, and worker cancellations are classified as write-capable
  • worker check/list operations stay read-only
  • capability metadata includes the tool family and search aliases for discovery
  • every row includes contract_fingerprint, registered_contract_fingerprint, contract_status, and contract_mismatch so startup-time tool contracts can be compared with their current provider-facing schema
  • AgentToolRegistry(contract_enforcement="enforce") blocks drifted write-capable, network, destructive, and external-write contracts before the tool function runs; read-only drift remains observable through telemetry
  • runtime-guard snapshot writes the current contract fingerprints as operator-approved baselines in the runtime database; later runs of runtime-guard mismatches report new, mismatch, or missing contracts until they are explicitly approved
  • when AgentToolRegistry is constructed with a contract baseline store and contract_enforcement="enforce", write-capable and high-risk contracts fail closed unless their current fingerprint matches the persisted approved baseline
  • spawn_subagent can record directed agent -> subagent topology edges; locking the subagent scope makes unapproved child-agent targets fail before the child run is submitted
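
The fingerprint and fail-closed behavior described above can be sketched as follows. This shows the general technique, not the registry's implementation: the SHA-256 over canonical JSON, the "match" status label, and the contract_fingerprint, contract_status, and may_run helpers are all assumptions. Only the new, mismatch, and missing labels and the contract_enforcement="enforce" value appear on this page.

```python
import hashlib
import json
from typing import Any, Optional


def contract_fingerprint(name: str, description: str,
                         parameters: dict[str, Any]) -> str:
    """Hash the provider-facing contract in a key-order-independent form."""
    payload = json.dumps(
        {"name": name, "description": description, "parameters": parameters},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def contract_status(current: Optional[str], baseline: Optional[str]) -> str:
    """Compare a current fingerprint with the approved baseline, if any."""
    if baseline is None:
        return "new"       # never approved
    if current is None:
        return "missing"   # approved earlier, no longer registered
    return "match" if current == baseline else "mismatch"


def may_run(status: str, write_capable: bool, enforcement: str) -> bool:
    """Fail closed for write-capable tools; read-only drift is only observed."""
    if enforcement != "enforce" or not write_capable:
        return True
    return status == "match"


schema = {"type": "object", "properties": {"path": {"type": "string"}}}
approved = contract_fingerprint("workspace_write", "Writes a file.", schema)

# Simulate drift: the schema gains a property after the baseline was approved.
drifted_schema = {"type": "object",
                  "properties": {"path": {"type": "string"},
                                 "mode": {"type": "string"}}}
current = contract_fingerprint("workspace_write", "Writes a file.", drifted_schema)
status = contract_status(current, approved)
```

Here status is "mismatch", so may_run(status, write_capable=True, enforcement="enforce") returns False, while the same drift on a read-only tool would still be allowed to run.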

Test Guidance

Focused tests should cover both behavior and registry exposure:

  • backend behavior in tests/runtime/
  • tool output and validation in tests/tools/
  • plan-mode allowlist behavior in tests/agent/
  • CLI or catalog exposure when tool names or toolset names change

Run the narrow loop first:

pytest -q tests/runtime/test_workspace_backend.py tests/tools/test_workspace_toolset.py tests/tools/test_checklist_toolset.py tests/tools/test_workers_toolset.py tests/agent/test_plan_mode_allowlist.py

Before pushing Python changes, also run:

ruff format --check .
ruff check .