Agent Runtime Toolsets

Agent runtime toolsets are governed capability bundles exposed through AgentToolRegistry. They are the heavier extension surface for tools that need chat runtime integration, permission handling, plan-mode rules, tool-catalog metadata, or durable session state.

For simple examples and public builder code, prefer the lightweight pattern tool path in Add A Tool. Use runtime toolsets when the tool must participate in the same safety and observability path as sb chat, workforces, and background sessions.

Registration Flow

Runtime tools are registered from brain/agent/tools.py.

1. Add the toolset name to AgentToolRegistry.AVAILABLE_TOOL_SETS.
2. Create brain/agent/toolsets/<name>.py.
3. Export a register_<name>_tools(registry) function.
4. Add tool specs with descriptions, JSON-schema parameters, metadata, and implementation functions.
5. Mark side-effecting tools with registry._write_tools.add("<tool_name>").
6. Add capability/catalog metadata when the toolset should appear in audit or discovery surfaces.
7. Add focused tests under tests/tools/ or the owning component test folder.
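
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual API: `SketchRegistry`, its `register` method, the `ToolSpec` fields, and the `notes` toolset are hypothetical stand-ins. Only `AVAILABLE_TOOL_SETS`, `_write_tools`, and the `register_<name>_tools(registry)` naming come from the steps themselves.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolSpec:
    """Hypothetical spec shape; the real registry's fields may differ."""
    name: str
    description: str
    parameters: dict[str, Any]  # JSON-schema object describing the arguments
    func: Callable[..., Any]
    metadata: dict[str, Any] = field(default_factory=dict)


class SketchRegistry:
    """Stand-in for AgentToolRegistry, reduced to what the steps mention."""
    AVAILABLE_TOOL_SETS = ("workspace", "checklist", "workers", "notes")

    def __init__(self) -> None:
        self.tools: dict[str, ToolSpec] = {}
        self._write_tools: set[str] = set()

    def register(self, spec: ToolSpec) -> None:  # hypothetical method name
        self.tools[spec.name] = spec


_NOTES: list[str] = []


def notes_list() -> dict[str, Any]:
    return {"notes": list(_NOTES)}


def notes_add(text: str) -> dict[str, Any]:
    _NOTES.append(text)
    return {"count": len(_NOTES)}


def register_notes_tools(registry: SketchRegistry) -> None:
    """Would live in brain/agent/toolsets/notes.py in this sketch."""
    registry.register(ToolSpec(
        name="notes_list",
        description="List stored notes.",
        parameters={"type": "object", "properties": {}},
        func=notes_list,
        metadata={"family": "notes"},
    ))
    registry.register(ToolSpec(
        name="notes_add",
        description="Append a note.",
        parameters={
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        func=notes_add,
        metadata={"family": "notes"},
    ))
    # Side-effecting tools are marked so permission handling can gate them.
    registry._write_tools.add("notes_add")
```

Keeping the read-only and write-capable tools as separate specs means only `notes_add` needs the write marker.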

Read-only tools can run in plan mode only when explicitly allowed by brain/agent/plan_mode.py. Write-capable tools remain governed by the active permission mode and approval policy.
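
The plan-mode rule can be pictured as a two-part check. The sketch below is an assumption about the shape of that check, not the contents of brain/agent/plan_mode.py; the function name and the example allowlist are hypothetical.

```python
def allowed_in_plan_mode(
    tool_name: str,
    plan_allowlist: frozenset[str],
    write_tools: frozenset[str],
) -> bool:
    """A tool runs in plan mode only if it is explicitly allowlisted
    and is not marked as write-capable."""
    return tool_name in plan_allowlist and tool_name not in write_tools


# Hypothetical allowlist and write set, for illustration only.
PLAN_ALLOWLIST = frozenset({"workspace_list", "workspace_read", "checklist_read"})
WRITE_TOOLS = frozenset({"workspace_write", "workspace_edit", "checklist_update"})
```

Under this sketch a read-only tool that nobody added to the allowlist is still rejected, which matches the "only when explicitly allowed" wording.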

Built-In Agent Coordination Toolsets

The current coordination toolsets are workspace, checklist, and workers. They are intentionally local-first and reuse existing runtime primitives.

`workspace`

Implementation:

brain/runtime/workspace_backend.py
brain/runtime/workspace_composite.py
brain/agent/toolsets/workspace.py

Purpose:

give agents stable virtual paths instead of raw host paths
expose the repository as /workspace
expose run-local scratch state as /scratch
support bounded list, read, glob, search, write, and exact-edit operations

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| `workspace_list` | no | Lists direct children under a virtual path. |
| `workspace_read` | no | Reads a numbered line window and advances a cursor on repeated reads. |
| `workspace_search` | no | Regex search with optional glob, context lines, case folding, and result caps. |
| `workspace_glob` | no | Finds files or directories by glob pattern. |
| `workspace_write` | yes | Writes in `overwrite`, `append`, or `create` mode. |
| `workspace_edit` | yes | Performs exact replacement or inserts before a 1-based line. |
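
The numbered window and cursor behavior of `workspace_read` can be illustrated with a small helper. This is a sketch of the idea only; the function name, the `N: text` output format, and the 0-based cursor are assumptions rather than the tool's real output.

```python
def read_window(lines: list[str], cursor: int, limit: int) -> tuple[str, int]:
    """Return (numbered_text, next_cursor).

    `cursor` is a 0-based index into `lines`; displayed numbers are 1-based.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    window = lines[cursor:cursor + limit]
    numbered = "\n".join(
        f"{cursor + offset + 1}: {line}" for offset, line in enumerate(window)
    )
    # Advancing by the number of lines actually returned makes a repeated
    # read continue where the previous one stopped.
    return numbered, cursor + len(window)
```

A caller that stores the returned cursor per path gets the "advance on repeated reads" behavior without tracking line numbers itself.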

Path handling is virtual-root based. /workspace routes to the configured repo root, /scratch routes to state-backed scratch storage, and relative paths default to /workspace. Paths that escape a route are rejected before reaching the filesystem.
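
The routing rule can be sketched as below. It is an illustration under assumptions, not the code in brain/runtime/workspace_backend.py: `resolve_virtual`, `PathEscapeError`, and the example roots are hypothetical, and `/scratch` is mapped to a plain directory here even though the real scratch storage is state-backed. Symlink handling is out of scope for the sketch.

```python
import posixpath
from pathlib import PurePosixPath


class PathEscapeError(ValueError):
    """Raised when a virtual path does not stay inside a known route."""


def resolve_virtual(path: str, routes: dict[str, PurePosixPath]) -> PurePosixPath:
    # Relative paths default to the /workspace route.
    if not path.startswith("/"):
        path = posixpath.join("/workspace", path)
    # Collapse "." and ".." segments before matching a route, so an escape
    # attempt shows up as a path outside every route prefix.
    normalized = posixpath.normpath(path)
    for prefix, root in routes.items():
        if normalized == prefix:
            return root
        if normalized.startswith(prefix + "/"):
            return root / normalized[len(prefix) + 1:]
    raise PathEscapeError(f"path is outside all virtual roots: {path!r}")


# Hypothetical host locations, for illustration only.
ROUTES = {
    "/workspace": PurePosixPath("/srv/repo"),
    "/scratch": PurePosixPath("/var/run-state/scratch"),
}
```

Normalizing first and matching second means `/workspace/../etc/passwd` is rejected before any host path is built.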

`checklist`

Implementation:

brain/agent/toolsets/checklist.py

Purpose:

keep a structured task list attached to the current run or durable session
make progress state inspectable after long turns or worker handoffs
persist updates as session events when a session id is available

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| `checklist_read` | no | Returns items, counts, open-item total, source, and last update metadata. |
| `checklist_update` | yes | Replaces or merges items and stores a `checklist_updated` event. |

Checklist statuses are pending, in_progress, completed, blocked, and cancelled. While work is active, callers should keep at most one item in the in_progress status and reconcile any open items before sending a final response.
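
A merge update and the status guidance can be sketched as follows. The item shape (`id`, `title`, `status`), the function names, and the choice of which statuses count as open are assumptions for illustration; only the five status names come from the text above.

```python
from __future__ import annotations

from collections import Counter
from typing import Any

STATUSES = ("pending", "in_progress", "completed", "blocked", "cancelled")
# Assumption: these statuses count toward the open-item total.
OPEN_STATUSES = frozenset({"pending", "in_progress", "blocked"})


def merge_items(
    current: list[dict[str, Any]], incoming: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Merge by id: incoming fields override, unknown ids are appended."""
    by_id = {item["id"]: dict(item) for item in current}
    for item in incoming:
        if "status" in item and item["status"] not in STATUSES:
            raise ValueError(f"unknown status: {item['status']!r}")
        by_id.setdefault(item["id"], {}).update(item)
    return list(by_id.values())


def summarize(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Count items per status and flag more than one in_progress item."""
    counts = Counter(item["status"] for item in items)
    warnings = []
    if counts["in_progress"] > 1:
        warnings.append("more than one item is in_progress")
    return {
        "counts": {status: counts[status] for status in STATUSES},
        "open_total": sum(counts[status] for status in OPEN_STATUSES),
        "warnings": warnings,
    }
```

The sketch reports a second in_progress item as a warning instead of raising, since the text frames the limit as caller guidance.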

`workers`

Implementation:

brain/agent/toolsets/workers.py
brain/background_sessions/
brain/chat/session_store.py

Purpose:

let a chat turn start long-running local work without blocking the current response
expose durable status, checkpoint, artifact, event, update, and cancellation operations through the normal tool registry

Tools:

| Tool | Side effects | Notes |
| --- | --- | --- |
| `worker_start` | yes | Starts a detached background session and returns its session id. |
| `worker_check` | no | Reads lifecycle status, checkpoints, artifacts, and optional events. |
| `worker_list` | no | Lists sessions by status or active-only filters. |
| `worker_update` | yes | Stores an operator note in metadata, checkpoints, and events. |
| `worker_cancel` | yes | Cancels a session and records the cancellation checkpoint. |

worker_start carries provider, model, working directory, toolset selection, permission mode, approval mode, retrieval depth, retry, expiry, and metadata options into BackgroundSessionStartRequest. The tool always requests detached execution so the parent turn can continue and check back later.
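
The "always detached" rule can be sketched with a stand-in request type. `SketchStartRequest` is not the real BackgroundSessionStartRequest: every field name below, including `prompt`, is a hypothetical placeholder for the options listed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class SketchStartRequest:
    """Hypothetical stand-in; real field names may differ."""
    prompt: str
    provider: Optional[str] = None
    model: Optional[str] = None
    working_directory: Optional[str] = None
    tool_sets: tuple[str, ...] = ()
    permission_mode: Optional[str] = None
    approval_mode: Optional[str] = None
    retrieval_depth: Optional[int] = None
    max_retries: int = 0
    expires_in_seconds: Optional[int] = None
    metadata: dict[str, Any] = field(default_factory=dict)
    detached: bool = True


def build_start_request(args: Mapping[str, Any]) -> SketchStartRequest:
    """Copy recognized tool arguments and force detached execution."""
    known = {f.name for f in fields(SketchStartRequest)} - {"detached"}
    unknown = set(args) - known - {"detached"}
    if unknown:
        raise ValueError(f"unsupported options: {sorted(unknown)}")
    options = {key: value for key, value in args.items() if key in known}
    # Any caller-supplied "detached" value is ignored: the parent turn
    # must be able to continue and check back later.
    return SketchStartRequest(**options, detached=True)
```

Forcing the flag inside the builder, instead of trusting the tool arguments, keeps a model-supplied `detached: false` from blocking the parent turn.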

Operator Checks

Use the tool catalog command to inspect how a toolset is exposed:

sb tools --tool-set workspace --permission-mode ask --json

Use the runtime guard commands when you need a persisted approval baseline instead of the startup-only comparison:

sb runtime-guard tools --surface runtime --web off --json
sb runtime-guard snapshot --surface runtime --web off --approved-by local-operator
sb runtime-guard mismatches --surface runtime --web off --json
sb runtime-guard approve-contract runtime local_search
sb runtime-guard topology --scope subagent --json
sb runtime-guard approve-edge <edge-id>
sb runtime-guard lock subagent

Expected behavior:

read-only workspace and checklist reads are enabled in plan mode
workspace writes, checklist updates, worker starts, worker updates, and worker cancellations are classified as write-capable
worker check/list operations stay read-only
capability metadata includes the tool family and search aliases for discovery
every row includes contract_fingerprint, registered_contract_fingerprint, contract_status, and contract_mismatch so startup-time tool contracts can be compared with their current provider-facing schema
AgentToolRegistry(contract_enforcement="enforce") blocks drifted write-capable, network, destructive, and external-write contracts before the tool function runs; read-only drift remains observable through telemetry
runtime-guard snapshot writes the current contract fingerprints to the runtime database as operator-approved baselines; later runtime-guard mismatches runs report each contract that is new, mismatched, or missing until it is explicitly approved
when AgentToolRegistry is constructed with a contract baseline store and contract_enforcement="enforce", write-capable and high-risk contracts fail closed unless their current fingerprint matches the persisted approved baseline
spawn_subagent can record directed agent -> subagent topology edges; locking the subagent scope makes unapproved child-agent targets fail before the child run is submitted
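
The baseline comparison described above can be sketched as follows. The fingerprint algorithm is an assumption (SHA-256 over canonical JSON), as are the function names and the `match` label; the source only names the `new`, `mismatch`, and `missing` outcomes and the `enforce` setting.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any


def contract_fingerprint(spec: dict[str, Any]) -> str:
    """Hash a provider-facing tool spec in a key-order-independent way."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_contracts(
    current: dict[str, dict[str, Any]], approved: dict[str, str]
) -> dict[str, str]:
    """Classify each tool against the approved baseline fingerprints."""
    report: dict[str, str] = {}
    for name, spec in current.items():
        if name not in approved:
            report[name] = "new"
        elif approved[name] != contract_fingerprint(spec):
            report[name] = "mismatch"
        else:
            report[name] = "match"
    for name in approved:
        if name not in current:
            report[name] = "missing"
    return report


def blocks_call(status: str, high_risk: bool, enforcement: str) -> bool:
    """Fail closed for high-risk tools whose contract is not an approved match.

    Read-only drift is not blocked here; it would only be reported.
    """
    return enforcement == "enforce" and high_risk and status != "match"
```

Sorting keys before hashing keeps the fingerprint stable when a schema is rebuilt with a different key order, so only real contract changes register as drift.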

Test Guidance

Focused tests should cover both behavior and registry exposure:

backend behavior in tests/runtime/
tool output and validation in tests/tools/
plan-mode allowlist behavior in tests/agent/
CLI or catalog exposure when tool names or toolset names change

Run the narrow loop first:

pytest -q tests/runtime/test_workspace_backend.py tests/tools/test_workspace_toolset.py tests/tools/test_checklist_toolset.py tests/tools/test_workers_toolset.py tests/agent/test_plan_mode_allowlist.py

Before pushing Python changes, also run:

ruff format --check .
ruff check .
