Agent Runtime Toolsets¶
Agent runtime toolsets are governed capability bundles exposed through
AgentToolRegistry. They are the heavier extension surface for tools that need
chat runtime integration, permission handling, plan-mode rules, tool-catalog
metadata, or durable session state.
For simple examples and public builder code, prefer the lightweight pattern
tool path described in Add A Tool. Use runtime toolsets only when the tool must
participate in the same safety and observability path as sb chat, workforces,
and background sessions.
Registration Flow¶
Runtime tools are registered from brain/agent/tools.py.
- Add the toolset name to AgentToolRegistry.AVAILABLE_TOOL_SETS.
- Create brain/agent/toolsets/<name>.py.
- Export a register_<name>_tools(registry) function.
- Add tool specs with descriptions, JSON-schema parameters, metadata, and implementation functions.
- Mark side-effecting tools with registry._write_tools.add("<tool_name>").
- Add capability/catalog metadata when the toolset should appear in audit or discovery surfaces.
- Add focused tests under tests/tools/ or the owning component test folder.
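The registration steps can be sketched as a small toolset module. This is a minimal illustration under stated assumptions, not the project's actual code: SketchRegistry is a hypothetical stand-in for AgentToolRegistry, add_tool is an assumed helper name, and the notes toolset with its two tools is invented for the example. Only the register_<name>_tools(registry) naming and the registry._write_tools.add(...) call come from the flow above.

```python
from typing import Any, Callable, Dict, List, Optional, Set

ToolFn = Callable[[Dict[str, Any]], Dict[str, Any]]


class SketchRegistry:
    """Hypothetical stand-in for AgentToolRegistry, reduced to what the steps touch."""

    def __init__(self) -> None:
        self.tools: Dict[str, Dict[str, Any]] = {}
        self._write_tools: Set[str] = set()

    def add_tool(self, name: str, description: str, parameters: Dict[str, Any],
                 fn: ToolFn, metadata: Optional[Dict[str, Any]] = None) -> None:
        # Assumed helper name; the real registry may expose a different method.
        self.tools[name] = {
            "description": description,
            "parameters": parameters,
            "metadata": metadata or {},
            "fn": fn,
        }


_NOTES: List[str] = []  # in-memory state, for the illustration only


def notes_list(args: Dict[str, Any]) -> Dict[str, Any]:
    """Read-only tool: returns a copy of the stored notes."""
    return {"notes": list(_NOTES)}


def notes_append(args: Dict[str, Any]) -> Dict[str, Any]:
    """Side-effecting tool: appends one note and returns the new count."""
    _NOTES.append(args["text"])
    return {"count": len(_NOTES)}


def register_notes_tools(registry: SketchRegistry) -> None:
    """Follows the register_<name>_tools(registry) naming from the steps above."""
    registry.add_tool(
        name="notes_list",
        description="List stored notes.",
        parameters={"type": "object", "properties": {}},
        fn=notes_list,
        metadata={"family": "notes"},
    )
    registry.add_tool(
        name="notes_append",
        description="Append one note.",
        parameters={
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        fn=notes_append,
        metadata={"family": "notes"},
    )
    # Side-effecting tools are marked write-capable, as the flow requires.
    registry._write_tools.add("notes_append")
```

Keeping the write marker next to the tool spec makes it harder to add a side-effecting tool and forget to classify it.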
Read-only tools can run in plan mode only when explicitly allowed by
brain/agent/plan_mode.py. Write-capable tools remain governed by the active
permission mode and approval policy.
Built-In Agent Coordination Toolsets¶
The current coordination toolsets are workspace, checklist, and workers.
They are intentionally local-first and reuse existing runtime primitives.
workspace¶
Implementation:
- brain/runtime/workspace_backend.py
- brain/runtime/workspace_composite.py
- brain/agent/toolsets/workspace.py
Purpose:
- give agents stable virtual paths instead of raw host paths
- expose the repository as /workspace
- expose run-local scratch state as /scratch
- support bounded list, read, glob, search, write, and exact-edit operations
Tools:
| Tool | Side effects | Notes |
|---|---|---|
| workspace_list | no | Lists direct children under a virtual path. |
| workspace_read | no | Reads a numbered line window and advances a cursor on repeated reads. |
| workspace_search | no | Regex search with optional glob, context lines, case folding, and result caps. |
| workspace_glob | no | Finds files or directories by glob pattern. |
| workspace_write | yes | Writes in overwrite, append, or create mode. |
| workspace_edit | yes | Performs exact replacement or inserts before a 1-based line. |
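The read-cursor and edit behaviors in the table can be illustrated with a short sketch. Every name here (LineCursorReader, insert_before_line, replace_exact) is hypothetical, and two behaviors are assumptions rather than documented facts: that the cursor is tracked per path, and that an exact replacement must match exactly once.

```python
from typing import Dict, List, Tuple


class LineCursorReader:
    """Hypothetical reader: returns numbered line windows and remembers where it stopped."""

    def __init__(self, window: int = 3) -> None:
        self.window = window
        self._cursors: Dict[str, int] = {}  # path -> next 1-based line to read

    def read(self, path: str, text: str) -> List[Tuple[int, str]]:
        lines = text.splitlines()
        start = self._cursors.get(path, 1)
        chunk = lines[start - 1 : start - 1 + self.window]
        # Advance the cursor so a repeated read continues after this window.
        self._cursors[path] = start + len(chunk)
        return [(start + offset, line) for offset, line in enumerate(chunk)]


def insert_before_line(text: str, line_no: int, new_text: str) -> str:
    """Insert new_text before the given 1-based line; len(lines) + 1 appends."""
    lines = text.splitlines()
    if not 1 <= line_no <= len(lines) + 1:
        raise ValueError(f"line {line_no} out of range 1..{len(lines) + 1}")
    lines.insert(line_no - 1, new_text)
    return "\n".join(lines) + "\n"


def replace_exact(text: str, old: str, new: str) -> str:
    """Replace old with new, assuming the edit must match exactly once."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(old, new, 1)
```

Refusing ambiguous matches is one way to keep an exact edit from silently changing the wrong occurrence.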
Path handling is virtual-root based. /workspace routes to the configured repo
root, /scratch routes to state-backed scratch storage, and relative paths
default to /workspace. Paths that escape a route are rejected before reaching
the filesystem.
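A minimal sketch of the routing rule above, assuming both routes map to directory-like roots and that normalization is purely lexical. The function name resolve_virtual and the example roots are hypothetical, and the real backend may add checks this sketch omits (symlink handling, for instance).

```python
import posixpath
from pathlib import PurePosixPath
from typing import Dict, Tuple


def resolve_virtual(path: str, routes: Dict[str, PurePosixPath]) -> Tuple[str, PurePosixPath]:
    """Map a virtual path to (route, target), rejecting paths that leave every route."""
    if not path.startswith("/"):
        # Relative paths default to /workspace.
        path = posixpath.join("/workspace", path)
    # Collapse "." and ".." lexically before any route matching happens.
    normalized = posixpath.normpath(path)
    for route, root in routes.items():
        if normalized == route or normalized.startswith(route + "/"):
            relative = normalized[len(route):].lstrip("/")
            target = root / relative if relative else root
            return route, target
    raise ValueError(f"path escapes every virtual route: {path!r}")


# Hypothetical roots, chosen only for the example.
ROUTES = {
    "/workspace": PurePosixPath("/srv/repo"),
    "/scratch": PurePosixPath("/srv/state/scratch"),
}
```

Because matching happens after normalization, a path such as /workspace/../etc/passwd collapses to /etc/passwd, matches no route, and is rejected without touching the filesystem.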
checklist¶
Implementation:
brain/agent/toolsets/checklist.py
Purpose:
- keep a structured task list attached to the current run or durable session
- make progress state inspectable after long turns or worker handoffs
- persist updates as session events when a session id is available
Tools:
| Tool | Side effects | Notes |
|---|---|---|
| checklist_read | no | Returns items, counts, open-item total, source, and last update metadata. |
| checklist_update | yes | Replaces or merges items and stores a checklist_updated event. |
Checklist statuses are pending, in_progress, completed, blocked, and
cancelled. While work is active, callers should keep at most one item in
in_progress and reconcile open items before a final response.
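The status conventions can be sketched as a merge plus a convention check. The item fields (id, title, status), the function names, and the choice of which statuses count as open are assumptions for illustration; the status names and the one-in_progress guideline come from the text above.

```python
from typing import Dict, List

STATUSES = ("pending", "in_progress", "completed", "blocked", "cancelled")
# Assumption: these are the statuses that still need attention before a final response.
OPEN_STATUSES = ("pending", "in_progress", "blocked")

Item = Dict[str, str]


def merge_items(current: List[Item], updates: List[Item]) -> List[Item]:
    """Merge updates into current by id; existing order is kept and new ids append."""
    merged: Dict[str, Item] = {item["id"]: dict(item) for item in current}
    for item in updates:
        merged.setdefault(item["id"], {}).update(item)
    return list(merged.values())


def check_conventions(items: List[Item]) -> List[str]:
    """Return human-readable problems instead of raising, since the rule is a guideline."""
    problems = []
    for item in items:
        if item.get("status") not in STATUSES:
            problems.append(f"item {item.get('id')}: unknown status {item.get('status')!r}")
    active = [item["id"] for item in items if item.get("status") == "in_progress"]
    if len(active) > 1:
        problems.append(f"more than one item in_progress: {active}")
    return problems


def open_items(items: List[Item]) -> List[Item]:
    """Items that should be reconciled before the final response."""
    return [item for item in items if item.get("status") in OPEN_STATUSES]
```

A caller would typically run check_conventions after each update and open_items before composing the final response.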
workers¶
Implementation:
- brain/agent/toolsets/workers.py
- brain/background_sessions/
- brain/chat/session_store.py
Purpose:
- let a chat turn start long-running local work without blocking the current response
- expose durable status, checkpoint, artifact, event, update, and cancellation operations through the normal tool registry
Tools:
| Tool | Side effects | Notes |
|---|---|---|
| worker_start | yes | Starts a detached background session and returns its session id. |
| worker_check | no | Reads lifecycle status, checkpoints, artifacts, and optional events. |
| worker_list | no | Lists sessions by status or active-only filters. |
| worker_update | yes | Stores an operator note in metadata, checkpoints, and events. |
| worker_cancel | yes | Cancels a session and records the cancellation checkpoint. |
worker_start carries provider, model, working directory, toolset selection,
permission mode, approval mode, retrieval depth, retry, expiry, and metadata
options into BackgroundSessionStartRequest. The tool always requests detached
execution so the parent turn can continue and check back later.
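The mapping from tool arguments to a start request might look like the sketch below. SketchStartRequest is a hypothetical stand-in for BackgroundSessionStartRequest: its field names, defaults, and the prompt field are assumptions, since this page lists only the option categories. The one behavior taken from the text is that detached execution is always requested.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class SketchStartRequest:
    """Hypothetical stand-in for BackgroundSessionStartRequest; field names are assumed."""
    prompt: str
    provider: Optional[str] = None
    model: Optional[str] = None
    working_directory: Optional[str] = None
    toolsets: List[str] = field(default_factory=list)
    permission_mode: Optional[str] = None
    approval_mode: Optional[str] = None
    retrieval_depth: Optional[int] = None
    max_retries: int = 0
    expires_in_seconds: Optional[int] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    detached: bool = True


def build_start_request(args: Dict[str, Any]) -> SketchStartRequest:
    """Copy recognised tool arguments into the request and force detached execution."""
    allowed = {f.name for f in fields(SketchStartRequest)} - {"detached"}
    unknown = set(args) - allowed
    if unknown:
        # Callers cannot override detached; it is rejected like any unknown key.
        raise ValueError(f"unknown worker_start arguments: {sorted(unknown)}")
    return SketchStartRequest(**args, detached=True)
```

Rejecting detached as a caller-supplied argument keeps the guarantee that the parent turn is never blocked by a worker.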
Operator Checks¶
Use the tool catalog command to inspect how a toolset is exposed:
Use the runtime guard commands when you need a persisted approval baseline instead of the startup-only comparison:
sb runtime-guard tools --surface runtime --web off --json
sb runtime-guard snapshot --surface runtime --web off --approved-by local-operator
sb runtime-guard mismatches --surface runtime --web off --json
sb runtime-guard approve-contract runtime local_search
sb runtime-guard topology --scope subagent --json
sb runtime-guard approve-edge <edge-id>
sb runtime-guard lock subagent
Expected behavior:
- read-only workspace and checklist reads are enabled in plan mode
- workspace writes, checklist updates, worker starts, worker updates, and worker cancellations are classified as write-capable
- worker check/list operations stay read-only
- capability metadata includes the tool family and search aliases for discovery
- every row includes contract_fingerprint, registered_contract_fingerprint, contract_status, and contract_mismatch so startup-time tool contracts can be compared with their current provider-facing schema
- AgentToolRegistry(contract_enforcement="enforce") blocks drifted write-capable, network, destructive, and external-write contracts before the tool function runs; read-only drift remains observable through telemetry
- runtime-guard snapshot writes the current contract fingerprints as operator-approved baselines in the runtime database; future mismatches runs report new, mismatch, or missing contracts until explicitly approved
- when AgentToolRegistry is constructed with a contract baseline store and contract_enforcement="enforce", write-capable and high-risk contracts fail closed unless their current fingerprint matches the persisted approved baseline
- spawn_subagent can record directed agent -> subagent topology edges; locking the subagent scope makes unapproved child-agent targets fail before the child run is submitted
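The baseline comparison described above can be sketched as follows. This is an illustration under assumptions, not the runtime's algorithm: hashing canonical JSON with SHA-256 is one plausible fingerprint scheme, the ok status is added for completeness (the text names only new, mismatch, and missing), and may_run models only the write-capable class of the high-risk categories listed.

```python
import hashlib
import json
from typing import Any, Dict, Set


def contract_fingerprint(spec: Dict[str, Any]) -> str:
    """Hash a canonical JSON rendering so key order does not change the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(current: Dict[str, str], approved: Dict[str, str]) -> Dict[str, str]:
    """Compare current fingerprints with approved baselines, tool by tool."""
    report: Dict[str, str] = {}
    for name, fingerprint in current.items():
        if name not in approved:
            report[name] = "new"
        elif approved[name] != fingerprint:
            report[name] = "mismatch"
        else:
            report[name] = "ok"
    for name in approved:
        if name not in current:
            report[name] = "missing"
    return report


def may_run(name: str, status: str, write_tools: Set[str], enforcement: str) -> bool:
    """Under enforce, drifted write-capable tools fail closed; read-only drift is only observed."""
    if enforcement != "enforce" or status == "ok":
        return True
    return name not in write_tools
```

Canonical serialization matters here: without sorted keys, two semantically identical schemas could produce different fingerprints and trigger spurious mismatches.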
Test Guidance¶
Focused tests should cover both behavior and registry exposure:
- backend behavior in tests/runtime/
- tool output and validation in tests/tools/
- plan-mode allowlist behavior in tests/agent/
- CLI or catalog exposure when tool names or toolset names change
Run the narrow loop first:
pytest -q tests/runtime/test_workspace_backend.py tests/tools/test_workspace_toolset.py tests/tools/test_checklist_toolset.py tests/tools/test_workers_toolset.py tests/agent/test_plan_mode_allowlist.py
Before pushing Python changes, also run: