Governed Action Runtime Improvement Plan
Purpose
SecondBrain already has strong local-first building blocks: kernel tool policies, approval gates, agent identity, MCP governance, background-session health, event logs, and privacy audits. The next improvement is to turn those pieces into a single governed action runtime: the model can reason freely, but every side-effecting action must pass a deterministic runtime boundary before it executes, and every result must be safe to re-enter context or memory before the model sees it again.
The goal is not a new agent loop. The goal is a shared enforcement layer that existing loops can call from the kernel, chat harness, gateway runtime, background sessions, MCP adapters, and connector-backed tools.
Design Principles
- Keep governance deterministic. Tool routing, fingerprints, topology locks, scopes, risk thresholds, and quarantine decisions must be inspectable policy, not model judgment.
- Extend existing surfaces first: RunContext, ToolSpec, ToolPolicy, ToolExecutor, AgentToolRegistry, MCP governance, approval stores, and background-session health.
- Start with the highest-risk tools, then expand to read tools only after the enforcement path is proven.
- Preserve local-first operation. No external control plane dependency is required for the baseline.
- Fail closed for changed high-risk contracts when enforcement is enabled, and fail visibly in observe mode.
- Keep naming SecondBrain-native in code, docs, migrations, tests, and fixtures.
Current Assets To Reuse
| Area | Existing surface |
|---|---|
| Kernel tool policy | brain/kernel/tooling/permissions.py |
| Kernel execution boundary | brain/kernel/tooling/executor.py |
| Tool contracts | brain/kernel/contracts.py::ToolSpec |
| Runtime toolsets | brain/agent/tools.py, brain/agent/toolsets/ |
| MCP governance | brain/mcp/, brain/adapters/mcp/, brain/agent/toolsets/mcp.py |
| Agent identity and scopes | brain/agent_identity/, docs/agent_identity_design.md |
| Gateway approvals | brain/gateway/, brain/agent_runtime/policy.py |
| Background health | brain/background_sessions/health_policy.py |
| Event evidence | brain/state/event_log.py, docs/explanation/observability.md |
| Policy explanation | docs/explanation/policy-and-approvals.md |
Gap Summary
| Gap | Impact |
|---|---|
| Tool registries are layered but not contract-fingerprinted consistently. | A changed tool description or schema can be hard to distinguish from a normal registration. |
| Agent identity scope enforcement is still opt-in across many entrypoints. | A side-effecting path may rely on approval mode without also proving per-agent authority. |
| Delegation and worker relationships are observable in places, but not governed as an approved topology. | Agents can form new call paths without an explicit operator-reviewed edge. |
| Tool outputs are not uniformly scanned before returning to model context or memory. | Secrets, PII, or injected instructions can re-enter the agent loop. |
| Risk signals exist across health snapshots and functional state, but there is no unified per-agent trust score. | Operators see warnings, but policy cannot consistently degrade, pause, or quarantine. |
| Revocation exists for manifests and approvals, but not as a runtime cascade across active sessions and descendants. | A compromised session may require manual cleanup across multiple stores. |
Target Architecture

    Agent / runtime loop
      -> ActionGuard.before_tool_call()
           - verify agent identity and scopes
           - verify tool contract fingerprint
           - verify approved topology edge
           - check approval, risk, budget, rate, and policy
      -> tool execution
      -> ActionGuard.after_tool_response()
           - scan output before context re-entry
           - update risk score and event evidence
           - apply quarantine or degradation if needed
      -> model context / memory / operator trace

The implementation should be additive. Early phases can run in observe mode and only block the highest-risk tools once the operator has approved the baseline.
Core Building Blocks
1. Tool Contract Fingerprints
Add a stable fingerprint for each registered tool contract:
- inputs: tool id, name, description, safety class, argument schema, return schema, required permissions, required identity scopes, and risk metadata
- output: deterministic hash stored in registry snapshots and tool-call events
- behavior: observe mismatches first, then deny high-risk changed contracts unless an operator approves the new contract
Implementation targets:
- brain/kernel/contracts.py
- brain/kernel/tooling/registry.py
- brain/agent/tools.py
- brain/agent/toolsets/mcp.py
- brain/mcp/discovery.py and discovery cache code
- brain/state/event_log.py metadata fields
Tests:
- deterministic fingerprint ignores dictionary ordering
- description/schema/scope changes alter the fingerprint
- high-risk mismatch is denied in enforce mode
- observe mode logs mismatch without blocking
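
The fingerprint described above can be sketched as a hash over a canonical JSON encoding. This is a minimal illustration, not the planned `ToolSpec` helper: the `contract_fingerprint` name, the choice of SHA-256, and the example field names are assumptions made for the sketch.

```python
import hashlib
import json


def contract_fingerprint(contract: dict) -> str:
    """Return a stable SHA-256 hex digest for a tool contract mapping."""
    # sort_keys makes the encoding independent of dict insertion order;
    # compact separators remove whitespace variation.
    canonical = json.dumps(
        contract, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical contract; the keys loosely mirror the inputs listed above.
spec_a = {
    "tool_id": "planner.delete_task",
    "description": "Delete a planner task",
    "safety_class": "destructive",
    "argument_schema": {
        "type": "object",
        "properties": {"task_id": {"type": "string"}},
    },
    "required_scopes": ["planner:write"],
}
# Same content with the top-level keys in reverse order.
spec_b = dict(reversed(list(spec_a.items())))
# Same tool with a changed description.
spec_c = {**spec_a, "description": "Delete any planner task"}

print(contract_fingerprint(spec_a) == contract_fingerprint(spec_b))  # True
print(contract_fingerprint(spec_a) != contract_fingerprint(spec_c))  # True
```

Note that list values such as scopes stay order-sensitive in this sketch; a real helper would probably sort or otherwise normalize them before hashing.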
2. Shared Action Guard
Introduce one small orchestration object around existing policies rather than moving all logic into a new layer.
Suggested shape:

    class ActionGuard:
        def before_tool_call(self, ctx, spec, args, *, call_chain): ...
        def after_tool_response(self, ctx, spec, result, *, call_chain): ...

The guard should call existing policy and approval components first. It should only own cross-cutting checks that span tool registries:
- contract fingerprint verification
- agent identity scope verification
- topology edge verification
- output re-entry scanning
- risk-score update
- quarantine check
Implementation targets:
- new brain/runtime/action_guard.py or brain/kernel/tooling/action_guard.py
- brain/kernel/tooling/executor.py
- brain/agent/tool_executor_v2.py
- brain/agent_runtime/tools.py
- brain/mcp/executor.py
Tests:
- kernel executor still preserves current allow, deny, approval, and budget behavior
- guard failures emit structured event metadata
- post-response scanner can redact or block before model re-entry
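
One way the suggested shape could be filled in is to treat each cross-cutting check as a plain callable that returns a denial reason or `None`. The sketch below is only an assumption about structure: `GuardDecision`, the `checks` and `scanner` parameters, the dict-based `ctx` and `spec`, and `scope_check` are all hypothetical and are not existing SecondBrain types.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A check returns a denial reason, or None when it has no objection.
Check = Callable[[dict, dict, dict, tuple], Optional[str]]


@dataclass
class GuardDecision:
    allowed: bool
    reasons: list = field(default_factory=list)


class ActionGuard:
    def __init__(self, checks, scanner=None, enforce=False):
        self.checks = list(checks)
        self.scanner = scanner
        self.enforce = enforce

    def before_tool_call(self, ctx, spec, args, *, call_chain):
        reasons = []
        for check in self.checks:
            reason = check(ctx, spec, args, call_chain)
            if reason is not None:
                reasons.append(reason)
        # Observe mode reports reasons but still allows the call.
        allowed = not (reasons and self.enforce)
        return GuardDecision(allowed=allowed, reasons=reasons)

    def after_tool_response(self, ctx, spec, result, *, call_chain):
        # Hand the result to the re-entry scanner when one is configured.
        return self.scanner(result) if self.scanner else result


def scope_check(ctx, spec, args, call_chain):
    """Hypothetical identity-scope check over plain dicts."""
    missing = set(spec.get("required_scopes", ())) - set(ctx.get("scopes", ()))
    return f"missing scopes: {sorted(missing)}" if missing else None


ctx = {"scopes": ["planner:read"]}
spec = {"tool_id": "planner.delete_task", "required_scopes": ["planner:write"]}

observed = ActionGuard([scope_check], enforce=False).before_tool_call(
    ctx, spec, {}, call_chain=("root",)
)
enforced = ActionGuard([scope_check], enforce=True).before_tool_call(
    ctx, spec, {}, call_chain=("root",)
)
print(observed.allowed, enforced.allowed)  # True False
```

Keeping the guard as a thin loop over injected checks matches the intent above: existing policy and approval components stay where they are, and the guard only sequences the checks that span registries.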
3. Approved Delegation Topology
Model runtime relationships as directed edges:
- agent -> tool
- agent -> subagent
- agent -> worker session
- agent -> MCP server/tool
- automation -> agent
Each edge has a lifecycle:
- observed: discovered and logged, no enforcement
- approved: accepted by operator or policy
- locked: only approved edges are allowed
Implementation targets:
- new runtime DB tables for topology edges and lock state
- CLI: sb runtime-guard topology|approve-edge|lock|unlock
- UI later: operator view beside background sessions and approvals
- first integration: spawn_subagent records agent -> subagent edges, and locked mode denies unapproved child-agent targets
- later integrations: worker_start, local A2A dispatch, MCP tool calls, and background-session spawning
Tests:
- direction matters: A -> B does not imply B -> A
- locked mode denies new unapproved edges
- observe mode records but allows
- approved edge survives restart
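
The observed/approved/locked lifecycle can be illustrated with an in-memory edge map keyed by `(source, target)` tuples, which makes direction explicit. `TopologyGuard` and `EdgeState` are hypothetical names for this sketch, and persistence across restarts (which the plan assigns to runtime DB tables) is deliberately left out.

```python
from enum import Enum


class EdgeState(Enum):
    OBSERVED = "observed"
    APPROVED = "approved"


class TopologyGuard:
    def __init__(self):
        # Keys are ordered (source, target) pairs, so A -> B and B -> A differ.
        self.edges = {}
        self.locked = False

    def approve(self, source, target):
        self.edges[(source, target)] = EdgeState.APPROVED

    def check(self, source, target):
        """Record the edge if it is new, then decide whether the call may proceed."""
        key = (source, target)
        self.edges.setdefault(key, EdgeState.OBSERVED)
        if self.edges[key] is EdgeState.APPROVED:
            return True
        # Unapproved edges are allowed only while the topology is unlocked.
        return not self.locked


topo = TopologyGuard()
topo.approve("agent:a", "agent:b")
print(topo.check("agent:a", "agent:c"))  # True: observe mode records but allows

topo.locked = True
print(topo.check("agent:a", "agent:b"))  # True: approved edge
print(topo.check("agent:b", "agent:a"))  # False: the reverse edge was never approved
```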
4. Per-Call Capability Claims
Use existing IdentityClaim and scope narrowing as the foundation. Add a short-lived per-call authorization proof for side-effecting actions:
- bound to run_id, trace_id, span_id, tool id, contract fingerprint, scopes, approval id if any, and expiry
- nonces or call ids prevent replay inside a run
- child sessions receive narrowed claims, never expanded claims
Implementation targets:
- brain/agent_identity/claim.py
- brain/agent_identity/runtime.py
- brain/kernel/run_context.py
- brain/kernel/tooling/permissions.py
- gateway and background-session start paths
Tests:
- expired call proof fails
- tool id mismatch fails
- contract fingerprint mismatch fails
- approval does not add missing identity scopes
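
A per-call proof of this kind could be built as a keyed MAC over the bound fields. The sketch below assumes HMAC-SHA256 and a flat claims dict purely for illustration; it does not describe how `IdentityClaim` is actually signed, and `sign_call_proof`, `verify_call_proof`, and the field names are hypothetical.

```python
import hashlib
import hmac
import json
import time


def sign_call_proof(key: bytes, claims: dict) -> str:
    """Sign a canonical JSON encoding of the claims with HMAC-SHA256."""
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_call_proof(key, claims, signature, *, tool_id, fingerprint, seen_nonces, now=None):
    """Return None when the proof is valid, otherwise a denial reason."""
    now = time.time() if now is None else now
    if not hmac.compare_digest(sign_call_proof(key, claims), signature):
        return "bad signature"
    if now >= claims["expires_at"]:
        return "expired"
    if claims["tool_id"] != tool_id:
        return "tool id mismatch"
    if claims["fingerprint"] != fingerprint:
        return "contract fingerprint mismatch"
    if claims["nonce"] in seen_nonces:
        return "replayed nonce"
    seen_nonces.add(claims["nonce"])  # remember the nonce for this run
    return None


key = b"demo-key-not-for-production"
claims = {
    "run_id": "run-1",
    "tool_id": "planner.delete_task",
    "fingerprint": "abc123",
    "scopes": ["planner:write"],
    "nonce": "call-0001",
    "expires_at": 1000.0,
}
sig = sign_call_proof(key, claims)
seen = set()

first = verify_call_proof(key, claims, sig, tool_id="planner.delete_task",
                          fingerprint="abc123", seen_nonces=seen, now=999.0)
second = verify_call_proof(key, claims, sig, tool_id="planner.delete_task",
                           fingerprint="abc123", seen_nonces=seen, now=999.0)
print(first, second)  # None replayed nonce
```

Scope checks are intentionally not part of this verifier. In line with the tests above, an approval id inside the claim should never widen the scopes the identity layer grants.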
5. Context Re-Entry Scanner
Every tool result that may be shown to a model or written to memory should pass through a response inspection step.
Minimum scanner behavior:
- detect likely secrets, tokens, private keys, and credential-shaped text
- detect PII classes already represented in privacy audit or memory review
- detect tool-result prompt injection patterns
- emit redaction metadata and evidence refs
- block only high-confidence sensitive data in early enforcement mode
Implementation targets:
- brain/security/content_safety.py
- brain/agent/tool_outcomes.py
- brain/kernel/tooling/executor.py
- memory proposal and review queue paths
- MCP resource and prompt fetch paths
Tests:
- secret-shaped output is redacted before model context
- blocked output records an inspectable event
- memory ingestion receives scanner metadata
- low-confidence findings warn without losing normal output
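
The redaction path can be sketched with a small set of regular expressions that replace matches and return findings as metadata. The patterns and the `scan_for_reentry` name are illustrative assumptions; a real scanner would reuse the detectors already in the content-safety and privacy-audit code, and would also cover PII and injection patterns.

```python
import re

# Illustrative high-confidence patterns only; not an exhaustive secret detector.
_PATTERNS = {
    "private_key": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._\-]{20,}"),
}


def scan_for_reentry(text: str):
    """Return (redacted_text, findings) for a tool result."""
    findings = []
    redacted = text
    for kind, pattern in _PATTERNS.items():
        redacted, count = pattern.subn(f"[REDACTED:{kind}]", redacted)
        if count:
            findings.append({"kind": kind, "count": count})
    return redacted, findings


clean, clean_findings = scan_for_reentry("3 tasks updated")
leaky, leaky_findings = scan_for_reentry("key=AKIAIOSFODNN7EXAMPLE")
print(clean, clean_findings)
print(leaky, leaky_findings)
```

Returning findings alongside the redacted text lets the caller attach scanner metadata to events and memory proposals without changing normal low-risk output.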
6. Agent Trust Score
Unify existing risk signals into a per-agent/session trust score with decay.
Inputs:
- denied tool calls
- repeated approval requests
- contract mismatches
- topology violations
- scanner findings
- repeated failed tools
- budget pressure and health-policy breaches
- functional-state risk signals
Actions:
- warn
- require approval for next privileged action
- degrade to read-only
- pause background session
- quarantine session and descendants
Implementation targets:
- brain/background_sessions/health_policy.py
- brain/antahkarana/functional_state/
- brain/state/event_log.py
- new risk aggregation helper under brain/runtime/ or brain/background_sessions/
Tests:
- score decays over time
- repeated high-risk events cross thresholds deterministically
- read-only degradation preserves diagnostic tools
- thresholds are configurable and visible in audit output
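
Decay and threshold crossing can both be made deterministic by passing the clock in explicitly. The sketch below models the score as accumulated risk with exponential half-life decay; the `TrustScore` class, the half-life, the event weights, and the threshold values are all hypothetical and chosen only for illustration.

```python
class TrustScore:
    """Accumulated risk that halves every `half_life_s` seconds."""

    def __init__(self, half_life_s: float = 3600.0):
        self.half_life_s = half_life_s
        self.risk = 0.0
        self.updated_at = 0.0

    def _decay_to(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated_at)
        self.risk *= 0.5 ** (elapsed / self.half_life_s)
        self.updated_at = now

    def record(self, weight: float, now: float) -> float:
        """Decay to `now`, then add the weight of a new risk event."""
        self._decay_to(now)
        self.risk += weight
        return self.risk

    def action(self, now: float, thresholds) -> str:
        """Return the action of the highest threshold the decayed risk has reached."""
        self._decay_to(now)
        chosen = "none"
        for limit, name in sorted(thresholds):
            if self.risk >= limit:
                chosen = name
        return chosen


# Hypothetical thresholds mirroring the action ladder above.
THRESHOLDS = [
    (1.0, "warn"),
    (3.0, "require_approval"),
    (5.0, "read_only"),
    (8.0, "pause"),
    (12.0, "quarantine"),
]

score = TrustScore(half_life_s=100.0)
score.record(4.0, now=0.0)                      # e.g. a contract mismatch
print(score.action(now=0.0, thresholds=THRESHOLDS))    # require_approval
print(score.action(now=100.0, thresholds=THRESHOLDS))  # warn, after one half-life
```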
7. Quarantine And Revocation Cascade
Add an operator action and policy action that can stop a compromised runtime branch:
- revoke active session authority
- pause or cancel descendant worker/background sessions
- reject future tool calls with the revoked claim or session id
- keep read-only inspection available for audit
- record a single cascade event with affected descendants
Implementation targets:
- brain/agent_identity/
- brain/background_sessions/store.py
- brain/background_sessions/runtime.py
- brain/agent_runtime/policy.py
- brain/gateway/
- CLI: sb agent-identity quarantine or sb sessions quarantine
Tests:
- quarantined parent prevents child tool calls
- read-only trace inspection still works
- cascade is idempotent
- restart preserves quarantine state
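
The cascade itself is a graph walk over session parentage. A minimal sketch, assuming the parent/child relationships are available as a plain mapping and the quarantine set is held in memory (a real implementation would persist both so quarantine state survives a restart); `quarantine_cascade` is a hypothetical name:

```python
from collections import deque


def quarantine_cascade(root, children, quarantined):
    """Quarantine `root` and every descendant; return the newly affected ids."""
    affected = []
    visited = set()
    queue = deque([root])
    while queue:
        session_id = queue.popleft()
        if session_id in visited:
            continue  # guard against cycles or shared descendants
        visited.add(session_id)
        if session_id not in quarantined:
            quarantined.add(session_id)
            affected.append(session_id)
        queue.extend(children.get(session_id, ()))
    return affected


# Hypothetical session tree: root spawned w1 and w2, and w1 spawned w3.
children = {"root": ["w1", "w2"], "w1": ["w3"]}
quarantined = set()

first_run = quarantine_cascade("root", children, quarantined)
second_run = quarantine_cascade("root", children, quarantined)
print(first_run)   # ['root', 'w1', 'w2', 'w3']
print(second_run)  # [] because the cascade is idempotent
```

The returned list is what a single cascade event could record as its affected descendants.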
First Vertical Slice
Ship the smallest end-to-end path around a high-risk local target:
- Add deterministic tool contract fingerprints to kernel ToolSpec.
- Log the fingerprint on every kernel tool call.
- Add observe/enforce modes to the kernel executor for fingerprint mismatches.
- Enable enforcement for one destructive or external-write tool class only.
- Add a focused CLI/audit view showing current fingerprints and mismatches.
- Add tests proving existing approval behavior is unchanged.
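
The observe/enforce split in this slice reduces to one small decision function. The sketch below assumes an in-memory baseline mapping of tool id to approved fingerprint; `Mode`, `check_contract`, and the event fields are hypothetical names, and a missing baseline entry is treated as a mismatch.

```python
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"
    ENFORCE = "enforce"


def check_contract(tool_id, current_fp, baseline, *, mode, high_risk):
    """Return (allowed, event) for a tool call given the approved baseline."""
    approved_fp = baseline.get(tool_id)  # None when no baseline was approved
    mismatch = approved_fp != current_fp
    # Only high-risk tools are blocked, and only when enforcement is on.
    denied = mismatch and high_risk and mode is Mode.ENFORCE
    event = {
        "tool_id": tool_id,
        "fingerprint": current_fp,
        "approved_fingerprint": approved_fp,
        "mismatch": mismatch,
        "mode": mode.value,
        "decision": "deny" if denied else "allow",
    }
    return (not denied), event


baseline = {"planner.delete_task": "abc"}

allowed, event = check_contract(
    "planner.delete_task", "xyz", baseline, mode=Mode.OBSERVE, high_risk=True
)
print(allowed, event["mismatch"])  # True True: observed and logged, not blocked

allowed, event = check_contract(
    "planner.delete_task", "xyz", baseline, mode=Mode.ENFORCE, high_risk=True
)
print(allowed, event["decision"])  # False deny
```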
Candidate high-risk tools:
- external message send tools
- planner task create/update/delete
- mcp_call_tool
- openapi_invoke
- http_profile_request
- file write/edit/delete surfaces
- background run_command
- travel booking or cancellation actions
Phased Delivery Plan
Phase 0 — Inventory And Baseline
Deliverables:
- inventory all tool registries and high-risk tools
- classify tool risk tiers consistently across kernel, runtime toolsets, MCP, connectors, and background sessions
- add a docs table mapping risk tiers to approval, identity, and scanner expectations
Verification:
- pytest -q tests/kernel/test_kernel_tooling.py
- targeted MCP/toolset catalog tests
- sb privacy audit --json
Phase 1 — Contract Fingerprints
Deliverables:
- ToolSpec fingerprint helper
- fingerprint metadata in registry listings and tool-call events
- observe-mode mismatch events
- enforce-mode block for selected high-risk tools
- persisted runtime-database baselines for operator-approved tool contracts
- sb runtime-guard tools|mismatches|snapshot|approve-contract for current contract inspection and approval
- optional execution enforcement that denies high-risk contracts when the persisted approved baseline is missing, changed, or unavailable
Verification:
- new kernel tests for fingerprint determinism and mismatch behavior
- existing kernel tooling tests unchanged
- runtime baseline store tests and runtime-guard CLI integration tests
Phase 2 — Side-Effect Action Guard
Deliverables:
- shared ActionGuard wrapper around existing policy decisions
- kernel executor integration
- one agent toolset integration
- one MCP tool-call integration
Verification:
- approval, denial, and budget tests continue to pass
- new tests cover guard metadata and failure surfaces
Phase 3 — Identity Required For High-Risk Tools
Deliverables:
- default identity minting for selected entrypoints that call external-write or destructive tools
- identity_required=True on high-risk execution paths
- per-call proof design using existing claim primitives
Verification:
- missing scopes deny high-risk calls
- approval does not override missing identity scope
- child claims are narrowed across worker/subagent boundaries
Phase 4 — Observed, Approved, Locked Topology
Deliverables:
- topology edge store
- observe/approve/lock CLI
- edge recording for worker starts, local A2A, MCP calls, and selected toolsets
- locked-mode enforcement
Verification:
- directed edge tests
- restart persistence tests
- audit output shows unapproved observed edges
Phase 5 — Context Re-Entry Scanner
Deliverables:
- scanner integrated after tool responses and before memory proposals
- redaction/block metadata in events
- scanner findings reflected in health snapshots
Verification:
- secret and injection fixtures are redacted or blocked
- memory review receives scanner evidence
- normal low-risk outputs are unchanged
Phase 6 — Trust Score And Quarantine
Deliverables:
- per-agent/session score with decay
- thresholds connected to warn, approval-required, read-only degrade, pause, and quarantine actions
- cascade revocation for child sessions
- operator inspection command
Verification:
- deterministic score tests
- quarantine idempotency tests
- background-session health tests
- gateway/session restart tests
Operator UX
Minimum CLI surface:

    sb runtime-guard tools
    sb runtime-guard mismatches
    sb runtime-guard snapshot
    sb runtime-guard approve-contract <surface> <tool-name>
    sb runtime-guard topology
    sb runtime-guard approve-edge <edge-id>
    sb runtime-guard lock <scope>
    sb runtime-guard unlock <scope>
    sb runtime-guard risk <session-id>
    sb sessions quarantine <session-id>

The UI can come after the data model stabilizes. It should reuse the existing Agent Cockpit, approvals, sessions, and health surfaces rather than adding a separate control plane.
Testing Strategy
Add tests at the lowest layer that owns each behavior:
- kernel: fingerprints, identity scope checks, approval preservation
- agent toolsets: catalog exposure and side-effect classification
- MCP: discovery hash, changed schema handling, approval reuse interaction
- background sessions: topology edges, trust score, pause/quarantine
- serve/UI contract: only if new stream events are added
Run targeted loops first, then broader checks before merge:

    pytest -q tests/kernel/test_kernel_tooling.py
    pytest -q tests/tools tests/agent tests/runtime
    ruff format --check .
    ruff check .

Definition Of Done
- High-risk tool calls have a visible contract fingerprint.
- Changed high-risk contracts can be observed, approved, or denied.
- Side-effecting calls can require a signed, scoped agent identity.
- Delegation edges are observable and lockable by direction.
- Tool outputs that re-enter context or memory pass through the scanner and carry its metadata.
- Risk accumulation can deterministically degrade or pause a session.
- Quarantine blocks descendants while preserving audit inspection.
- Docs, help text, and audit surfaces explain the active posture.