Governed Action Runtime Improvement Plan

Purpose

SecondBrain already has strong local-first building blocks: kernel tool policies, approval gates, agent identity, MCP governance, background-session health, event logs, and privacy audits. The next improvement is to turn those pieces into a single governed action runtime: the model can reason freely, but every side-effecting action must pass a deterministic runtime boundary before it executes, and every result must be checked as safe to re-enter context or memory before the model sees it again.

The goal is not a new agent loop. The goal is a shared enforcement layer that existing loops can call from the kernel, chat harness, gateway runtime, background sessions, MCP adapters, and connector-backed tools.

Design Principles

  • Keep governance deterministic. Tool routing, fingerprints, topology locks, scopes, risk thresholds, and quarantine decisions must be inspectable policy, not model judgment.
  • Extend existing surfaces first: RunContext, ToolSpec, ToolPolicy, ToolExecutor, AgentToolRegistry, MCP governance, approval stores, and background-session health.
  • Start with the highest-risk tools, then expand to read tools only after the enforcement path is proven.
  • Preserve local-first operation. No external control plane dependency is required for the baseline.
  • Fail closed for changed high-risk contracts when enforcement is enabled, and fail visibly in observe mode.
  • Keep naming SecondBrain-native in code, docs, migrations, tests, and fixtures.

Current Assets To Reuse

| Area | Existing surface |
| --- | --- |
| Kernel tool policy | brain/kernel/tooling/permissions.py |
| Kernel execution boundary | brain/kernel/tooling/executor.py |
| Tool contracts | brain/kernel/contracts.py::ToolSpec |
| Runtime toolsets | brain/agent/tools.py, brain/agent/toolsets/ |
| MCP governance | brain/mcp/, brain/adapters/mcp/, brain/agent/toolsets/mcp.py |
| Agent identity and scopes | brain/agent_identity/, docs/agent_identity_design.md |
| Gateway approvals | brain/gateway/, brain/agent_runtime/policy.py |
| Background health | brain/background_sessions/health_policy.py |
| Event evidence | brain/state/event_log.py, docs/explanation/observability.md |
| Policy explanation | docs/explanation/policy-and-approvals.md |

Gap Summary

| Gap | Impact |
| --- | --- |
| Tool registries are layered but not contract-fingerprinted consistently. | A changed tool description or schema can be hard to distinguish from a normal registration. |
| Agent identity scope enforcement is still opt-in across many entrypoints. | A side-effecting path may rely on approval mode without also proving per-agent authority. |
| Delegation and worker relationships are observable in places, but not governed as an approved topology. | Agents can form new call paths without an explicit operator-reviewed edge. |
| Tool outputs are not uniformly scanned before returning to model context or memory. | Secrets, PII, or injected instructions can re-enter the agent loop. |
| Risk signals exist across health snapshots and functional state, but there is no unified per-agent trust score. | Operators see warnings, but policy cannot consistently degrade, pause, or quarantine. |
| Revocation exists for manifests and approvals, but not as a runtime cascade across active sessions and descendants. | A compromised session may require manual cleanup across multiple stores. |

Target Architecture

Agent / runtime loop
  -> ActionGuard.before_tool_call()
     - verify agent identity and scopes
     - verify tool contract fingerprint
     - verify approved topology edge
     - check approval, risk, budget, rate, and policy
  -> tool execution
  -> ActionGuard.after_tool_response()
     - scan output before context re-entry
     - update risk score and event evidence
     - apply quarantine or degradation if needed
  -> model context / memory / operator trace

The implementation should be additive. Early phases can run in observe mode; blocking starts only for the highest-risk tools, and only after the operator has approved the baseline.

Core Building Blocks

1. Tool Contract Fingerprints

Add a stable fingerprint for each registered tool contract:

  • inputs: tool id, name, description, safety class, argument schema, return schema, required permissions, required identity scopes, and risk metadata
  • output: deterministic hash stored in registry snapshots and tool-call events
  • behavior: observe mismatches first, then deny high-risk changed contracts unless an operator approves the new contract
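
One way to get a deterministic hash over those inputs is to serialize them as canonical JSON (sorted keys, fixed separators) and hash the bytes. The sketch below assumes a hypothetical `ToolContract` dataclass standing in for the fingerprinted fields of `ToolSpec`; the field names and the `sha256:` prefix are illustrative, not the real contract.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical stand-in for the fields of ToolSpec that get fingerprinted."""

    tool_id: str
    name: str
    description: str
    safety_class: str
    argument_schema: dict = field(default_factory=dict)
    return_schema: dict = field(default_factory=dict)
    required_permissions: tuple = ()
    required_scopes: tuple = ()
    risk: dict = field(default_factory=dict)


def contract_fingerprint(contract: ToolContract) -> str:
    """Return a stable hash that ignores dict key order and scope/permission order."""
    payload = asdict(contract)
    # Permissions and scopes are treated as sets, so declaration order is normalized.
    payload["required_permissions"] = sorted(payload["required_permissions"])
    payload["required_scopes"] = sorted(payload["required_scopes"])
    # sort_keys canonicalizes nested dictionaries as well as the top level.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the description and schemas are part of the hashed payload, any edit to them yields a different fingerprint, while reordering schema keys does not.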

Implementation targets:

  • brain/kernel/contracts.py
  • brain/kernel/tooling/registry.py
  • brain/agent/tools.py
  • brain/agent/toolsets/mcp.py
  • brain/mcp/discovery.py and discovery cache code
  • brain/state/event_log.py metadata fields

Tests:

  • deterministic fingerprint ignores dictionary ordering
  • description/schema/scope changes alter the fingerprint
  • high-risk mismatch is denied in enforce mode
  • observe mode logs mismatch without blocking

2. Shared Action Guard

Introduce one small orchestration object around existing policies rather than moving all logic into a new layer.

Suggested shape:

class ActionGuard:
    def before_tool_call(self, ctx, spec, args, *, call_chain): ...
    def after_tool_response(self, ctx, spec, result, *, call_chain): ...

The guard should call existing policy and approval components first. It should only own cross-cutting checks that span tool registries:

  • contract fingerprint verification
  • agent identity scope verification
  • topology edge verification
  • output re-entry scanning
  • risk-score update
  • quarantine check
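
As a rough illustration of how the pre-call half of such a guard could compose these checks, the sketch below runs a set of named check functions and blocks only when enforcement is on and the tool is high-risk. The `SketchActionGuard` class, the dict-shaped `ctx` and `spec`, and both check functions are assumptions made for the example; they are not the real `RunContext` or `ToolSpec` types, and the post-response half is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A check returns None when satisfied, or a short reason string when it fails.
Check = Callable[[dict, dict], Optional[str]]


@dataclass
class GuardDecision:
    allowed: bool
    failures: list = field(default_factory=list)


@dataclass
class SketchActionGuard:
    checks: dict  # check name -> Check
    enforce: bool = False  # False means observe mode: log failures, never block
    events: list = field(default_factory=list)

    def before_tool_call(self, ctx: dict, spec: dict) -> GuardDecision:
        failures = []
        for name, check in self.checks.items():
            reason = check(ctx, spec)
            if reason is not None:
                failures.append({"check": name, "reason": reason})
        # Block only in enforce mode, and only for tools marked high-risk.
        blocked = bool(self.enforce and failures and spec.get("high_risk", False))
        # Every call emits structured metadata, whether or not it was blocked.
        self.events.append({"tool": spec["name"], "failures": failures, "blocked": blocked})
        return GuardDecision(allowed=not blocked, failures=failures)


def scope_check(ctx: dict, spec: dict) -> Optional[str]:
    missing = set(spec.get("required_scopes", ())) - set(ctx.get("scopes", ()))
    return f"missing scopes: {sorted(missing)}" if missing else None


def fingerprint_check(ctx: dict, spec: dict) -> Optional[str]:
    approved = ctx.get("approved_fingerprints", {}).get(spec["name"])
    return None if approved == spec.get("fingerprint") else "contract fingerprint mismatch"
```

Keeping each check a plain function leaves the existing policy and approval components untouched; the guard only aggregates their outcomes into one decision and one event.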

Implementation targets:

  • new brain/runtime/action_guard.py or brain/kernel/tooling/action_guard.py
  • brain/kernel/tooling/executor.py
  • brain/agent/tool_executor_v2.py
  • brain/agent_runtime/tools.py
  • brain/mcp/executor.py

Tests:

  • kernel executor still preserves current allow, deny, approval, and budget behavior
  • guard failures emit structured event metadata
  • post-response scanner can redact or block before model re-entry

3. Approved Delegation Topology

Model runtime relationships as directed edges:

  • agent -> tool
  • agent -> subagent
  • agent -> worker session
  • agent -> MCP server/tool
  • automation -> agent

Each edge has a lifecycle:

  1. observed: discovered and logged, no enforcement
  2. approved: accepted by operator or policy
  3. locked: only approved edges are allowed
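
The lifecycle can be sketched as a small store keyed by `(source, target)` pairs, which makes direction part of the key. This is an in-memory illustration only: `TopologyStore` and `EdgeState` are hypothetical names, and the plan itself calls for persisted runtime DB tables.

```python
from enum import Enum


class EdgeState(Enum):
    OBSERVED = "observed"
    APPROVED = "approved"


class TopologyStore:
    """In-memory sketch of directed edges plus a single lock flag."""

    def __init__(self) -> None:
        self._edges: dict = {}  # (source, target) -> EdgeState
        self.locked = False

    def approve(self, source: str, target: str) -> None:
        self._edges[(source, target)] = EdgeState.APPROVED

    def state(self, source: str, target: str):
        return self._edges.get((source, target))

    def check(self, source: str, target: str) -> bool:
        """Record the edge if it is new and return whether the call may proceed."""
        key = (source, target)
        if self._edges.get(key) is EdgeState.APPROVED:
            return True
        # Unapproved edges are always recorded so audits can show them,
        # but they are only allowed while the topology is unlocked.
        self._edges.setdefault(key, EdgeState.OBSERVED)
        return not self.locked
```

Because the key is an ordered pair, approving `planner -> researcher` says nothing about `researcher -> planner`.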

Implementation targets:

  • new runtime DB tables for topology edges and lock state
  • CLI: sb runtime-guard topology|approve-edge|lock|unlock
  • UI later: operator view beside background sessions and approvals
  • first integration: spawn_subagent records agent -> subagent edges and locked mode denies unapproved child-agent targets
  • later integrations: worker_start, local A2A dispatch, MCP tool calls, and background-session spawning

Tests:

  • direction matters: A -> B does not imply B -> A
  • locked mode denies new unapproved edges
  • observe mode records but allows
  • approved edge survives restart

4. Per-Call Capability Claims

Use existing IdentityClaim and scope narrowing as the foundation. Add a short-lived per-call authorization proof for side-effecting actions:

  • bound to run_id, trace_id, span_id, tool id, contract fingerprint, scopes, approval id if any, and expiry
  • nonces or call ids prevent replay inside a run
  • child sessions receive narrowed claims, never expanded claims
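
A per-call proof along these lines could be an HMAC over the bound fields, checked for signature, expiry, binding, and replay. The sketch below uses only the standard library; the function names, the field subset (trace, span, and approval ids are left out for brevity), and the shared-key model are assumptions, and the real design would build on the existing `IdentityClaim` primitives.

```python
import hashlib
import hmac
import json
import time


def _mac(key: bytes, body: dict) -> str:
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def mint_call_proof(key, *, run_id, tool_id, fingerprint, scopes, call_id, ttl_s=30, now=None):
    """Bind one tool call to a run, a contract fingerprint, scopes, and an expiry."""
    now = time.time() if now is None else now
    body = {
        "run_id": run_id,
        "tool_id": tool_id,
        "fingerprint": fingerprint,
        "scopes": sorted(scopes),
        "call_id": call_id,
        "expires_at": now + ttl_s,
    }
    return {**body, "mac": _mac(key, body)}


def verify_call_proof(key, proof, *, tool_id, fingerprint, seen_call_ids, now=None) -> str:
    """Return "ok" or a short failure reason; records call_id on success."""
    now = time.time() if now is None else now
    body = {k: v for k, v in proof.items() if k != "mac"}
    if not hmac.compare_digest(_mac(key, body), proof.get("mac", "")):
        return "bad_signature"
    if now >= body["expires_at"]:
        return "expired"
    if body["tool_id"] != tool_id:
        return "tool_mismatch"
    if body["fingerprint"] != fingerprint:
        return "fingerprint_mismatch"
    if body["call_id"] in seen_call_ids:
        return "replayed"
    seen_call_ids.add(body["call_id"])
    return "ok"


def narrow_scopes(parent_scopes, requested) -> set:
    """A child can keep only scopes its parent already holds, never gain new ones."""
    return set(parent_scopes) & set(requested)
```

The signature is verified before any other field is trusted, so a tampered scope list fails as `bad_signature` rather than reaching the binding checks.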

Implementation targets:

  • brain/agent_identity/claim.py
  • brain/agent_identity/runtime.py
  • brain/kernel/run_context.py
  • brain/kernel/tooling/permissions.py
  • gateway and background-session start paths

Tests:

  • expired call proof fails
  • tool id mismatch fails
  • contract fingerprint mismatch fails
  • approval does not add missing identity scopes

5. Context Re-Entry Scanner

Every tool result that may be shown to a model or written to memory should pass through a response inspection step.

Minimum scanner behavior:

  • detect likely secrets, tokens, private keys, and credential-shaped text
  • detect PII classes already represented in privacy audit or memory review
  • detect tool-result prompt injection patterns
  • emit redaction metadata and evidence refs
  • block only high-confidence sensitive data in early enforcement mode
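
A minimal version of this behavior can be sketched with a handful of regular expressions tagged by confidence. The patterns below are deliberately tiny and illustrative; `scan_tool_output`, `ScanResult`, and the labels are hypothetical names, and a real scanner would draw its classes from the existing privacy audit and content-safety code.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only: (label, confidence, compiled regex).
PATTERNS = [
    ("private_key", "high", re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
    ("aws_access_key_id", "high", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("email_address", "low", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")),
    ("injection_phrase", "low", re.compile(r"ignore (all )?previous instructions", re.I)),
]


@dataclass
class ScanResult:
    text: str
    findings: list = field(default_factory=list)
    blocked: bool = False


def scan_tool_output(text: str, *, block_high_confidence: bool = False) -> ScanResult:
    findings = []
    redacted = text
    for label, confidence, pattern in PATTERNS:
        if confidence == "high":
            # High-confidence matches are replaced before the text can re-enter context.
            redacted, count = pattern.subn(f"[REDACTED:{label}]", redacted)
        else:
            # Low-confidence matches are only counted; the text is left unchanged.
            count = sum(1 for _ in pattern.finditer(redacted))
        if count:
            findings.append({"label": label, "confidence": confidence, "count": count})
    has_high = any(f["confidence"] == "high" for f in findings)
    if block_high_confidence and has_high:
        # Blocking drops the payload entirely but keeps the findings as evidence.
        return ScanResult(text="", findings=findings, blocked=True)
    return ScanResult(text=redacted, findings=findings)
```

Low-confidence findings travel as metadata alongside the unmodified output, which matches the intent that early enforcement should warn without losing normal results.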

Implementation targets:

  • brain/security/content_safety.py
  • brain/agent/tool_outcomes.py
  • brain/kernel/tooling/executor.py
  • memory proposal and review queue paths
  • MCP resource and prompt fetch paths

Tests:

  • secret-shaped output is redacted before model context
  • blocked output records an inspectable event
  • memory ingestion receives scanner metadata
  • low-confidence findings warn without losing normal output

6. Agent Trust Score

Unify existing risk signals into a per-agent/session trust score with decay.

Inputs:

  • denied tool calls
  • repeated approval requests
  • contract mismatches
  • topology violations
  • scanner findings
  • repeated failed tools
  • budget pressure and health-policy breaches
  • functional-state risk signals

Actions:

  • warn
  • require approval for next privileged action
  • degrade to read-only
  • pause background session
  • quarantine session and descendants
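
One simple way to combine weighted events with decay is an exponentially decaying accumulator with a configurable half-life, mapped to actions through ordered thresholds. The sketch below tracks accumulated risk (higher means less trust); the weights, thresholds, half-life, and names are placeholder assumptions, not tuned values.

```python
from dataclasses import dataclass

# Placeholder weights and thresholds; real values would be configurable policy.
WEIGHTS = {
    "denied_tool_call": 1.0,
    "scanner_finding": 2.0,
    "contract_mismatch": 3.0,
    "topology_violation": 3.0,
}
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [
    (8.0, "quarantine"),
    (6.0, "pause"),
    (4.0, "read_only"),
    (2.0, "require_approval"),
    (1.0, "warn"),
]


@dataclass
class RiskScore:
    half_life_s: float = 600.0
    value: float = 0.0
    updated_at: float = 0.0

    def _decay_to(self, now: float) -> None:
        # Halve the score for every half_life_s seconds since the last update.
        elapsed = max(0.0, now - self.updated_at)
        self.value *= 0.5 ** (elapsed / self.half_life_s)
        self.updated_at = now

    def record(self, event: str, now: float) -> str:
        """Decay, add the event weight (unknown events add nothing), return the action."""
        self._decay_to(now)
        self.value += WEIGHTS.get(event, 0.0)
        return self.action(now)

    def action(self, now: float) -> str:
        self._decay_to(now)
        for threshold, name in THRESHOLDS:
            if self.value >= threshold:
                return name
        return "none"
```

Passing `now` explicitly keeps the score deterministic under test: the same event sequence and timestamps always cross the same thresholds.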

Implementation targets:

  • brain/background_sessions/health_policy.py
  • brain/antahkarana/functional_state/
  • brain/state/event_log.py
  • new risk aggregation helper under brain/runtime/ or brain/background_sessions/

Tests:

  • score decays over time
  • repeated high-risk events cross thresholds deterministically
  • read-only degradation preserves diagnostic tools
  • thresholds are configurable and visible in audit output

7. Quarantine And Revocation Cascade

Add an operator action and policy action that can stop a compromised runtime branch:

  • revoke active session authority
  • pause or cancel descendant worker/background sessions
  • reject future tool calls with the revoked claim or session id
  • keep read-only inspection available for audit
  • record a single cascade event with affected descendants
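
The cascade can be pictured as a breadth-first walk over the session tree that marks every descendant and emits one event. The `SessionTree` class below is a hypothetical in-memory stand-in for the background-session and identity stores; persistence and actual session cancellation are out of scope for the sketch.

```python
from collections import deque


class SessionTree:
    """In-memory sketch: parent/child sessions plus a quarantine set."""

    def __init__(self) -> None:
        self.children: dict = {}  # session id -> list of child session ids
        self.quarantined: set = set()
        self.events: list = []

    def add(self, session_id: str, parent: str = None) -> None:
        self.children.setdefault(session_id, [])
        if parent is not None:
            self.children.setdefault(parent, []).append(session_id)

    def quarantine(self, root: str) -> list:
        """Quarantine root and all descendants; return only newly affected ids."""
        affected = []
        seen = {root}
        queue = deque([root])
        while queue:
            sid = queue.popleft()
            if sid not in self.quarantined:
                self.quarantined.add(sid)
                affected.append(sid)
            for child in self.children.get(sid, []):
                if child not in seen:  # guards against accidental cycles
                    seen.add(child)
                    queue.append(child)
        if affected:
            # One cascade event lists every session this call newly quarantined.
            self.events.append({"type": "quarantine_cascade", "root": root, "affected": affected})
        return affected

    def may_call(self, session_id: str, *, read_only: bool) -> bool:
        # Read-only inspection stays available for audit even under quarantine.
        return read_only or session_id not in self.quarantined
```

Repeating the call on an already quarantined branch changes nothing and emits no second event, which is the idempotency the tests below ask for.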

Implementation targets:

  • brain/agent_identity/
  • brain/background_sessions/store.py
  • brain/background_sessions/runtime.py
  • brain/agent_runtime/policy.py
  • brain/gateway/
  • CLI: sb agent-identity quarantine or sb sessions quarantine

Tests:

  • quarantined parent prevents child tool calls
  • read-only trace inspection still works
  • cascade is idempotent
  • restart preserves quarantine state

First Vertical Slice

Ship the smallest end-to-end path around a high-risk local target:

  1. Add deterministic tool contract fingerprints to kernel ToolSpec.
  2. Log the fingerprint on every kernel tool call.
  3. Add observe/enforce modes to the kernel executor for fingerprint mismatches.
  4. Enable enforcement for one destructive or external-write tool class only.
  5. Add a focused CLI/audit view showing current fingerprints and mismatches.
  6. Add tests proving existing approval behavior is unchanged.

Candidate high-risk tools:

  • external message send tools
  • planner task create/update/delete
  • mcp_call_tool
  • openapi_invoke
  • http_profile_request
  • file write/edit/delete surfaces
  • background run_command
  • travel booking or cancellation actions

Phased Delivery Plan

Phase 0 — Inventory And Baseline

Deliverables:

  • inventory all tool registries and high-risk tools
  • classify tool risk tiers consistently across kernel, runtime toolsets, MCP, connectors, and background sessions
  • add a docs table mapping risk tiers to approval, identity, and scanner expectations

Verification:

  • pytest -q tests/kernel/test_kernel_tooling.py
  • targeted MCP/toolset catalog tests
  • sb privacy audit --json

Phase 1 — Contract Fingerprints

Deliverables:

  • ToolSpec fingerprint helper
  • fingerprint metadata in registry listings and tool-call events
  • observe-mode mismatch events
  • enforce-mode block for selected high-risk tools
  • persisted runtime-database baselines for operator-approved tool contracts
  • sb runtime-guard tools|mismatches|snapshot|approve-contract for current contract inspection and approval
  • optional execution enforcement that denies high-risk contracts when the persisted approved baseline is missing, changed, or unavailable
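
The "missing, changed, or unavailable" cases can be made concrete with a small classifier that compares current fingerprints against the persisted baseline. The function names and status strings below are assumptions for illustration; `baseline=None` stands for a baseline store that could not be read.

```python
from typing import Optional


def classify_contracts(current: dict, baseline: Optional[dict]) -> dict:
    """Map tool name -> status against an operator-approved baseline.

    Both arguments map tool name -> fingerprint string.
    """
    report = {}
    for name, fingerprint in sorted(current.items()):
        if baseline is None:
            report[name] = "baseline_unavailable"
        elif name not in baseline:
            report[name] = "baseline_missing"
        elif baseline[name] != fingerprint:
            report[name] = "changed"
        else:
            report[name] = "approved"
    return report


def should_deny(status: str, *, high_risk: bool, enforce: bool) -> bool:
    # Fail closed: in enforce mode a high-risk tool runs only with an approved match.
    return enforce and high_risk and status != "approved"
```

The same report could back a mismatch listing in observe mode, since the classification does not depend on whether enforcement is enabled.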

Verification:

  • new kernel tests for fingerprint determinism and mismatch behavior
  • existing kernel tooling tests unchanged
  • runtime baseline store tests and runtime-guard CLI integration tests

Phase 2 — Side-Effect Action Guard

Deliverables:

  • shared ActionGuard wrapper around existing policy decisions
  • kernel executor integration
  • one agent toolset integration
  • one MCP tool-call integration

Verification:

  • approval, denial, and budget tests continue to pass
  • new tests cover guard metadata and failure surfaces

Phase 3 — Identity Required For High-Risk Tools

Deliverables:

  • default identity minting for selected entrypoints that call external-write or destructive tools
  • identity_required=True on high-risk execution paths
  • per-call proof design using existing claim primitives

Verification:

  • missing scopes deny high-risk calls
  • approval does not override missing identity scope
  • child claims are narrowed across worker/subagent boundaries

Phase 4 — Observed, Approved, Locked Topology

Deliverables:

  • topology edge store
  • observe/approve/lock CLI
  • edge recording for worker starts, local A2A, MCP calls, and selected toolsets
  • locked-mode enforcement

Verification:

  • directed edge tests
  • restart persistence tests
  • audit output shows unapproved observed edges

Phase 5 — Context Re-Entry Scanner

Deliverables:

  • scanner integrated after tool responses and before memory proposals
  • redaction/block metadata in events
  • scanner findings reflected in health snapshots

Verification:

  • secret and injection fixtures are redacted or blocked
  • memory review receives scanner evidence
  • normal low-risk outputs are unchanged

Phase 6 — Trust Score And Quarantine

Deliverables:

  • per-agent/session score with decay
  • thresholds connected to warn, approval-required, read-only degrade, pause, and quarantine actions
  • cascade revocation for child sessions
  • operator inspection command

Verification:

  • deterministic score tests
  • quarantine idempotency tests
  • background-session health tests
  • gateway/session restart tests

Operator UX

Minimum CLI surface:

sb runtime-guard tools
sb runtime-guard mismatches
sb runtime-guard snapshot
sb runtime-guard approve-contract <surface> <tool-name>
sb runtime-guard topology
sb runtime-guard approve-edge <edge-id>
sb runtime-guard lock <scope>
sb runtime-guard unlock <scope>
sb runtime-guard risk <session-id>
sb sessions quarantine <session-id>

The UI can come after the data model stabilizes. It should reuse the existing Agent Cockpit, approvals, sessions, and health surfaces rather than adding a separate control plane.

Testing Strategy

Add tests at the lowest layer that owns each behavior:

  • kernel: fingerprints, identity scope checks, approval preservation
  • agent toolsets: catalog exposure and side-effect classification
  • MCP: discovery hash, changed schema handling, approval reuse interaction
  • background sessions: topology edges, trust score, pause/quarantine
  • serve/UI contract: only if new stream events are added

Run targeted loops first, then broader checks before merge:

pytest -q tests/kernel/test_kernel_tooling.py
pytest -q tests/tools tests/agent tests/runtime
ruff format --check .
ruff check .

Definition Of Done

  • High-risk tool calls have a visible contract fingerprint.
  • Changed high-risk contracts can be observed, approved, or denied.
  • Side-effecting calls can require a signed, scoped agent identity.
  • Delegation edges are observable and lockable by direction.
  • Tool outputs that re-enter context or memory pass through the scanner and carry its metadata.
  • Risk accumulation can deterministically degrade or pause a session.
  • Quarantine blocks descendants while preserving audit inspection.
  • Docs, help text, and audit surfaces explain the active posture.