Governed Action Runtime Improvement Plan

Purpose

SecondBrain already has strong local-first building blocks: kernel tool policies, approval gates, agent identity, MCP governance, background-session health, event logs, and privacy audits. The next improvement is to turn those pieces into a single governed action runtime: the model can reason freely, but every side-effecting action must pass a deterministic runtime boundary before it executes, and every result must be checked for safe re-entry into context or memory before the model sees it again.

The goal is not a new agent loop. The goal is a shared enforcement layer that existing loops can call from the kernel, chat harness, gateway runtime, background sessions, MCP adapters, and connector-backed tools.

Design Principles

Keep governance deterministic. Tool routing, fingerprints, topology locks, scopes, risk thresholds, and quarantine decisions must be inspectable policy, not model judgment.
Extend existing surfaces first: RunContext, ToolSpec, ToolPolicy, ToolExecutor, AgentToolRegistry, MCP governance, approval stores, and background-session health.
Start with the highest-risk tools, then expand to read-only tools only after the enforcement path is proven.
Preserve local-first operation. No external control plane dependency is required for the baseline.
Fail closed for changed high-risk contracts when enforcement is enabled, and fail visibly in observe mode.
Keep naming SecondBrain-native in code, docs, migrations, tests, and fixtures.

Current Assets To Reuse

| Area | Existing surface |
| --- | --- |
| Kernel tool policy | `brain/kernel/tooling/permissions.py` |
| Kernel execution boundary | `brain/kernel/tooling/executor.py` |
| Tool contracts | `brain/kernel/contracts.py::ToolSpec` |
| Runtime toolsets | `brain/agent/tools.py`, `brain/agent/toolsets/` |
| MCP governance | `brain/mcp/`, `brain/adapters/mcp/`, `brain/agent/toolsets/mcp.py` |
| Agent identity and scopes | `brain/agent_identity/`, `docs/agent_identity_design.md` |
| Gateway approvals | `brain/gateway/`, `brain/agent_runtime/policy.py` |
| Background health | `brain/background_sessions/health_policy.py` |
| Event evidence | `brain/state/event_log.py`, `docs/explanation/observability.md` |
| Policy explanation | `docs/explanation/policy-and-approvals.md` |

Gap Summary

| Gap | Impact |
| --- | --- |
| Tool registries are layered but not contract-fingerprinted consistently. | A changed tool description or schema can be hard to distinguish from a normal registration. |
| Agent identity scope enforcement is still opt-in across many entrypoints. | A side-effecting path may rely on approval mode without also proving per-agent authority. |
| Delegation and worker relationships are observable in places, but not governed as an approved topology. | Agents can form new call paths without an explicit operator-reviewed edge. |
| Tool outputs are not uniformly scanned before returning to model context or memory. | Secrets, PII, or injected instructions can re-enter the agent loop. |
| Risk signals exist across health snapshots and functional state, but there is no unified per-agent trust score. | Operators see warnings, but policy cannot consistently degrade, pause, or quarantine. |
| Revocation exists for manifests and approvals, but not as a runtime cascade across active sessions and descendants. | A compromised session may require manual cleanup across multiple stores. |

Target Architecture

Agent / runtime loop
  -> ActionGuard.before_tool_call()
     - verify agent identity and scopes
     - verify tool contract fingerprint
     - verify approved topology edge
     - check approval, risk, budget, rate, and policy
  -> tool execution
  -> ActionGuard.after_tool_response()
     - scan output before context re-entry
     - update risk score and event evidence
     - apply quarantine or degradation if needed
  -> model context / memory / operator trace

The implementation should be additive. Early phases can run in observe mode and only block the highest-risk tools once the operator has approved the baseline.

Core Building Blocks

1. Tool Contract Fingerprints

Add a stable fingerprint for each registered tool contract:

inputs: tool id, name, description, safety class, argument schema, return schema, required permissions, required identity scopes, and risk metadata
output: deterministic hash stored in registry snapshots and tool-call events
behavior: observe mismatches first, then deny high-risk changed contracts unless an operator approves the new contract
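
One way to get a fingerprint that is stable across dictionary ordering is to hash a canonical JSON encoding of the contract fields. The sketch below is only illustrative: `ToolContract` and `contract_fingerprint` are hypothetical names rather than the existing `ToolSpec` API, and SHA-256 over sorted-key JSON is an assumed choice, not one fixed by this plan.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical stand-in for the contract fields listed above."""
    tool_id: str
    name: str
    description: str
    safety_class: str
    argument_schema: dict = field(default_factory=dict)
    return_schema: dict = field(default_factory=dict)
    required_permissions: tuple = ()
    required_scopes: tuple = ()
    risk: dict = field(default_factory=dict)


def contract_fingerprint(contract: ToolContract) -> str:
    """Return a hex SHA-256 digest of the canonical contract encoding."""
    payload = asdict(contract)
    # Permissions and scopes behave like sets, so sort them before hashing.
    payload["required_permissions"] = sorted(contract.required_permissions)
    payload["required_scopes"] = sorted(contract.required_scopes)
    # sort_keys makes the encoding independent of dict insertion order,
    # including inside the nested schema dictionaries.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the description and schemas are part of the hashed payload, any edit to them yields a different digest, which is what lets a changed contract be told apart from a normal re-registration.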

Implementation targets:

brain/kernel/contracts.py
brain/kernel/tooling/registry.py
brain/agent/tools.py
brain/agent/toolsets/mcp.py
brain/mcp/discovery.py and discovery cache code
brain/state/event_log.py metadata fields

Tests:

deterministic fingerprint ignores dictionary ordering
description/schema/scope changes alter the fingerprint
high-risk mismatch is denied in enforce mode
observe mode logs mismatch without blocking

2. Shared Action Guard

Introduce one small orchestration object around existing policies rather than moving all logic into a new layer.

Suggested shape:

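class ActionGuard:
    # Runs before execution: identity, fingerprint, topology, approval, and policy checks.
    def before_tool_call(self, ctx, spec, args, *, call_chain): ...
    # Runs after execution: output scan, risk-score update, quarantine or degradation.
    def after_tool_response(self, ctx, spec, result, *, call_chain): ...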

The guard should call existing policy and approval components first. It should only own cross-cutting checks that span tool registries:

contract fingerprint verification
agent identity scope verification
topology edge verification
output re-entry scanning
risk-score update
quarantine check
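
The pre-call half of the guard can be sketched as a small runner over named checks. Everything here is an assumption made for illustration: `ActionGuardSketch`, `GuardDecision`, `scope_check`, and the plain-dict `ctx` and `spec` stand in for the real `RunContext` and `ToolSpec`, and the existing policy and approval components that the guard should call first are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A check returns None to pass, or a short reason string on failure.
Check = Callable[[dict, dict, dict], Optional[str]]


@dataclass
class GuardDecision:
    allowed: bool
    reasons: list = field(default_factory=list)


class ActionGuardSketch:
    def __init__(self, checks: dict, enforce: bool = False):
        self.checks = checks    # name -> Check, evaluated in insertion order
        self.enforce = enforce  # False is observe mode: record, never block
        self.events = []        # stand-in for structured event-log metadata

    def before_tool_call(self, ctx, spec, args, *, call_chain=()):
        reasons = []
        for name, check in self.checks.items():
            reason = check(ctx, spec, args)
            if reason is not None:
                reasons.append(f"{name}: {reason}")
        # Failures only block when enforcement is on; they are always recorded.
        allowed = not (reasons and self.enforce)
        self.events.append({
            "tool": spec["name"],
            "call_chain": list(call_chain),
            "reasons": reasons,
            "allowed": allowed,
        })
        return GuardDecision(allowed, reasons)


def scope_check(ctx, spec, args):
    """Example check: every scope the tool requires must be held by the caller."""
    missing = set(spec.get("required_scopes", ())) - set(ctx.get("scopes", ()))
    return f"missing scopes {sorted(missing)}" if missing else None
```

Keeping each check as a plain function makes it straightforward to add fingerprint or topology checks later without changing the runner, and the recorded event gives guard failures a structured surface to test against.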

Implementation targets:

new brain/runtime/action_guard.py or brain/kernel/tooling/action_guard.py
brain/kernel/tooling/executor.py
brain/agent/tool_executor_v2.py
brain/agent_runtime/tools.py
brain/mcp/executor.py

Tests:

kernel executor still preserves current allow, deny, approval, and budget behavior
guard failures emit structured event metadata
post-response scanner can redact or block before model re-entry

3. Approved Delegation Topology

Model runtime relationships as directed edges:

agent -> tool
agent -> subagent
agent -> worker session
agent -> MCP server/tool
automation -> agent

Each edge has a lifecycle:

observed: discovered and logged, no enforcement
approved: accepted by operator or policy
locked: only approved edges are allowed
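
A minimal in-memory sketch of the edge lifecycle follows. `TopologySketch` and `EdgeState` are hypothetical names, the lock is simplified to one global flag rather than a per-scope lock, and persistence in the runtime database is omitted.

```python
from enum import Enum


class EdgeState(Enum):
    OBSERVED = "observed"
    APPROVED = "approved"


class TopologySketch:
    def __init__(self):
        self.edges = {}      # (source, target) -> EdgeState
        self.locked = False  # simplification: a single global lock

    def approve(self, source: str, target: str) -> None:
        self.edges[(source, target)] = EdgeState.APPROVED

    def check(self, source: str, target: str) -> bool:
        """Record the edge if it is new, then decide whether the call may proceed."""
        # Edges are keyed by ordered pair, so A -> B never implies B -> A.
        state = self.edges.setdefault((source, target), EdgeState.OBSERVED)
        if state is EdgeState.APPROVED:
            return True
        # Unapproved edges pass while unlocked and are denied once locked.
        return not self.locked
```

Recording the edge even when it is denied keeps unapproved attempts visible in audit output, so an operator can review and approve them later.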

Implementation targets:

new runtime DB tables for topology edges and lock state
CLI: sb runtime-guard topology|approve-edge|lock|unlock
UI later: operator view beside background sessions and approvals
first integration: spawn_subagent records agent -> subagent edges and locked mode denies unapproved child-agent targets
later integrations: worker_start, local A2A dispatch, MCP tool calls, and background-session spawning

Tests:

direction matters: A -> B does not imply B -> A
locked mode denies new unapproved edges
observe mode records but allows
approved edge survives restart

4. Per-Call Capability Claims

Use existing IdentityClaim and scope narrowing as the foundation. Add a short-lived per-call authorization proof for side-effecting actions:

bound to run_id, trace_id, span_id, tool id, contract fingerprint, scopes, approval id if any, and expiry
nonces or call ids prevent replay inside a run
child sessions receive narrowed claims, never expanded claims
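
The binding, expiry, replay, and narrowing rules above can be sketched as follows. `CallProof`, `ProofVerifier`, and `narrow` are hypothetical and are not the existing `IdentityClaim` API. Signing is left out, and `trace_id`, `span_id`, and the approval id are omitted for brevity.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class CallProof:
    run_id: str
    call_id: str       # nonce, expected to be unique within a run
    tool_id: str
    fingerprint: str
    scopes: frozenset
    expires_at: float  # seconds since the epoch


class ProofVerifier:
    def __init__(self):
        self._seen = set()  # (run_id, call_id) pairs already consumed

    def verify(self, proof, *, tool_id, fingerprint, required_scopes, now=None):
        """Return None when the proof is accepted, else a reason string."""
        now = time.time() if now is None else now
        if now >= proof.expires_at:
            return "expired"
        if proof.tool_id != tool_id:
            return "tool id mismatch"
        if proof.fingerprint != fingerprint:
            return "contract fingerprint mismatch"
        if not set(required_scopes) <= proof.scopes:
            return "missing scopes"
        key = (proof.run_id, proof.call_id)
        if key in self._seen:
            return "replayed call id"
        # The nonce is only consumed once every other check has passed.
        self._seen.add(key)
        return None


def narrow(parent_scopes, requested):
    # A child claim is the intersection, so it can never exceed the parent.
    return frozenset(parent_scopes) & frozenset(requested)
```

Passing `now` explicitly keeps expiry tests deterministic. Because scopes are checked against the proof alone, an approval recorded elsewhere cannot supply a scope the claim lacks.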

Implementation targets:

brain/agent_identity/claim.py
brain/agent_identity/runtime.py
brain/kernel/run_context.py
brain/kernel/tooling/permissions.py
gateway and background-session start paths

Tests:

expired call proof fails
tool id mismatch fails
contract fingerprint mismatch fails
approval does not add missing identity scopes

5. Context Re-Entry Scanner

Every tool result that may be shown to a model or written to memory should pass through a response inspection step.

Minimum scanner behavior:

detect likely secrets, tokens, private keys, and credential-shaped text
detect PII classes already represented in privacy audit or memory review
detect tool-result prompt injection patterns
emit redaction metadata and evidence refs
block only high-confidence sensitive data in early enforcement mode
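
As a rough illustration of the minimum behavior, a scanner can pair each pattern with a confidence level, redact high-confidence matches, and only record low-confidence ones. The three patterns, the `ScanResult` shape, and the `scan_tool_output` name are assumptions for this sketch, and a real rule set would need to be far broader.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only: kind -> (compiled regex, confidence).
PATTERNS = {
    "private_key": (
        re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----", re.S),
        "high",
    ),
    "aws_access_key_id": (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "high"),
    "injection_phrase": (re.compile(r"ignore (all )?previous instructions", re.I), "low"),
}


@dataclass
class ScanResult:
    text: str
    findings: list = field(default_factory=list)
    blocked: bool = False


def scan_tool_output(text: str, *, enforce: bool = False) -> ScanResult:
    findings = []
    redacted = text
    for kind, (pattern, confidence) in PATTERNS.items():
        if not pattern.search(redacted):
            continue
        findings.append({"kind": kind, "confidence": confidence})
        if confidence == "high":
            # High-confidence matches are replaced with a typed placeholder.
            redacted = pattern.sub(f"[REDACTED:{kind}]", redacted)
    # In enforce mode, any high-confidence finding withholds the whole output.
    blocked = enforce and any(f["confidence"] == "high" for f in findings)
    return ScanResult("" if blocked else redacted, findings, blocked)
```

Low-confidence findings leave the text untouched and surface only as metadata, which matches the goal of warning without losing normal output.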

Implementation targets:

brain/security/content_safety.py
brain/agent/tool_outcomes.py
brain/kernel/tooling/executor.py
memory proposal and review queue paths
MCP resource and prompt fetch paths

Tests:

secret-shaped output is redacted before model context
blocked output records an inspectable event
memory ingestion receives scanner metadata
low-confidence findings warn without losing normal output

6. Agent Trust Score

Unify existing risk signals into a per-agent/session trust score with decay.

Inputs:

denied tool calls
repeated approval requests
contract mismatches
topology violations
scanner findings
repeated failed tools
budget pressure and health-policy breaches
functional-state risk signals

Actions:

warn
require approval for next privileged action
degrade to read-only
pause background session
quarantine session and descendants
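
One possible shape for the score is an accumulated risk value with exponential decay, where a higher value means less trust. The weights, thresholds, half-life, and the `RiskScore` name below are placeholders chosen for illustration, and only a few of the listed inputs are shown.

```python
# Hypothetical weights and thresholds; the plan expects these to be configurable.
WEIGHTS = {
    "denied_tool_call": 1.0,
    "scanner_finding": 2.0,
    "contract_mismatch": 3.0,
    "topology_violation": 4.0,
}
# Checked from most to least severe; the first threshold reached wins.
THRESHOLDS = [
    (12.0, "quarantine"),
    (9.0, "pause"),
    (6.0, "read_only"),
    (3.0, "require_approval"),
    (1.0, "warn"),
]


class RiskScore:
    def __init__(self, half_life_s: float = 3600.0):
        self.half_life_s = half_life_s
        self.value = 0.0
        self.updated_at = 0.0

    def _decay_to(self, now: float) -> None:
        # The value halves every half_life_s seconds without new events.
        elapsed = max(0.0, now - self.updated_at)
        self.value *= 0.5 ** (elapsed / self.half_life_s)
        self.updated_at = now

    def record(self, kind: str, now: float) -> str:
        """Add one weighted event and return the resulting action."""
        self._decay_to(now)
        self.value += WEIGHTS[kind]
        return self.action(now)

    def action(self, now: float) -> str:
        self._decay_to(now)
        for threshold, name in THRESHOLDS:
            if self.value >= threshold:
                return name
        return "none"
```

Taking `now` as an argument instead of reading the clock keeps threshold crossings deterministic, which is what the score tests need.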

Implementation targets:

brain/background_sessions/health_policy.py
brain/antahkarana/functional_state/
brain/state/event_log.py
new risk aggregation helper under brain/runtime/ or brain/background_sessions/

Tests:

score decays over time
repeated high-risk events cross thresholds deterministically
read-only degradation preserves diagnostic tools
thresholds are configurable and visible in audit output

7. Quarantine And Revocation Cascade

Add an operator action and policy action that can stop a compromised runtime branch:

revoke active session authority
pause or cancel descendant worker/background sessions
reject future tool calls with the revoked claim or session id
keep read-only inspection available for audit
record a single cascade event with affected descendants
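
The cascade can be sketched as a breadth-first walk over the session tree. `SessionTreeSketch` is a hypothetical in-memory stand-in for the background-session store, and pausing or cancelling the live sessions is left out.

```python
from collections import deque


class SessionTreeSketch:
    def __init__(self):
        self.children = {}       # parent session id -> list of child session ids
        self.quarantined = set()
        self.events = []         # stand-in for the event log

    def add_child(self, parent: str, child: str) -> None:
        self.children.setdefault(parent, []).append(child)

    def quarantine(self, root: str) -> list:
        """Quarantine root and all descendants; return the newly affected ids."""
        affected = []
        visited = set()
        queue = deque([root])
        while queue:
            sid = queue.popleft()
            if sid in visited:
                continue  # guards against accidental cycles in the stored tree
            visited.add(sid)
            if sid not in self.quarantined:
                self.quarantined.add(sid)
                affected.append(sid)
            queue.extend(self.children.get(sid, []))
        if affected:
            # One event per cascade, listing every session it touched.
            self.events.append(
                {"type": "quarantine_cascade", "root": root, "affected": affected}
            )
        return affected

    def allow(self, sid: str, *, inspection: bool = False) -> bool:
        # Read-only inspection stays available; tool calls are rejected.
        return inspection or sid not in self.quarantined
```

Repeating the call on an already quarantined branch changes nothing and emits no second event, which is the idempotency property the tests below ask for.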

Implementation targets:

brain/agent_identity/
brain/background_sessions/store.py
brain/background_sessions/runtime.py
brain/agent_runtime/policy.py
brain/gateway/
CLI: sb agent-identity quarantine or sb sessions quarantine

Tests:

quarantined parent prevents child tool calls
read-only trace inspection still works
cascade is idempotent
restart preserves quarantine state

First Vertical Slice

Ship the smallest end-to-end path around a high-risk local target:

Add deterministic tool contract fingerprints to kernel ToolSpec.
Log the fingerprint on every kernel tool call.
Add observe/enforce modes to the kernel executor for fingerprint mismatches.
Enable enforcement for one destructive or external-write tool class only.
Add a focused CLI/audit view showing current fingerprints and mismatches.
Add tests proving existing approval behavior is unchanged.
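
The observe and enforce behavior for this slice comes down to one small decision function. The sketch below assumes a stored approved fingerprint per tool; `Mode`, `fingerprint_decision`, and the event names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    OBSERVE = "observe"
    ENFORCE = "enforce"


def fingerprint_decision(
    mode: Mode, high_risk: bool, approved: Optional[str], current: str
) -> dict:
    """Decide whether a call may run, given the approved and current fingerprints."""
    if approved is not None and approved == current:
        return {"allow": True, "event": None}
    # A missing baseline is reported separately from a changed contract.
    event = "contract_baseline_missing" if approved is None else "contract_mismatch"
    # Only high-risk tools are denied, and only when enforcement is on.
    deny = mode is Mode.ENFORCE and high_risk
    return {"allow": not deny, "event": event}
```

Observe mode and low-risk tools still produce the event, so a mismatch stays visible even when nothing is blocked.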

Candidate high-risk tools:

external message send tools
planner task create/update/delete
mcp_call_tool
openapi_invoke
http_profile_request
file write/edit/delete surfaces
background run_command
travel booking or cancellation actions

Phased Delivery Plan

Phase 0 — Inventory And Baseline

Deliverables:

inventory all tool registries and high-risk tools
classify tool risk tiers consistently across kernel, runtime toolsets, MCP, connectors, and background sessions
add a docs table mapping risk tiers to approval, identity, and scanner expectations

Verification:

pytest -q tests/kernel/test_kernel_tooling.py
targeted MCP/toolset catalog tests
sb privacy audit --json

Phase 1 — Contract Fingerprints

Deliverables:

ToolSpec fingerprint helper
fingerprint metadata in registry listings and tool-call events
observe-mode mismatch events
enforce-mode block for selected high-risk tools
persisted runtime-database baselines for operator-approved tool contracts
sb runtime-guard tools|mismatches|snapshot|approve-contract for current contract inspection and approval
optional execution enforcement that denies high-risk contracts when the persisted approved baseline is missing, changed, or unavailable

Verification:

new kernel tests for fingerprint determinism and mismatch behavior
existing kernel tooling tests unchanged
runtime baseline store tests and runtime-guard CLI integration tests

Phase 2 — Side-Effect Action Guard

Deliverables:

shared ActionGuard wrapper around existing policy decisions
kernel executor integration
one agent toolset integration
one MCP tool-call integration

Verification:

approval, denial, and budget tests continue to pass
new tests cover guard metadata and failure surfaces

Phase 3 — Identity Required For High-Risk Tools

Deliverables:

default identity minting for selected entrypoints that call external-write or destructive tools
identity_required=True on high-risk execution paths
per-call proof design using existing claim primitives

Verification:

missing scopes deny high-risk calls
approval does not override missing identity scope
child claims are narrowed across worker/subagent boundaries

Phase 4 — Observed, Approved, Locked Topology

Deliverables:

topology edge store
observe/approve/lock CLI
edge recording for worker starts, local A2A, MCP calls, and selected toolsets
locked-mode enforcement

Verification:

directed edge tests
restart persistence tests
audit output shows unapproved observed edges

Phase 5 — Context Re-Entry Scanner

Deliverables:

scanner integrated after tool responses and before memory proposals
redaction/block metadata in events
scanner findings reflected in health snapshots

Verification:

secret and injection fixtures are redacted or blocked
memory review receives scanner evidence
normal low-risk outputs are unchanged

Phase 6 — Trust Score And Quarantine

Deliverables:

per-agent/session score with decay
thresholds connected to warn, approval-required, read-only degrade, pause, and quarantine actions
cascade revocation for child sessions
operator inspection command

Verification:

deterministic score tests
quarantine idempotency tests
background-session health tests
gateway/session restart tests

Operator UX

Minimum CLI surface:

sb runtime-guard tools
sb runtime-guard mismatches
sb runtime-guard snapshot
sb runtime-guard approve-contract <surface> <tool-name>
sb runtime-guard topology
sb runtime-guard approve-edge <edge-id>
sb runtime-guard lock <scope>
sb runtime-guard unlock <scope>
sb runtime-guard risk <session-id>
sb sessions quarantine <session-id>

The UI can come after the data model stabilizes. It should reuse the existing Agent Cockpit, approvals, sessions, and health surfaces rather than adding a separate control plane.

Testing Strategy

Add tests at the lowest layer that owns each behavior:

kernel: fingerprints, identity scope checks, approval preservation
agent toolsets: catalog exposure and side-effect classification
MCP: discovery hash, changed schema handling, approval reuse interaction
background sessions: topology edges, trust score, pause/quarantine
serve/UI contract: only if new stream events are added

Run targeted loops first, then broader checks before merge:

pytest -q tests/kernel/test_kernel_tooling.py
pytest -q tests/tools tests/agent tests/runtime
ruff format --check .
ruff check .

Definition Of Done

High-risk tool calls have a visible contract fingerprint.
Changed high-risk contracts can be observed, approved, or denied.
Side-effecting calls can require a signed, scoped agent identity.
Delegation edges are observable and lockable by direction.
Tool outputs that re-enter context or memory pass through the scanner and carry scanner metadata.
Risk accumulation can deterministically degrade or pause a session.
Quarantine blocks descendants while preserving audit inspection.
Docs, help text, and audit surfaces explain the active posture.
