Agent Harness Claude Code Learnings
Current State
SecondBrain already had several pieces of a governed local agent runtime:
- `brain/agent/`, `brain/agent_runtime/`, and `brain/tasks/` provide planning, tool execution, policies, traces, and task graphs.
- `brain/context_packs/` and `brain/retrieval/` provide context compilation and promoted-knowledge-first retrieval.
- `brain/policies/`, `brain/orchestrator/`, and approval stores provide permission decisions and approval gates.
- `brain/background_sessions/` provides durable session state, checkpoints, bridge sessions, and resumable runs.
- `brain/skills/` and `contracts/registry/agent_profiles.yaml` provide local skill discovery and agent profiles.
Gaps Closed In This Rollout
- Added canonical agent-run contracts under `brain/agent_harness/` for task runs, context budgets, permission decisions, checkpoints, verification plans/results, run traces, and skill manifest exports.
- Exported matching runtime schemas under `contracts/runtime/`.
- Added deterministic verification plans with command, grep, file existence, JSON schema, and richer artifact validation checks for rendered UI/HTML, screenshots/images, JSON artifacts, documents, decks, and generic artifacts.
- Added durable file checkpoint create/list/diff/rewind support backed by JSON manifests and copied blobs, independent of Git.
- Added default deterministic kernel hooks for secret scanning, destructive-command blocking, and provenance checks on memory writes.
- Added Claude-Code-style permission aliases: `plan_only`, `auto_read`, `auto_edit_local`, `auto_safe`, and `dangerous_requires_approval`.
- Added top-level operator commands: `sb verify`, `sb checkpoint`, `sb rewind`, `sb checkpoints list`, `sb checkpoints diff`, and `sb act`.
- Extended `sb diff <checkpoint_id>` to diff current files against a durable checkpoint while preserving the existing Git diff behavior when no checkpoint is supplied.
- Added `sb task start "..." --worktree` / `sb tasks start` to create a task graph and optionally prepare an isolated workspace.
- Added `sb skills run <skill> "..."` to produce a skill-scoped agent prompt package.
- Added `sb context explain <run_id>` and expanded `sb context doctor` with estimated token usage, top token sources, duplicate context, low-value context, stale memories, and pruning recommendations.
- Added `sb context doctor --write-rank-hints` so pruning diagnostics can feed retrieval rank hints instead of only reporting budget pressure.
- Added `sb context doctor --cleanup-rank-hints` as a dry-run-first cleanup path for resolved managed retrieval hints; an applied cleanup archives the removed hints and does not delete source content.
- Added bundled skills for repo maintenance, context refining, Codex prompting, and artifact generation.
- Added specialist profiles for `repo-scanner`, `security-reviewer`, and `memory-curator`.
- Added automatic durable pre-mutation checkpoints for chat code writes, gateway `write_file`, and vault markdown writes.
- Added file-backed `AgentRunTrace` persistence for `sb verify`, `sb checkpoint`, `sb rewind`, `sb plan`, `sb task start`, `sb tasks run`, `sb act`, and `sb tasks resume` operator actions.
- Added stable `AgentRunTrace` receipts for background-session lifecycle checkpoints, including queued, started, approval, retry, step, completed, failed, paused, cancelled, and operator-expired states.
- Added stable `AgentRunTrace` receipts for interactive serve-chat turns, including non-stream, streaming, slash-command, approval-resume, failure, and cancellation paths.
- Added terminal REPL `AgentRunTrace` receipts through `ChatTransport`, preserving turn journal token, fallback, tool, and terminal status details.
- Added opt-in direct `AgentHarness.run_turn` `AgentRunTrace` receipts for non-transport surfaces such as `sb chat --print`, chat subagents, local agent-builder runs, spawn-tool subagents, and simulations.
- Added opt-in `AgentRunTrace` receipts for prompt autotune/eval helper runs when a trace state directory is supplied.
- Added `sb traces list` and `sb traces show <trace_id>` so durable agent-run receipts can be inspected directly from the CLI.
- Added normalized `subagent.result.v1` envelopes with compressed findings, deduped sources, confidence, verification status, and raw-output truncation markers while preserving legacy subagent output.
- Added `sb tasks workspace status [task_graph_id]` to audit workspace health, git cleanliness, changed files, blockers, and merge readiness across task worktrees.
- Added `sb tasks workspace promote <task_graph_id>` as a dry-run-first, `--apply --yes`-guarded fast-forward promotion path for clean candidate task worktrees, with candidate commit metadata and optional reviewer gating.
- Added narrow non-file state checkpoints for explicit SQLite rows and keyed SQLite tables, including diff, rewind, and trace receipts.
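To make the durable file checkpoint item above concrete (JSON manifests plus copied blobs, independent of Git), here is a minimal sketch of create, diff, and rewind. It is an illustration only: the function names, the manifest layout, and the store directory structure are assumptions and are not taken from SecondBrain's actual `brain/agent_harness/` code.

```python
import hashlib
import json
import shutil
import uuid
from pathlib import Path


def _sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def create_checkpoint(root: Path, files: list[str], store: Path) -> str:
    """Copy each tracked file into a blob directory and write a JSON manifest."""
    checkpoint_id = uuid.uuid4().hex[:12]  # hypothetical id format
    blob_dir = store / checkpoint_id / "blobs"
    blob_dir.mkdir(parents=True)
    entries: dict[str, str | None] = {}
    for rel in files:
        src = root / rel
        if src.exists():
            digest = _sha256(src)
            shutil.copyfile(src, blob_dir / digest)
            entries[rel] = digest
        else:
            entries[rel] = None  # file was absent when the checkpoint was taken
    manifest = {"checkpoint_id": checkpoint_id, "files": entries}
    (store / checkpoint_id / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return checkpoint_id


def _load_entries(store: Path, checkpoint_id: str) -> dict:
    return json.loads((store / checkpoint_id / "manifest.json").read_text())["files"]


def diff_checkpoint(root: Path, store: Path, checkpoint_id: str) -> dict[str, str]:
    """Describe how the current files differ from the checkpointed state."""
    changes: dict[str, str] = {}
    for rel, digest in _load_entries(store, checkpoint_id).items():
        current = root / rel
        if digest is None:
            if current.exists():
                changes[rel] = "added"
        elif not current.exists():
            changes[rel] = "deleted"
        elif _sha256(current) != digest:
            changes[rel] = "modified"
    return changes


def rewind_checkpoint(root: Path, store: Path, checkpoint_id: str) -> None:
    """Restore tracked files from blobs; remove files created after the checkpoint."""
    for rel, digest in _load_entries(store, checkpoint_id).items():
        target = root / rel
        if digest is None:
            target.unlink(missing_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(store / checkpoint_id / "blobs" / digest, target)
```

Recording absent files as `None` in the manifest is one way to let a rewind undo file creation as well as modification; whether the real implementation handles that case the same way is not stated here.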
Minimal Implementation Shape
The implementation is additive. Existing sb chat, task planning, background sessions, and policy flows remain compatible. New contracts and commands sit beside the existing runtime so the system can migrate gradually toward the standard lifecycle:
Deterministic task graphs now synthesize artifact validation checks when completed step outputs or declared output metadata are available. sb verify still treats truly empty plans as unverified instead of claiming success.
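The rule that an empty plan is reported as unverified, not as a pass, can be sketched as follows. This is a hedged illustration covering only two of the check kinds named earlier (file existence and grep); the `Check` dataclass, the `verify` function, and the status strings are hypothetical names, not the actual `sb verify` contract.

```python
import re
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Check:
    kind: str          # "file_exists" or "grep" (assumed subset of check kinds)
    path: str          # path relative to the workspace root
    pattern: str = ""  # regular expression, used only by "grep" checks


def run_check(root: Path, check: Check) -> bool:
    """Evaluate one deterministic check against files under root."""
    target = root / check.path
    if check.kind == "file_exists":
        return target.is_file()
    if check.kind == "grep":
        return target.is_file() and re.search(check.pattern, target.read_text()) is not None
    raise ValueError(f"unknown check kind: {check.kind}")


def verify(root: Path, plan: list[Check]) -> str:
    """Return "unverified" for an empty plan, otherwise "passed" or "failed"."""
    if not plan:
        # Nothing was checked, so success must not be claimed.
        return "unverified"
    return "passed" if all(run_check(root, c) for c in plan) else "failed"
```

Keeping "unverified" as a distinct status avoids the trap where `all([])` evaluates to `True` and an empty plan silently passes.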
Rollout Plan
- Keep the current task graph and background-session stores as the source of operational truth.
- Use the new agent-run contracts as portable read/write envelopes at runtime boundaries.
- Attach richer verification plans to additional domain-specific task templates as those surfaces expose artifact metadata.
- Continue expanding checkpoint hooks from known file-mutation tools into broader task execution loops where state restore semantics are explicit.
- Keep promoting subagent outputs through compressed findings only, with explicit source paths and verification status.
- Use context-budget rank hints as the first retrieval-feedback path, then broaden into memory-curation workflows.
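As an illustration of promoting subagent output through compressed findings with explicit sources and verification status, the sketch below builds a `subagent.result.v1`-style envelope. Only the schema name comes from this document; the field names, the `build_envelope` helper, and the truncation limit are assumptions made for the example.

```python
def build_envelope(
    findings: list[str],
    sources: list[str],
    verification_status: str,
    confidence: float,
    raw_output: str,
    max_raw_chars: int = 2000,  # hypothetical default limit
) -> dict:
    """Normalize a subagent result into a compact, inspectable envelope."""
    # dict.fromkeys drops duplicate sources while keeping first-seen order.
    deduped_sources = list(dict.fromkeys(sources))
    return {
        "schema": "subagent.result.v1",
        # Drop blank findings and trim surrounding whitespace.
        "findings": [f.strip() for f in findings if f.strip()],
        "sources": deduped_sources,
        "confidence": confidence,
        "verification_status": verification_status,
        "raw_output": raw_output[:max_raw_chars],
        # Explicit marker so consumers know the raw output was cut.
        "raw_output_truncated": len(raw_output) > max_raw_chars,
    }
```

A caller would pass the subagent's findings and source paths and store or forward the returned dict, using the truncation marker to decide whether the full raw output needs to be fetched from elsewhere.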
Remaining Follow-Up
- No open implementation gap is tracked for this rollout. Future expansion should stay conservative: add new checkpoint scopes, artifact validators, or context-pruning actions only when the affected state has explicit restore or audit semantics.