Agent Harness Comparison And Improvement Plan¶
Added 2026-04-22. This is an implementation-aligned comparison between SecondBrain and a reference harness pattern assembled from direct source inspection. It is intended as an engineering reference, not a product claim sheet.
Purpose¶
- compare a reference harness pattern against the current SecondBrain implementation
- identify where SecondBrain already has parity or stronger local primitives
- identify concrete gaps worth improving inside this repo
- avoid vague "agent platform" language and stay tied to code paths
Scope¶
The comparison covers:
- session resume and history
- context compaction
- memory discovery and injection
- permissions and approvals
- pre/post tool hooks
- sandbox and runtime isolation
- background task lifecycle
- subagent and swarm coordination
- task-manager surfaces
- channel and gateway routing
- stream event contracts
Side-By-Side Matrix¶
| Concept | Reference harness pattern | SecondBrain | Current assessment |
|---|---|---|---|
| Session resume and history | Persists per-project session snapshots plus latest-session state, including carry-forward metadata for long sessions and resumed operators. | Persists chat sessions, messages, and structured session events in SQLite, with selective resume, resume cursors, and branch-from-checkpoint flows in CLI and serve surfaces. Main references: brain/chat/session_store.py, brain/cli/chat.py, brain/serve/routers/chat.py, brain/serve/routers/sessions.py. |
SecondBrain now has durable replay plus selective resume from checkpoints, approvals, or arbitrary session events, including branch creation for both chat and background work. |
| Context compaction | Uses a richer long-session compaction pipeline: microcompact, continuity-style carry-forward compaction, full LLM compaction, and UI-visible compaction phases. | Has budget-aware warning and compaction logic in the harness with streamed warning and compaction events, persisted continuity checkpoints, resume cursors, and UI-visible preserved-state summaries. Main references: brain/context/compaction.py, brain/agent/harness.py, brain/chat/session_store.py, brain/serve/chat_runtime.py. |
SecondBrain now persists continuity checkpoints with enough state to inspect and branch from compaction boundaries. |
| Memory discovery and injection | Pulls repo-local instruction files, memory files, local rules, and session-derived personalization into the prompt layer. | Injects memory retrieval into the harness with semantic and FTS retrieval, token budgeting, and retrieval summaries. Main references: brain/memory/retriever.py, brain/agent/harness.py, brain/serve/payloads.py. |
SecondBrain is strong on retrieval quality and memory budgeting, but weaker on repo-local instruction discovery and explicit write-back from session behavior into durable operator rules. |
| Permissions and approvals | Uses path rules, deny patterns, permission modes, and interactive approval UIs. | Uses kernel tool permissions, runtime approval policies, gateway approval persistence, and streamed approval events. Main references: brain/kernel/tooling/permissions.py, brain/agent_runtime/policy.py, brain/serve/chat_runtime.py, brain/gateway/gateway.py. |
SecondBrain now exposes a canonical policy_decision event family across live chat and background sessions, with approval-required and approval-applied decisions tied back to the same stream contract. |
| Pre/post tool hooks | Has explicit lifecycle hook definitions, loaders, hot reload, and hook executors for command, HTTP, prompt, and agent validators. | Has executor-level hook support and harness callback interception before and after tool execution, now surfaced as persisted hook_event stream entries and rendered in the web inspector. Main references: brain/kernel/hooks.py, brain/kernel/tooling/executor.py, brain/agent/callbacks.py, brain/serve/chat_runtime.py. |
SecondBrain now has an operator-visible hook trace instead of callback-only internals. |
| Sandbox and runtime isolation | Supports a long-lived containerized sandbox session, path validation, and command wrapping. | Supports sandbox routing, bounded subprocess execution, git worktrees, and prepared task workspaces. Main references: brain/sandbox/router.py, brain/workflows/executor.py, brain/agent/worktree.py, brain/background_sessions/runtime.py. |
SecondBrain still distinguishes workspace isolation from hard sandboxing, but sandbox selection and fallback are now explicit operator-visible runtime signals instead of hidden implementation detail. |
| Background task lifecycle | Has an explicit background task manager for shell and agent tasks, output logs, restartability, stop, list, and output-tail operations. | Has durable background sessions with status, heartbeat, checkpoints, resume, cancel, approval pause, and SSE replay. Main references: brain/background_sessions/models.py, brain/background_sessions/runtime.py, brain/background_sessions/store.py, brain/serve/routers/sessions.py. |
SecondBrain is stronger on persisted background session state and checkpoint streaming. The reference pattern is stronger on task-manager ergonomics for lightweight spawned workers. |
| Subagent and swarm coordination | Has a more explicit swarm model with teams, mailboxes, worktree isolation, and background-agent task registration. | Has bounded subagent spawning, lane-based concurrency, team-style orchestration, and A2A message plumbing. Main references: brain/agent/subagent.py, brain/agent/subagent_store.py, brain/agent/lanes.py, brain/agent/tools.py, brain/a2a/*. |
SecondBrain now persists delegated-work snapshots across process boundaries, streams subagent lifecycle events, and exposes those records in serve session detail payloads and the web chat inspector. |
| Task-manager surfaces | Exposes task create, list, inspect, output, stop, and send-message flows through tools and UI. | Exposes task planning, execution, approval, and workflow state through CLI, slash commands, stores, and web pages. Main references: brain/cli/tasks.py, brain/chat/commands/tasks.py, brain/tasks/*, serve-ui/src/pages/ActionsPage.tsx, serve-ui/src/pages/OperationsPage.tsx. |
SecondBrain now includes a unified operator-facing background-session surface inside the existing operations UI, with launch, inspect, resume, cancel, checkpoint review, and branch-from-resume controls tied to durable session state. |
| Channel and gateway routing | Uses a channel bus and thread/session-key routing into long-lived channel runtimes. | Uses gateway envelopes, routing, lanes, and channel adapters tied to shared session state. Main references: brain/gateway/*, brain/channels/*. |
SecondBrain already has strong parity here and a more explicit governed runtime posture. |
| Stream event contract | Uses typed stream events for text deltas, tool execution, compact progress, and UI protocol updates. | Uses an implementation-aligned SSE contract with canonical event types and a matching reducer in the web UI. Main references: brain/serve/chat_runtime.py, docs/specs/agent-harness-frontend-contract.md, serve-ui/src/lib/chat.ts. |
SecondBrain now carries transport, policy, hook, sandbox, and subagent diagnostics inside the same tested canonical event contract. |
Where SecondBrain Is Already Strong¶
- durable approvals and gateway governance are more explicit and audit-friendly
- background sessions already persist richer state than a simple task log
- the serve/UI stream contract is documented and tested as a first-class interface
- worktree and workspace preparation are more integrated with repo-native tasks
- retrieval, context packs, and memory token budgeting are stronger than a simple prompt-assembly layer
Closed Gaps In This Implementation¶
1. Long-Session Continuity¶
SecondBrain now persists explicit continuity checkpoints instead of treating compaction as only a local turn-time budget reaction.
Delivered changes:
- persisted
session_checkpointcontinuity records withsummary,preserved_messages,tokens_saved, andresume_cursor - added user-visible resume points in session detail payloads
- added branch-from-checkpoint/event APIs for both chat and background sessions
- seeded branch sessions with synthetic continuity context instead of silently replaying raw history
Relevant code:
brain/agent/harness.pybrain/chat/session_store.pybrain/serve/chat_runtime.pybrain/serve/routers/chat.pybrain/serve/routers/sessions.py
2. Delegated Work Control Plane¶
SecondBrain now persists and surfaces delegated work as a durable operator object instead of only as in-process subagent state.
Delivered changes:
- added runtime-backed
subagent_runspersistence - persisted subagent lifecycle snapshots across spawn, completion, failure, and cancel
- streamed
subagent_eventinto the same session event contract as chat/background work - exposed subagent state in session detail payloads and the web chat inspector
Relevant code:
brain/agent/subagent.pybrain/agent/subagent_store.pybrain/agent/tools.pybrain/serve/payloads.pyserve-ui/src/pages/ChatPage.tsx
3. Unified Policy Surface¶
SecondBrain now exposes one canonical operator-facing policy event family even though underlying permission and approval primitives still live in separate modules.
Delivered changes:
- standardized
policy_decisionpayloads across live chat and background flows - emitted policy configuration, approval-required, and approval-applied decisions through the canonical stream contract
- persisted the same event family to structured session events for replay and inspection
Relevant code:
brain/agent_runtime/policy.pybrain/serve/chat_runtime.pybrain/background_sessions/runtime.py
4. Hook Observability¶
SecondBrain now emits hook lifecycle traces into the runtime stream and session event ledger.
Delivered changes:
- emitted structured
hook_eventrows for pre- and post-tool phases - persisted hook traces to
session_events - rendered hook diagnostics in the web chat inspector
Relevant code:
brain/kernel/tooling/executor.pybrain/agent/callbacks.pybrain/serve/chat_runtime.pyserve-ui/src/lib/chat.ts
5. Stronger Isolation Defaults¶
SecondBrain now makes isolation and fallback behavior explicit to the operator instead of leaving it implicit in runtime selection logic.
Delivered changes:
- emitted
sandbox_noticefor live chat and background sessions - surfaced when a run used a prepared workspace versus best-effort local execution
- carried sandbox posture into
session_stateand session detail payloads
Relevant code:
brain/background_sessions/runtime.pybrain/serve/chat_runtime.pyserve-ui/src/pages/ChatPage.tsx
6. Transport Diagnostics In The Main Stream¶
SecondBrain now treats degraded transport/provider behavior as a first-class runtime event family.
Delivered changes:
- emitted
transport_diagnosticthroughSTREAM_EVENT_TYPES - persisted transport diagnostics beside tool, approval, and completion events
- rendered transport diagnostics in the web timeline and diagnostic inspector
Relevant code:
brain/chat/transport.pybrain/serve/chat_runtime.pyserve-ui/src/lib/chat.ts
7. Operator Surface Consolidation¶
SecondBrain now closes the remaining ergonomic gap by exposing durable background-session control directly in the existing operations UI.
Delivered changes:
- added a
Background Sessionsmode to the operations page - surfaced launch, inspect, resume, cancel, and branch-from-resume flows in one place
- rendered checkpoints, resume points, and delegated-work snapshots alongside the session runtime record
- reused the existing
/sessionsAPIs instead of introducing a parallel control plane
Relevant code:
brain/serve/routers/sessions.pyserve-ui/src/lib/api.tsserve-ui/src/pages/OperationsPage.tsxserve-ui/src/pages/OperationsPage.test.tsx
Delivery Order¶
Implemented in this change set:
- delegated work persistence and lifecycle streaming
- transport, policy, hook, and sandbox event unification
- long-session continuity checkpoints and branchable resume
- serve session detail payload expansion for resume points and subagents
- web inspector support for the new runtime diagnostics
- unified background-session operator controls in the operations UI
- targeted stream, background-session, subagent, and reducer test coverage
Change Since This Comparison¶
The implementation work described above has now landed. SecondBrain now includes:
- richer subagent snapshots plus runtime persistence in
brain/agent/subagent.pyandbrain/agent/subagent_store.py list_subagentsandcancel_subagentinbrain/agent/tools.py- continuity-aware compaction checkpoints and branchable resume in
brain/chat/session_store.py - canonical
policy_decision,hook_event,sandbox_notice,transport_diagnostic, andsubagent_eventstream families inbrain/serve/chat_runtime.py - expanded serve session detail payloads for
resume_pointsandsubagentsinbrain/serve/payloads.py - frontend reducer and inspector support for those runtime diagnostics in
serve-ui/src/lib/chat.tsandserve-ui/src/pages/ChatPage.tsx - a unified
Background Sessionsoperator surface inserve-ui/src/pages/OperationsPage.tsx - targeted verification in
tests/infra/test_serve_stream_contract.py,tests/infra/test_background_sessions.py,tests/infra/test_serve_api.py,tests/agent/test_subagent_registry.py, andserve-ui/src/lib/chat.test.ts
The original gaps in this comparison are now closed as shipped implementation work rather than remaining as planning-only notes.