Agent Harness Comparison And Improvement Plan¶

Added 2026-04-22. This is an implementation-aligned comparison between SecondBrain and a reference harness pattern assembled from direct source inspection. It is intended as an engineering reference, not a product claim sheet.

Purpose¶

compare a reference harness pattern against the current SecondBrain implementation
identify where SecondBrain already has parity or stronger local primitives
identify concrete gaps worth improving inside this repo
avoid vague "agent platform" language and stay tied to code paths

Scope¶

The comparison covers:

session resume and history
context compaction
memory discovery and injection
permissions and approvals
pre/post tool hooks
sandbox and runtime isolation
background task lifecycle
subagent and swarm coordination
task-manager surfaces
channel and gateway routing
stream event contracts

Side-By-Side Matrix¶

Concept	Reference harness pattern	SecondBrain	Current assessment
Session resume and history	Persists per-project session snapshots plus latest-session state, including carry-forward metadata for long sessions and resumed operators.	Persists chat sessions, messages, and structured session events in SQLite, with selective resume, resume cursors, and branch-from-checkpoint flows in CLI and serve surfaces. Main references: `brain/chat/session_store.py`, `brain/cli/chat.py`, `brain/serve/routers/chat.py`, `brain/serve/routers/sessions.py`.	SecondBrain now has durable replay plus selective resume from checkpoints, approvals, or arbitrary session events, including branch creation for both chat and background work.
Context compaction	Uses a richer long-session compaction pipeline: microcompact, continuity-style carry-forward compaction, full LLM compaction, and UI-visible compaction phases.	Has budget-aware warning and compaction logic in the harness with streamed warning and compaction events, persisted continuity checkpoints, resume cursors, and UI-visible preserved-state summaries. Main references: `brain/context/compaction.py`, `brain/agent/harness.py`, `brain/chat/session_store.py`, `brain/serve/chat_runtime.py`.	SecondBrain now persists continuity checkpoints with enough state to inspect and branch from compaction boundaries.
Memory discovery and injection	Pulls repo-local instruction files, memory files, local rules, and session-derived personalization into the prompt layer.	Injects memory retrieval into the harness with semantic and FTS retrieval, token budgeting, and retrieval summaries. Main references: `brain/memory/retriever.py`, `brain/agent/harness.py`, `brain/serve/payloads.py`.	SecondBrain is strong on retrieval quality and memory budgeting, but weaker on repo-local instruction discovery and explicit write-back from session behavior into durable operator rules.
Permissions and approvals	Uses path rules, deny patterns, permission modes, and interactive approval UIs.	Uses kernel tool permissions, runtime approval policies, gateway approval persistence, and streamed approval events. Main references: `brain/kernel/tooling/permissions.py`, `brain/agent_runtime/policy.py`, `brain/serve/chat_runtime.py`, `brain/gateway/gateway.py`.	SecondBrain now exposes a canonical `policy_decision` event family across live chat and background sessions, with approval-required and approval-applied decisions tied back to the same stream contract.
Pre/post tool hooks	Has explicit lifecycle hook definitions, loaders, hot reload, and hook executors for command, HTTP, prompt, and agent validators.	Has executor-level hook support and harness callback interception before and after tool execution, now surfaced as persisted `hook_event` stream entries and rendered in the web inspector. Main references: `brain/kernel/hooks.py`, `brain/kernel/tooling/executor.py`, `brain/agent/callbacks.py`, `brain/serve/chat_runtime.py`.	SecondBrain now has an operator-visible hook trace instead of callback-only internals.
Sandbox and runtime isolation	Supports a long-lived containerized sandbox session, path validation, and command wrapping.	Supports sandbox routing, bounded subprocess execution, git worktrees, and prepared task workspaces. Main references: `brain/sandbox/router.py`, `brain/workflows/executor.py`, `brain/agent/worktree.py`, `brain/background_sessions/runtime.py`.	SecondBrain still distinguishes workspace isolation from hard sandboxing, but sandbox selection and fallback are now explicit operator-visible runtime signals instead of hidden implementation detail.
Background task lifecycle	Has an explicit background task manager for shell and agent tasks, output logs, restartability, stop, list, and output-tail operations.	Has durable background sessions with status, heartbeat, checkpoints, resume, cancel, approval pause, and SSE replay. Main references: `brain/background_sessions/models.py`, `brain/background_sessions/runtime.py`, `brain/background_sessions/store.py`, `brain/serve/routers/sessions.py`.	SecondBrain is stronger on persisted background session state and checkpoint streaming. The reference pattern is stronger on task-manager ergonomics for lightweight spawned workers.
Subagent and swarm coordination	Has a more explicit swarm model with teams, mailboxes, worktree isolation, and background-agent task registration.	Has bounded subagent spawning, lane-based concurrency, team-style orchestration, and A2A message plumbing. Main references: `brain/agent/subagent.py`, `brain/agent/subagent_store.py`, `brain/agent/lanes.py`, `brain/agent/tools.py`, `brain/a2a/*`.	SecondBrain now persists delegated-work snapshots across process boundaries, streams subagent lifecycle events, and exposes those records in serve session detail payloads and the web chat inspector.
Task-manager surfaces	Exposes task create, list, inspect, output, stop, and send-message flows through tools and UI.	Exposes task planning, execution, approval, and workflow state through CLI, slash commands, stores, and web pages. Main references: `brain/cli/tasks.py`, `brain/chat/commands/tasks.py`, `brain/tasks/*`, `serve-ui/src/pages/ActionsPage.tsx`, `serve-ui/src/pages/OperationsPage.tsx`.	SecondBrain now includes a unified operator-facing background-session surface inside the existing operations UI, with launch, inspect, resume, cancel, checkpoint review, and branch-from-resume controls tied to durable session state.
Channel and gateway routing	Uses a channel bus and thread/session-key routing into long-lived channel runtimes.	Uses gateway envelopes, routing, lanes, and channel adapters tied to shared session state. Main references: `brain/gateway/`, `brain/channels/`.	SecondBrain already has strong parity here and a more explicit governed runtime posture.
Stream event contract	Uses typed stream events for text deltas, tool execution, compact progress, and UI protocol updates.	Uses an implementation-aligned SSE contract with canonical event types and a matching reducer in the web UI. Main references: `brain/serve/chat_runtime.py`, `docs/specs/agent-harness-frontend-contract.md`, `serve-ui/src/lib/chat.ts`.	SecondBrain now carries transport, policy, hook, sandbox, and subagent diagnostics inside the same tested canonical event contract.

Where SecondBrain Is Already Strong¶

durable approvals and gateway governance are more explicit and audit-friendly
background sessions already persist richer state than a simple task log
the serve/UI stream contract is documented and tested as a first-class interface
worktree and workspace preparation are more integrated with repo-native tasks
retrieval, context packs, and memory token budgeting are stronger than a simple prompt-assembly layer

Closed Gaps In This Implementation¶

1. Long-Session Continuity¶

SecondBrain now persists explicit continuity checkpoints instead of treating compaction as only a local turn-time budget reaction.

Delivered changes:

persisted session_checkpoint continuity records with summary, preserved_messages, tokens_saved, and resume_cursor
added user-visible resume points in session detail payloads
added branch-from-checkpoint/event APIs for both chat and background sessions
seeded branch sessions with synthetic continuity context instead of silently replaying raw history

Relevant code:

brain/agent/harness.py
brain/chat/session_store.py
brain/serve/chat_runtime.py
brain/serve/routers/chat.py
brain/serve/routers/sessions.py

2. Delegated Work Control Plane¶

SecondBrain now persists and surfaces delegated work as a durable operator object instead of only as in-process subagent state.

Delivered changes:

added runtime-backed subagent_runs persistence
persisted subagent lifecycle snapshots across spawn, completion, failure, and cancel
streamed subagent_event into the same session event contract as chat/background work
exposed subagent state in session detail payloads and the web chat inspector

Relevant code:

brain/agent/subagent.py
brain/agent/subagent_store.py
brain/agent/tools.py
brain/serve/payloads.py
serve-ui/src/pages/ChatPage.tsx

3. Unified Policy Surface¶

SecondBrain now exposes one canonical operator-facing policy event family even though underlying permission and approval primitives still live in separate modules.

Delivered changes:

standardized policy_decision payloads across live chat and background flows
emitted policy configuration, approval-required, and approval-applied decisions through the canonical stream contract
persisted the same event family to structured session events for replay and inspection

Relevant code:

brain/agent_runtime/policy.py
brain/serve/chat_runtime.py
brain/background_sessions/runtime.py

4. Hook Observability¶

SecondBrain now emits hook lifecycle traces into the runtime stream and session event ledger.

Delivered changes:

emitted structured hook_event rows for pre- and post-tool phases
persisted hook traces to session_events
rendered hook diagnostics in the web chat inspector

Relevant code:

brain/kernel/tooling/executor.py
brain/agent/callbacks.py
brain/serve/chat_runtime.py
serve-ui/src/lib/chat.ts

5. Stronger Isolation Defaults¶

SecondBrain now makes isolation and fallback behavior explicit to the operator instead of leaving it implicit in runtime selection logic.

Delivered changes:

emitted sandbox_notice for live chat and background sessions
surfaced when a run used a prepared workspace versus best-effort local execution
carried sandbox posture into session_state and session detail payloads

Relevant code:

brain/background_sessions/runtime.py
brain/serve/chat_runtime.py
serve-ui/src/pages/ChatPage.tsx

6. Transport Diagnostics In The Main Stream¶

SecondBrain now treats degraded transport/provider behavior as a first-class runtime event family.

Delivered changes:

emitted transport_diagnostic through STREAM_EVENT_TYPES
persisted transport diagnostics beside tool, approval, and completion events
rendered transport diagnostics in the web timeline and diagnostic inspector

Relevant code:

brain/chat/transport.py
brain/serve/chat_runtime.py
serve-ui/src/lib/chat.ts

7. Operator Surface Consolidation¶

SecondBrain now closes the remaining ergonomic gap by exposing durable background-session control directly in the existing operations UI.

Delivered changes:

added a Background Sessions mode to the operations page
surfaced launch, inspect, resume, cancel, and branch-from-resume flows in one place
rendered checkpoints, resume points, and delegated-work snapshots alongside the session runtime record
reused the existing /sessions APIs instead of introducing a parallel control plane

Relevant code:

brain/serve/routers/sessions.py
serve-ui/src/lib/api.ts
serve-ui/src/pages/OperationsPage.tsx
serve-ui/src/pages/OperationsPage.test.tsx

Delivery Order¶

Implemented in this change set:

delegated work persistence and lifecycle streaming
transport, policy, hook, and sandbox event unification
long-session continuity checkpoints and branchable resume
serve session detail payload expansion for resume points and subagents
web inspector support for the new runtime diagnostics
unified background-session operator controls in the operations UI
targeted stream, background-session, subagent, and reducer test coverage

Change Since This Comparison¶

The implementation work described above has now landed. SecondBrain now includes:

richer subagent snapshots plus runtime persistence in brain/agent/subagent.py and brain/agent/subagent_store.py
list_subagents and cancel_subagent in brain/agent/tools.py
continuity-aware compaction checkpoints and branchable resume in brain/chat/session_store.py
canonical policy_decision, hook_event, sandbox_notice, transport_diagnostic, and subagent_event stream families in brain/serve/chat_runtime.py
expanded serve session detail payloads for resume_points and subagents in brain/serve/payloads.py
frontend reducer and inspector support for those runtime diagnostics in serve-ui/src/lib/chat.ts and serve-ui/src/pages/ChatPage.tsx
a unified Background Sessions operator surface in serve-ui/src/pages/OperationsPage.tsx
targeted verification in tests/infra/test_serve_stream_contract.py, tests/infra/test_background_sessions.py, tests/infra/test_serve_api.py, tests/agent/test_subagent_registry.py, and serve-ui/src/lib/chat.test.ts

The original gaps in this comparison are now closed as shipped implementation work rather than remaining as planning-only notes.