Skip to content

Agent Harness Comparison And Improvement Plan

Added 2026-04-22. This is an implementation-aligned comparison between SecondBrain and a reference harness pattern assembled from direct source inspection. It is intended as an engineering reference, not a product claim sheet.

Purpose

  • compare a reference harness pattern against the current SecondBrain implementation
  • identify where SecondBrain already has parity or stronger local primitives
  • identify concrete gaps worth improving inside this repo
  • avoid vague "agent platform" language and stay tied to code paths

Scope

The comparison covers:

  • session resume and history
  • context compaction
  • memory discovery and injection
  • permissions and approvals
  • pre/post tool hooks
  • sandbox and runtime isolation
  • background task lifecycle
  • subagent and swarm coordination
  • task-manager surfaces
  • channel and gateway routing
  • stream event contracts

Side-By-Side Matrix

Concept Reference harness pattern SecondBrain Current assessment
Session resume and history Persists per-project session snapshots plus latest-session state, including carry-forward metadata for long sessions and resumed operators. Persists chat sessions, messages, and structured session events in SQLite, with selective resume, resume cursors, and branch-from-checkpoint flows in CLI and serve surfaces. Main references: brain/chat/session_store.py, brain/cli/chat.py, brain/serve/routers/chat.py, brain/serve/routers/sessions.py. SecondBrain now has durable replay plus selective resume from checkpoints, approvals, or arbitrary session events, including branch creation for both chat and background work.
Context compaction Uses a richer long-session compaction pipeline: microcompact, continuity-style carry-forward compaction, full LLM compaction, and UI-visible compaction phases. Has budget-aware warning and compaction logic in the harness with streamed warning and compaction events, persisted continuity checkpoints, resume cursors, and UI-visible preserved-state summaries. Main references: brain/context/compaction.py, brain/agent/harness.py, brain/chat/session_store.py, brain/serve/chat_runtime.py. SecondBrain now persists continuity checkpoints with enough state to inspect and branch from compaction boundaries.
Memory discovery and injection Pulls repo-local instruction files, memory files, local rules, and session-derived personalization into the prompt layer. Injects memory retrieval into the harness with semantic and FTS retrieval, token budgeting, and retrieval summaries. Main references: brain/memory/retriever.py, brain/agent/harness.py, brain/serve/payloads.py. SecondBrain is strong on retrieval quality and memory budgeting, but weaker on repo-local instruction discovery and explicit write-back from session behavior into durable operator rules.
Permissions and approvals Uses path rules, deny patterns, permission modes, and interactive approval UIs. Uses kernel tool permissions, runtime approval policies, gateway approval persistence, and streamed approval events. Main references: brain/kernel/tooling/permissions.py, brain/agent_runtime/policy.py, brain/serve/chat_runtime.py, brain/gateway/gateway.py. SecondBrain now exposes a canonical policy_decision event family across live chat and background sessions, with approval-required and approval-applied decisions tied back to the same stream contract.
Pre/post tool hooks Has explicit lifecycle hook definitions, loaders, hot reload, and hook executors for command, HTTP, prompt, and agent validators. Has executor-level hook support and harness callback interception before and after tool execution, now surfaced as persisted hook_event stream entries and rendered in the web inspector. Main references: brain/kernel/hooks.py, brain/kernel/tooling/executor.py, brain/agent/callbacks.py, brain/serve/chat_runtime.py. SecondBrain now has an operator-visible hook trace instead of callback-only internals.
Sandbox and runtime isolation Supports a long-lived containerized sandbox session, path validation, and command wrapping. Supports sandbox routing, bounded subprocess execution, git worktrees, and prepared task workspaces. Main references: brain/sandbox/router.py, brain/workflows/executor.py, brain/agent/worktree.py, brain/background_sessions/runtime.py. SecondBrain still distinguishes workspace isolation from hard sandboxing, but sandbox selection and fallback are now explicit operator-visible runtime signals instead of hidden implementation detail.
Background task lifecycle Has an explicit background task manager for shell and agent tasks, output logs, restartability, stop, list, and output-tail operations. Has durable background sessions with status, heartbeat, checkpoints, resume, cancel, approval pause, and SSE replay. Main references: brain/background_sessions/models.py, brain/background_sessions/runtime.py, brain/background_sessions/store.py, brain/serve/routers/sessions.py. SecondBrain is stronger on persisted background session state and checkpoint streaming. The reference pattern is stronger on task-manager ergonomics for lightweight spawned workers.
Subagent and swarm coordination Has a more explicit swarm model with teams, mailboxes, worktree isolation, and background-agent task registration. Has bounded subagent spawning, lane-based concurrency, team-style orchestration, and A2A message plumbing. Main references: brain/agent/subagent.py, brain/agent/subagent_store.py, brain/agent/lanes.py, brain/agent/tools.py, brain/a2a/*. SecondBrain now persists delegated-work snapshots across process boundaries, streams subagent lifecycle events, and exposes those records in serve session detail payloads and the web chat inspector.
Task-manager surfaces Exposes task create, list, inspect, output, stop, and send-message flows through tools and UI. Exposes task planning, execution, approval, and workflow state through CLI, slash commands, stores, and web pages. Main references: brain/cli/tasks.py, brain/chat/commands/tasks.py, brain/tasks/*, serve-ui/src/pages/ActionsPage.tsx, serve-ui/src/pages/OperationsPage.tsx. SecondBrain now includes a unified operator-facing background-session surface inside the existing operations UI, with launch, inspect, resume, cancel, checkpoint review, and branch-from-resume controls tied to durable session state.
Channel and gateway routing Uses a channel bus and thread/session-key routing into long-lived channel runtimes. Uses gateway envelopes, routing, lanes, and channel adapters tied to shared session state. Main references: brain/gateway/*, brain/channels/*. SecondBrain already has strong parity here and a more explicit governed runtime posture.
Stream event contract Uses typed stream events for text deltas, tool execution, compact progress, and UI protocol updates. Uses an implementation-aligned SSE contract with canonical event types and a matching reducer in the web UI. Main references: brain/serve/chat_runtime.py, docs/specs/agent-harness-frontend-contract.md, serve-ui/src/lib/chat.ts. SecondBrain now carries transport, policy, hook, sandbox, and subagent diagnostics inside the same tested canonical event contract.

Where SecondBrain Is Already Strong

  • durable approvals and gateway governance are more explicit and audit-friendly
  • background sessions already persist richer state than a simple task log
  • the serve/UI stream contract is documented and tested as a first-class interface
  • worktree and workspace preparation are more integrated with repo-native tasks
  • retrieval, context packs, and memory token budgeting are stronger than a simple prompt-assembly layer

Closed Gaps In This Implementation

1. Long-Session Continuity

SecondBrain now persists explicit continuity checkpoints instead of treating compaction as only a local turn-time budget reaction.

Delivered changes:

  • persisted session_checkpoint continuity records with summary, preserved_messages, tokens_saved, and resume_cursor
  • added user-visible resume points in session detail payloads
  • added branch-from-checkpoint/event APIs for both chat and background sessions
  • seeded branch sessions with synthetic continuity context instead of silently replaying raw history

Relevant code:

  • brain/agent/harness.py
  • brain/chat/session_store.py
  • brain/serve/chat_runtime.py
  • brain/serve/routers/chat.py
  • brain/serve/routers/sessions.py

2. Delegated Work Control Plane

SecondBrain now persists and surfaces delegated work as a durable operator object instead of only as in-process subagent state.

Delivered changes:

  • added runtime-backed subagent_runs persistence
  • persisted subagent lifecycle snapshots across spawn, completion, failure, and cancel
  • streamed subagent_event into the same session event contract as chat/background work
  • exposed subagent state in session detail payloads and the web chat inspector

Relevant code:

  • brain/agent/subagent.py
  • brain/agent/subagent_store.py
  • brain/agent/tools.py
  • brain/serve/payloads.py
  • serve-ui/src/pages/ChatPage.tsx

3. Unified Policy Surface

SecondBrain now exposes one canonical operator-facing policy event family even though underlying permission and approval primitives still live in separate modules.

Delivered changes:

  • standardized policy_decision payloads across live chat and background flows
  • emitted policy configuration, approval-required, and approval-applied decisions through the canonical stream contract
  • persisted the same event family to structured session events for replay and inspection

Relevant code:

  • brain/agent_runtime/policy.py
  • brain/serve/chat_runtime.py
  • brain/background_sessions/runtime.py

4. Hook Observability

SecondBrain now emits hook lifecycle traces into the runtime stream and session event ledger.

Delivered changes:

  • emitted structured hook_event rows for pre- and post-tool phases
  • persisted hook traces to session_events
  • rendered hook diagnostics in the web chat inspector

Relevant code:

  • brain/kernel/tooling/executor.py
  • brain/agent/callbacks.py
  • brain/serve/chat_runtime.py
  • serve-ui/src/lib/chat.ts

5. Stronger Isolation Defaults

SecondBrain now makes isolation and fallback behavior explicit to the operator instead of leaving it implicit in runtime selection logic.

Delivered changes:

  • emitted sandbox_notice for live chat and background sessions
  • surfaced when a run used a prepared workspace versus best-effort local execution
  • carried sandbox posture into session_state and session detail payloads

Relevant code:

  • brain/background_sessions/runtime.py
  • brain/serve/chat_runtime.py
  • serve-ui/src/pages/ChatPage.tsx

6. Transport Diagnostics In The Main Stream

SecondBrain now treats degraded transport/provider behavior as a first-class runtime event family.

Delivered changes:

  • emitted transport_diagnostic through STREAM_EVENT_TYPES
  • persisted transport diagnostics beside tool, approval, and completion events
  • rendered transport diagnostics in the web timeline and diagnostic inspector

Relevant code:

  • brain/chat/transport.py
  • brain/serve/chat_runtime.py
  • serve-ui/src/lib/chat.ts

7. Operator Surface Consolidation

SecondBrain now closes the remaining ergonomic gap by exposing durable background-session control directly in the existing operations UI.

Delivered changes:

  • added a Background Sessions mode to the operations page
  • surfaced launch, inspect, resume, cancel, and branch-from-resume flows in one place
  • rendered checkpoints, resume points, and delegated-work snapshots alongside the session runtime record
  • reused the existing /sessions APIs instead of introducing a parallel control plane

Relevant code:

  • brain/serve/routers/sessions.py
  • serve-ui/src/lib/api.ts
  • serve-ui/src/pages/OperationsPage.tsx
  • serve-ui/src/pages/OperationsPage.test.tsx

Delivery Order

Implemented in this change set:

  1. delegated work persistence and lifecycle streaming
  2. transport, policy, hook, and sandbox event unification
  3. long-session continuity checkpoints and branchable resume
  4. serve session detail payload expansion for resume points and subagents
  5. web inspector support for the new runtime diagnostics
  6. unified background-session operator controls in the operations UI
  7. targeted stream, background-session, subagent, and reducer test coverage

Change Since This Comparison

The implementation work described above has now landed. SecondBrain now includes:

  • richer subagent snapshots plus runtime persistence in brain/agent/subagent.py and brain/agent/subagent_store.py
  • list_subagents and cancel_subagent in brain/agent/tools.py
  • continuity-aware compaction checkpoints and branchable resume in brain/chat/session_store.py
  • canonical policy_decision, hook_event, sandbox_notice, transport_diagnostic, and subagent_event stream families in brain/serve/chat_runtime.py
  • expanded serve session detail payloads for resume_points and subagents in brain/serve/payloads.py
  • frontend reducer and inspector support for those runtime diagnostics in serve-ui/src/lib/chat.ts and serve-ui/src/pages/ChatPage.tsx
  • a unified Background Sessions operator surface in serve-ui/src/pages/OperationsPage.tsx
  • targeted verification in tests/infra/test_serve_stream_contract.py, tests/infra/test_background_sessions.py, tests/infra/test_serve_api.py, tests/agent/test_subagent_registry.py, and serve-ui/src/lib/chat.test.ts

The original gaps in this comparison are now closed as shipped implementation work rather than remaining as planning-only notes.