Agent Harness Frontend Contract¶
Added 2026-04-21. This document is the implementation-aligned source of truth for how agent runtime state reaches frontend clients.
Purpose¶
- define the canonical event contract from harness to frontend
- separate stable operator-facing signals from debug/detail payloads
- inventory current live-chat and background-session exposure
- identify bridge-engine fidelity limits
Runtime Boundary¶
flowchart LR
AH["AgentHarness / AgenticRuntime"] --> CB["Callbacks"]
CB --> SSE["Serve SSE envelopes"]
CB --> DB["Session events + checkpoints"]
SSE --> UI["serve-ui reducer + timeline"]
DB --> UI
BG["BackgroundSession runtime"] --> DB
BG --> SSE
Primary paths:
- live chat: brain/agent/harness.py -> brain/serve/chat_runtime.py -> serve-ui/src/lib/chat.ts
- background/agentic: brain/background_sessions/runtime.py -> session events/checkpoints -> /sessions/{id}/events/stream
- health snapshots: background session row + session events + tool observations + checkpoints/artifacts/memory proposals -> /sessions/{id}/health/stream -> cockpit/agent-run UI
Exposure Tiers¶
Default operator UI:
- response text
- thinking/reflection summaries
- tool timeline and outcomes
- approvals
- agent health score, risk flags, budget pressure, memory provenance, audit rows, and recovery hint
- citations and context-pack summary
- Antahkarana / guardrail summary
- background checkpoints and step progress
Debug UI:
- raw event payloads
- raw tool args/results/outcomes
- context warning / compaction payloads
- tool-selection and validation payloads
The frontend must not invent hidden chain-of-thought beyond the existing
thinking stream content.
Canonical Event Inventory¶
| Event | Source | Persisted | Frontend role | Tier |
|---|---|---|---|---|
| session | serve chat bootstrap | no | assign session id | default |
| session_state | serve chat + background runtime | yes | runtime/status card | default |
| session_checkpoint | background runtime | yes | checkpoint panel | default |
| policy_decision | serve chat + background runtime | yes | governance timeline and diagnostics | debug |
| hook_event | harness callbacks | yes | hook lifecycle trace | debug |
| sandbox_notice | serve chat + background runtime | yes | execution-isolation notice | debug |
| transport_diagnostic | serve chat finalizer | yes | provider fallback / degraded-runtime diagnostics | debug |
| subagent_event | subagent registry + tool callbacks | yes | delegated-work lifecycle | debug |
| turn_start, turn_start_tools | harness callbacks | yes | turn framing | debug |
| token, response | provider streaming + harness | yes | assistant text | default |
| thinking, reflection | harness/reflection engine | yes | reasoning surface | default |
| tool_call | harness callback | yes | live tool row start | default |
| tool_result | post-hook success path | yes | rendered tool result | default |
| tool_outcome | typed ToolOutcome path | yes | terminal tool status | default |
| tool_batch | harness scheduler | yes | wave diagnostics | debug |
| tool_selected | semantic tool selection | yes | selection diagnostics | debug |
| tool_validation_error | arg validation / repair loop | yes | diagnostics | debug |
| context_warning | context warning tracker | yes | warning banner/timeline | default |
| context_compacted | compactor | yes | compaction marker | default |
| step_progress | native agentic background runtime | yes | background step timeline | default |
| usage | turn finalizer | yes | token metrics | debug |
| retrieval | serve chat completion | yes | citations/context panel | default |
| antahkarana | serve chat cognition summary | yes | cognition/guardrail panel | default |
| approval_required, approval_applied | serve chat + background runtime | yes | approval UX | default |
| turn_end, completed, error | serve chat + background runtime | yes | turn final state | default |
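The tier column can drive what each UI surface renders. The following is a minimal sketch of that filtering, assuming events arrive as dicts with a "type" key (as in the payload examples in this document). The names EVENT_TIERS and visible_events are hypothetical, only a subset of the table is copied in, and treating unknown event types as debug-only is an assumption of the sketch rather than part of the contract.

```python
# Tier assignments copied from the inventory table (subset shown for brevity).
EVENT_TIERS = {
    "session": "default",
    "session_state": "default",
    "session_checkpoint": "default",
    "policy_decision": "debug",
    "hook_event": "debug",
    "tool_call": "default",
    "tool_result": "default",
    "tool_outcome": "default",
    "tool_batch": "debug",
    "usage": "debug",
    "context_compacted": "default",
}


def visible_events(events, debug=False):
    """Return the events a given UI tier should render.

    Unknown event types fall back to debug-only; that fallback is an
    assumption of this sketch, not documented behavior.
    """
    shown = []
    for event in events:
        tier = EVENT_TIERS.get(event.get("type"), "debug")
        if debug or tier == "default":
            shown.append(event)
    return shown


stream = [
    {"type": "tool_call", "payload": {"tool_call_id": "call-1"}},
    {"type": "tool_batch", "payload": {}},
    {"type": "usage", "payload": {}},
]
print([e["type"] for e in visible_events(stream)])  # ['tool_call']
```

With debug=True the same call returns all three events, which matches the idea that the debug UI is a superset of the default operator UI.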
Session State Contract¶
Common fields:
- session_id
- status
- provider
- model
- agent_profile
- permission_mode
- approval_mode
- think_level
- policy_path
- sandbox_mode
- sandbox_isolation
Optional background fields:
- engine
- runner_kind
- approval_request_id
- workspace_id
- task_graph_id
- process_id
- attempt_count
- max_retries
- expires_at
- parent_session_id
- branch_source_event_id
- branch_resume_cursor
- last_heartbeat_at
- last_error
- updated_at
- ended_at
- resume_context
session_state remains additive. New fields must not break existing clients.
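One way a client can honor the additive rule is to read only the fields it knows and carry everything else along untouched. The sketch below illustrates that idea under the assumption that the payload is a flat dict; read_session_state is a hypothetical helper, not a function from the codebase.

```python
# Common fields as listed in the Session State Contract.
COMMON_FIELDS = (
    "session_id", "status", "provider", "model", "agent_profile",
    "permission_mode", "approval_mode", "think_level", "policy_path",
    "sandbox_mode", "sandbox_isolation",
)


def read_session_state(payload):
    """Split a session_state payload into known fields and extras.

    Missing known fields become None and unknown fields are preserved,
    so a newer server adding fields does not break this reader.
    """
    known = {name: payload.get(name) for name in COMMON_FIELDS}
    extras = {k: v for k, v in payload.items() if k not in COMMON_FIELDS}
    return known, extras


known, extras = read_session_state(
    {"session_id": "bg_123", "status": "running", "future_field": 1}
)
print(known["status"], extras)  # running {'future_field': 1}
```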
Tool Lifecycle Contract¶
Tool lifecycle must preserve tool_call_id on every tool-related event:
- tool_call
- tool_result when a post-hook result exists
- tool_outcome for the typed terminal outcome
tool_outcome.status values currently used:
- ok
- timeout
- error
- denied
- artifact
tool_result is not a substitute for tool_outcome; it is the rendered
result surface for successful post-hook tool payloads.
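To illustrate why tool_call_id matters, here is a minimal sketch of a reducer that folds tool events into one row per call. It assumes each tool event carries tool_call_id inside its payload (as the payload examples in this document do for tool_call and tool_outcome); fold_tool_events, the row shape, and the interim "running" status are hypothetical.

```python
def fold_tool_events(events):
    """Group tool events by tool_call_id and derive each call's terminal status."""
    calls = {}
    for event in events:
        if event["type"] not in ("tool_call", "tool_result", "tool_outcome"):
            continue
        payload = event["payload"]
        row = calls.setdefault(
            payload["tool_call_id"],
            {"name": None, "status": "running", "rendered": None},
        )
        if event["type"] == "tool_call":
            row["name"] = payload.get("name")
        elif event["type"] == "tool_result":
            # Rendered surface only; it never sets the terminal status.
            row["rendered"] = payload
        else:
            # Only tool_outcome decides the terminal status.
            row["status"] = payload["status"]
    return calls


events = [
    {"type": "tool_call", "payload": {"name": "local_search", "tool_call_id": "call-1"}},
    {"type": "tool_call", "payload": {"name": "local_search", "tool_call_id": "call-2"}},
    {"type": "tool_outcome", "payload": {"tool_call_id": "call-2", "status": "timeout"}},
    {"type": "tool_result", "payload": {"tool_call_id": "call-1", "result": {"chunks": []}}},
    {"type": "tool_outcome", "payload": {"tool_call_id": "call-1", "status": "ok"}},
]
rows = fold_tool_events(events)
print(rows["call-1"]["status"], rows["call-2"]["status"])  # ok timeout
```

Both calls use the same tool name, yet each row resolves independently because the key is the call id rather than the name.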
Payload Examples¶
Live chat local_search¶
{
"type": "tool_call",
"payload": {
"name": "local_search",
"arguments": { "query": "roadmap", "_tool_call_id": "call-1" },
"tool_call_id": "call-1"
}
}
{
"type": "tool_outcome",
"payload": {
"tool_call_id": "call-1",
"tool_name": "local_search",
"outcome_type": "ToolExecutionResult",
"status": "ok",
"elapsed_ms": 84,
"result": { "chunks": [] }
}
}
Approval pause / resume¶
{
"type": "approval_required",
"payload": {
"request_id": "apr_123",
"tool_name": "write_file",
"reason": "Web chat turn requested a write-capable tool."
}
}
Context compaction¶
{
"type": "context_compacted",
"payload": {
"original_count": 24,
"compacted_count": 10,
"tokens_saved": 4200,
"summary": "Kept the recent working set and collapsed older tool chatter.",
"preserved_messages": 6,
"resume_cursor": "compact:serve_demo:14"
}
}
Background checkpoint¶
{
"type": "session_checkpoint",
"payload": {
"session_id": "bg_123",
"sequence": 4,
"checkpoint_type": "step",
"status": "completed",
"summary_text": "step-1 completed: Inspect files",
"resume_cursor": null
}
}
Policy / sandbox / transport / subagent diagnostics¶
{
"type": "policy_decision",
"payload": {
"subject": "turn_runtime",
"outcome": "configured",
"reason": "Resolved runtime execution mode for this chat turn.",
"source": "serve_chat_runtime",
"policy_path": "execution_mode -> approval_policy -> runtime_callbacks",
"permission_mode": "normal",
"approval_mode": "on_request"
}
}
{
"type": "sandbox_notice",
"payload": {
"requested_backend": "background_session",
"resolved_backend": "task_workspace",
"isolation": "workspace",
"reason": "Prepared a task workspace for write-capable background work."
}
}
{
"type": "transport_diagnostic",
"payload": {
"status": "degraded",
"kind": "provider_fallback",
"provider_requested": "openai",
"provider_actual": "anthropic",
"reason": "Primary provider failed health checks.",
"failure_count": 1
}
}
{
"type": "subagent_event",
"payload": {
"agent_id": "subagent_1",
"status": "completed",
"parent_session_id": "serve_chat_123"
}
}
Session Detail And Resume Contract¶
GET /chat/sessions/{session_id} now exposes additional operator-facing data:
- resume_points: ordered resume/branch candidates derived from checkpoints, approvals, and completed turns
- subagents: persisted delegated-work snapshots tied to the session
- background_artifacts: typed artifacts claimed by a background session, such as task graphs, isolated workspaces, agentic run records, and final outputs
- channel: a deterministic projection of the session event log into stable participants and stamped envelopes. The raw event log remains authoritative; the projection is for bounded timeline rendering and operator audit.
GET /sessions/{session_id} exposes the durable background row plus:
- checkpoints: latest persisted background checkpoints
- artifacts: typed background artifacts, ordered newest first
- health: current operator-facing health snapshot
- channel: the same participant/envelope projection exposed on chat session detail, rooted at the background session id
GET /sessions/{session_id}/health/stream emits health_snapshot SSE
envelopes whenever the derived snapshot changes. Start, resume, pause, cancel,
list, and detail responses also include health data so clients do not need to
wait for a separate refetch before rendering the current control-plane state.
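The emit-on-change behavior can be sketched as a small generator. This is an illustration only: it assumes snapshots are plain dicts compared by value and that the envelope reuses the "type"/"payload" shape from the payload examples in this document; changed_snapshots is a hypothetical name.

```python
def changed_snapshots(snapshots):
    """Yield a health_snapshot envelope only when the snapshot differs from the last one sent."""
    last = object()  # sentinel that never equals a real snapshot
    for snapshot in snapshots:
        if snapshot != last:
            yield {"type": "health_snapshot", "payload": snapshot}
            last = snapshot


derived = [{"score": 90}, {"score": 90}, {"score": 75}]
print(len(list(changed_snapshots(derived))))  # 2
```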
Health snapshots include policy.health and policy.enforcement. The policy is
stored locally in background-session metadata and currently supports these
operator budgets: minimum score, runtime minutes, total tokens, cost USD,
context usage percentage, failed tool calls, loop warnings, repeated plan
events, tool latency, and events without evidence. When a running/recovering
session crosses a configured threshold, the runtime/supervisor either creates a
normal local approval request and moves the session to awaiting_approval, or
pauses the session directly, depending on policy.health.action.
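The enforcement step described above might look roughly like the following. Everything about the data layout is assumed for illustration: the policy keys ("min_score", "max", "action"), the action values ("require_approval", "pause"), the metric names, and the "paused" status string are hypothetical; only the running/recovering precondition and the two outcomes come from the text.

```python
def enforce_health_policy(snapshot, policy):
    """Return the breached budgets and the next status, or None if nothing applies."""
    # Enforcement only applies to running or recovering sessions.
    if snapshot["status"] not in ("running", "recovering"):
        return None
    metrics = snapshot["metrics"]
    breached = [
        name for name, limit in policy["max"].items()
        if metrics.get(name, 0) > limit
    ]
    # The score budget is a floor rather than a ceiling.
    if metrics.get("score", 100) < policy.get("min_score", 0):
        breached.append("score")
    if not breached:
        return None
    if policy["action"] == "require_approval":
        next_status = "awaiting_approval"
    else:
        next_status = "paused"
    return {"breached": breached, "next_status": next_status}


policy = {
    "action": "require_approval",
    "min_score": 60,
    "max": {"failed_tool_calls": 3, "total_tokens": 200000},
}
snapshot = {
    "status": "running",
    "metrics": {"score": 72, "failed_tool_calls": 5, "total_tokens": 1000},
}
print(enforce_health_policy(snapshot, policy))
# {'breached': ['failed_tool_calls'], 'next_status': 'awaiting_approval'}
```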
Resume flows:
- POST /chat/sessions/{session_id}/branch branches a new chat session from from_event_id or resume_cursor
- POST /sessions/{session_id}/resume can either resume the same background session or branch a new one when from_event_id or resume_cursor is supplied
- POST /sessions/{session_id}/pause pauses a background session without marking it terminal
- branch sessions are seeded with a resume_seed checkpoint so the frontend can render continuity context without replaying the full source history
- branched background sessions carry parent_session_id, branch_source_event_id, and branch_resume_cursor on the durable row
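A client-side helper for the background resume endpoint could be sketched as below. The route and the two selector names come from the resume flows described here; sending them as a JSON body, and the build_resume_request helper itself, are assumptions made for illustration.

```python
def build_resume_request(session_id, from_event_id=None, resume_cursor=None):
    """Describe a resume call; supplying either selector turns it into a branch."""
    body = {}
    if from_event_id is not None:
        body["from_event_id"] = from_event_id
    if resume_cursor is not None:
        body["resume_cursor"] = resume_cursor
    return {
        "method": "POST",
        "path": f"/sessions/{session_id}/resume",
        "body": body,  # assumed to be sent as JSON
        "branches": bool(body),  # empty body means resume the same session
    }


print(build_resume_request("bg_123")["branches"])  # False
print(build_resume_request("bg_123", resume_cursor="compact:serve_demo:14")["body"])
# {'resume_cursor': 'compact:serve_demo:14'}
```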
Session channel projection is exposed on GET /chat/sessions/{session_id} and
GET /sessions/{session_id}:
- channel.channel_id is session:{session_id}.
- participants includes the runtime, the session agent, and any observed local operator or tool participants.
- envelopes mirrors recent session_events rows with sender_id, recipient_ids, event_type, summary, and the original payload.
- Approval events route through the local operator participant instead of using an implicit side path.
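The projection can be pictured as a pure function over event rows. The sketch below is illustrative only: the participant ids ("runtime", "agent", "operator"), the routing rules, and the row shape are hypothetical; the channel_id format and the envelope field names are the ones listed for the channel projection.

```python
def project_channel(session_id, rows):
    """Project session event rows into a channel with participants and envelopes."""
    participants = ["runtime", "agent"]
    envelopes = []
    for row in rows:
        event_type = row["event_type"]
        is_approval = event_type in ("approval_required", "approval_applied")
        # Approvals are routed through an explicit operator participant.
        if is_approval and "operator" not in participants:
            participants.append("operator")
        sender = "operator" if event_type == "approval_applied" else "runtime"
        recipients = ["operator"] if event_type == "approval_required" else ["agent"]
        envelopes.append({
            "sender_id": sender,
            "recipient_ids": recipients,
            "event_type": event_type,
            "summary": row.get("summary", ""),
            "payload": row["payload"],  # original payload is passed through unchanged
        })
    return {
        "channel_id": f"session:{session_id}",
        "participants": participants,
        "envelopes": envelopes,
    }


rows = [
    {"event_type": "tool_call", "summary": "local_search", "payload": {"tool_call_id": "call-1"}},
    {"event_type": "approval_required", "payload": {"request_id": "apr_123"}},
]
channel = project_channel("serve_chat_123", rows)
print(channel["channel_id"], channel["participants"])
# session:serve_chat_123 ['runtime', 'agent', 'operator']
```

Because the function only reads the rows it is given, running it twice over the same event log yields the same channel, which is the property the word "deterministic" asks for.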
Background Session Kernel Contract¶
Background sessions are the durable agent kernel for long-running work. The runtime persists enough state to recover or branch without depending on process memory:
- attempts: every run increments attempt_count; transient failures move to recovering while attempt_count <= max_retries
- expiry: sessions with a past expires_at fail with an expired checkpoint before execution or supervisor recovery
- pause/resume: paused sessions keep their row, checkpoints, and artifacts; resume requeues the same session unless a branch cursor is supplied
- work isolation: write-capable sessions prepare a task graph and task workspace, then persist both as typed artifacts
- approvals: approval waits remain awaiting_approval with a durable approval_request_id
- artifacts: final outputs, task graphs, workspaces, and native agentic run ids are persisted in background_agent_artifacts
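The attempt and expiry rules above can be sketched as a single decision function. The field names attempt_count, max_retries, and expires_at come from the session state contract; the function name, the tuple return shape, and the "failed" status string are hypothetical.

```python
from datetime import datetime, timezone


def after_transient_failure(row, now=None):
    """Return (status, checkpoint_type) for a session that hit a transient failure."""
    now = now or datetime.now(timezone.utc)
    expires_at = row.get("expires_at")
    # Expiry is checked first, before any execution or recovery is attempted.
    if expires_at is not None and expires_at <= now:
        return ("failed", "expired")
    # Recover while the attempt budget has not been exceeded.
    if row["attempt_count"] <= row["max_retries"]:
        return ("recovering", None)
    return ("failed", None)


now = datetime(2026, 4, 21, 12, 0, tzinfo=timezone.utc)
print(after_transient_failure({"attempt_count": 2, "max_retries": 3}, now))
# ('recovering', None)
```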
Gap Matrix¶
| Gap class | Current state |
|---|---|
| already exposed | thinking, tool_call, tool_result, retrieval, antahkarana, approvals, checkpoints |
| now standardized in the main stream | tool_outcome, step_progress, policy_decision, hook_event, sandbox_notice, transport_diagnostic, subagent_event |
| persisted and rendered in diagnostics | tool_batch, tool_selected, tool_validation_error, context_warning, context_compacted |
| session detail continuity surfaces | resume_points, resume_seed checkpoints, persisted subagents, typed background artifacts |
| background durable kernel | retries, expiry, pause/resume, branch-from-state, approval waits, workspace/task graph artifacts |
| agent health control plane | health score, stuck/waiting/degraded/budget flags, memory provenance, audit trail, recovery hints, health SSE, and enforced local health policy |
| background-only fidelity limits | bridge engines expose state, checkpoints, approvals, final output, and typed artifacts but not native per-tool callbacks |
Bridge Engine Limits¶
claude and codex background sessions are best-effort bridge surfaces.
They can reliably expose:
- session_state
- session_checkpoint
- policy_decision
- sandbox_notice
- approval_required / approval_applied
- completed / error
They do not claim native harness parity for:
- per-tool callbacks
- typed tool_outcome
- context warning / compaction internals
- native step_progress
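A frontend can use these limits to avoid rendering empty panels for bridge sessions. The following is a minimal sketch: the engine names and event lists are taken from this section, while can_expect and the assumption that every other engine value is native are hypothetical.

```python
BRIDGE_ENGINES = {"claude", "codex"}

# Events bridge engines can reliably expose, per the list in this section.
BRIDGE_RELIABLE_EVENTS = {
    "session_state", "session_checkpoint", "policy_decision",
    "sandbox_notice", "approval_required", "approval_applied",
    "completed", "error",
}


def can_expect(engine, event_type):
    """Return True if a session on this engine is expected to emit the event type."""
    if engine in BRIDGE_ENGINES:
        return event_type in BRIDGE_RELIABLE_EVENTS
    # Assumption: any non-bridge engine is the native harness with full fidelity.
    return True


print(can_expect("codex", "tool_outcome"))        # False
print(can_expect("codex", "session_checkpoint"))  # True
```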
Rollout Order¶
- keep STREAM_EVENT_TYPES and HANDLED_EVENT_TYPES in lock-step
- emit canonical policy, hook, sandbox, transport, and subagent diagnostics from QueueCallbacks
- persist continuity checkpoints and branchable resume metadata in session storage
- standardize background session_state and persist step_progress
- render unified UI timeline and diagnostic panels in serve-ui
- keep docs/tests aligned with the shipped contract
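Keeping the two event sets in lock-step is easy to check mechanically. The sketch below shows one possible drift check; event_set_drift is a hypothetical helper, and the two example lists stand in for the real STREAM_EVENT_TYPES and HANDLED_EVENT_TYPES values, which are not reproduced here.

```python
def event_set_drift(stream_event_types, handled_event_types):
    """Report event types that only one side of the contract knows about."""
    stream = set(stream_event_types)
    handled = set(handled_event_types)
    return {
        "unhandled": sorted(stream - handled),  # emitted but not reduced by the UI
        "stale": sorted(handled - stream),      # reduced by the UI but never emitted
    }


# Illustrative values only, not the real constants.
server_side = ["tool_call", "tool_outcome", "subagent_event"]
client_side = ["tool_call", "tool_outcome", "legacy_event"]
print(event_set_drift(server_side, client_side))
# {'unhandled': ['subagent_event'], 'stale': ['legacy_event']}
```

A test that asserts both lists are empty would fail as soon as either side adds or removes an event type on its own.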
Acceptance Criteria¶
- /stream-events matches the frontend handled-event set
- duplicate tool names resolve correctly by tool_call_id
- chat turns emit tool_outcome for success, timeout, denial, artifact, and failure paths
- context warnings and compactions are both persisted and visible in the UI timeline
- policy, hook, sandbox, transport, and subagent diagnostics are persisted and reducer-handled
- session detail exposes resume points and persisted delegated-work state
- chat and background sessions can branch from an event id or resume cursor
- native background agentic runs emit session_state, session_checkpoint, and step_progress
- bridge sessions document reduced fidelity instead of pretending full parity