ReflectionEngine

File: brain/agent/reflection_engine.py

Purpose

ReflectionEngine executes an optional mid-turn reflection step. During a long tool-calling sequence, reflection asks the LLM to evaluate its own progress and decide whether to continue or adjust its approach.

The engine holds no per-turn state: all turn-scoped objects are passed in per call, so one instance can be reused across turns and sessions.

Construction

engine = ReflectionEngine(
    provider=provider,             # LLMProvider — same provider used for main generation
    thinking_manager=thinking,     # ThinkingManager — scheduling and prompt building
    callbacks=callbacks,           # AgentCallbacks — on_reflection() callback
)

`should_reflect()`: Scheduling

if self._reflection_engine.should_reflect():
    self._reflection_engine.reflect(...)

Delegates directly to ThinkingManager.should_reflect():

| Condition | Should reflect? |
| --- | --- |
| `think_level == ThinkLevel.OFF` | No |
| A tool error occurred this step (`_tool_errors` non-empty) | Yes (if `reflect_on_tool_error=True`) |
| Periodic interval (`reflect_every > 0` and `turn_count % reflect_every == 0`) | Yes |
| Otherwise | No |
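
The conditions above can be read as a short decision function. The following is only a sketch under stated assumptions, not the actual ThinkingManager code: the `ReflectionPolicy` class, its field names, and the `ThinkLevel.LOW` member are hypothetical, and the conditions are assumed to be evaluated top to bottom.

```python
from dataclasses import dataclass, field
from enum import Enum


class ThinkLevel(Enum):
    # Only OFF appears in this page; LOW is a hypothetical placeholder.
    OFF = "off"
    LOW = "low"


@dataclass
class ReflectionPolicy:
    """Hypothetical stand-in for the scheduling state held by ThinkingManager."""
    think_level: ThinkLevel = ThinkLevel.LOW
    reflect_on_tool_error: bool = True
    reflect_every: int = 0          # 0 disables periodic reflection
    turn_count: int = 0
    tool_errors: list[str] = field(default_factory=list)

    def should_reflect(self) -> bool:
        # Thinking disabled: never reflect.
        if self.think_level == ThinkLevel.OFF:
            return False
        # A tool error this step triggers reflection when the flag is on.
        if self.tool_errors and self.reflect_on_tool_error:
            return True
        # Periodic reflection every `reflect_every` turns.
        if self.reflect_every > 0 and self.turn_count % self.reflect_every == 0:
            return True
        return False
```

Under this reading, a periodic interval also fires when `turn_count` is 0, since `0 % reflect_every == 0`; whether the real implementation guards against that is not stated here.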

`reflect()`: LLM Call

result: ReflectionResult = engine.reflect(
    messages=messages,          # Current conversation message list (read-only)
    tool_results=state.tool_results,  # Lightweight tool call feed
    budget=budget,              # TurnBudget — for claiming a slot and checking expiry
)

Budget Gate

if not budget.claim_reflection():
    return ReflectionResult(reflection="[budget exhausted]", should_continue=True)

if budget.is_expired():
    return ReflectionResult(reflection="[deadline expired]", should_continue=True)

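A reflection slot is claimed atomically before the LLM call. If no slots remain (for example, 4 reflections have already run) or the turn deadline has passed, a stub result is returned immediately and the LLM is not called. In the order shown, the claim happens before the expiry check, so an expired deadline still consumes a slot.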
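
To make "claimed atomically" concrete, here is a minimal sketch of a budget object with the three methods used on this page. It assumes a lock-guarded counter and a monotonic-clock deadline; the class name `SketchTurnBudget` and the parameters `max_reflections` and `deadline_s` are hypothetical, and the real TurnBudget may be implemented differently.

```python
import threading
import time


class SketchTurnBudget:
    """Hypothetical budget: a capped reflection counter plus a deadline."""

    def __init__(self, max_reflections: int = 4, deadline_s: float = 60.0) -> None:
        self._max_reflections = max_reflections
        self._claimed = 0
        self._deadline = time.monotonic() + deadline_s
        self._lock = threading.Lock()

    def claim_reflection(self) -> bool:
        # Check and increment under one lock so two threads
        # cannot both take the last remaining slot.
        with self._lock:
            if self._claimed >= self._max_reflections:
                return False
            self._claimed += 1
            return True

    def remaining_s(self) -> float:
        # Seconds left before the deadline, never negative.
        return max(0.0, self._deadline - time.monotonic())

    def is_expired(self) -> bool:
        return self.remaining_s() <= 0.0
```

Doing the check and the increment inside the same critical section is what makes the claim atomic; a separate "check, then increment" would allow two concurrent callers to exceed the cap.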

LLM Call

reflection_prompt = thinking_manager.get_reflection_prompt(messages, tool_results)

timeout_s = min(30, max(5, int(budget.remaining_s())))

result = provider.generate(
    messages=[
        {"role": "system", "content": _REFLECTION_SYSTEM},
        {"role": "user", "content": reflection_prompt},
    ],
    tools=[],           # No tools during reflection
    timeout_s=timeout_s,
)

The LLM call is:

- Wrapped in an OTEL span (secondbrain.agent.reflect)
- Capped at 30 seconds, or budget.remaining_s() if less (minimum 5s)
- Made with no tools; reflection is text-only
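
A few worked values show how the timeout clamp behaves. The expression is the one from the snippet above; the wrapper name `reflection_timeout_s` is hypothetical and exists only so the values can be printed.

```python
def reflection_timeout_s(remaining_s: float) -> int:
    """Clamp the remaining budget into the 5..30 second window."""
    return min(30, max(5, int(remaining_s)))


print(reflection_timeout_s(120.0))  # 30: capped at the upper bound
print(reflection_timeout_s(12.7))   # 12: int() truncates the fraction
print(reflection_timeout_s(2.0))    # 5: the floor applies
```

Note the last case: because of the 5-second floor, the timeout can be longer than the budget that actually remains.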

Outcome

On success:

thinking_manager.record_reflection(result)  # clears tool_errors
callbacks.on_reflection(result)             # UI notification
# text: the reflection text taken from the LLM response
return ReflectionResult(reflection=text, should_continue=True)

On any exception:

return ReflectionResult(reflection="[reflection failed]", should_continue=True)

should_continue is always True — reflection never terminates the turn loop.
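
The success and failure paths above amount to a fail-open wrapper. The sketch below illustrates that pattern only; it is not the engine's code. The function `reflect_fail_open` and its callable parameters are hypothetical, the dataclass is a trimmed copy for self-containment, and it assumes the callback runs inside the `try` block and receives the ReflectionResult.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReflectionResult:
    reflection: str
    should_continue: bool = True


def reflect_fail_open(
    call_llm: Callable[[], str],
    on_reflection: Callable[[ReflectionResult], None],
) -> ReflectionResult:
    """Run a reflection call; any exception yields a stub instead of propagating."""
    try:
        text = call_llm()
        result = ReflectionResult(reflection=text)
        on_reflection(result)  # assumption: a callback failure also counts as a failure
        return result
    except Exception:
        # Fail open: the turn loop is never stopped by a broken reflection.
        return ReflectionResult(reflection="[reflection failed]")
```

Either way the caller gets a result with `should_continue=True`, so it never needs its own error handling around reflection.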

`ReflectionResult`

@dataclass
class ReflectionResult:
    reflection: str            # LLM's self-assessment text
    should_continue: bool = True
    suggested_action: str | None = None
    confidence: float = 0.7

Reflection System Prompt

_REFLECTION_SYSTEM = (
    "You are a reflective assistant evaluating your own progress. "
    "Be concise and honest about what is working and what needs improvement."
)

The user-role reflection prompt (built by ThinkingManager.get_reflection_prompt()) asks the LLM to consider:

1. What has been accomplished so far?
2. Are we on track to answer the user's question?
3. What information might still be missing?
4. Should we adjust the approach?

It also lists the last 3 tool results as context.
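
A prompt with this shape could be assembled roughly as follows. This is an illustrative sketch, not ThinkingManager.get_reflection_prompt() itself: the function name `build_reflection_prompt`, the header strings, and the assumption that tool results are plain strings are all hypothetical, and the `messages` argument of the real method is omitted.

```python
def build_reflection_prompt(tool_results: list[str], max_results: int = 3) -> str:
    """Build a reflection prompt: four questions plus the most recent tool results."""
    questions = [
        "What has been accomplished so far?",
        "Are we on track to answer the user's question?",
        "What information might still be missing?",
        "Should we adjust the approach?",
    ]
    lines = ["Reflect on your progress:"]
    lines += [f"{i}. {q}" for i, q in enumerate(questions, start=1)]
    lines.append("")
    lines.append("Recent tool results:")
    # Keep only the tail of the feed so the prompt stays short.
    lines += [f"- {r}" for r in tool_results[-max_results:]]
    return "\n".join(lines)
```

Limiting the context to the most recent results keeps the reflection call cheap, which matters given the short timeout it runs under.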

Effect on the Turn

The reflection result is not injected into the main message list. It is:

- Passed to the on_reflection callback (for UI display).
- Stored in ThinkingManager._last_reflection for observability.
- Recorded via record_reflection(), which clears _tool_errors (so the same error doesn't trigger repeated reflections).

The turn loop always continues regardless of the reflection content.