
ReflectionEngine

File: brain/agent/reflection_engine.py


Purpose

ReflectionEngine executes an optional mid-turn reflection step. During a long tool-calling sequence, reflection asks the LLM to evaluate its own progress and decide whether to continue or adjust its approach.

The engine holds no turn-scoped state of its own: every turn-scoped object (messages, tool results, budget) is passed in per call, so one instance can be reused across turns and sessions.


Construction

engine = ReflectionEngine(
    provider=provider,             # LLMProvider — same provider used for main generation
    thinking_manager=thinking,     # ThinkingManager — scheduling and prompt building
    callbacks=callbacks,           # AgentCallbacks — on_reflection() callback
)

should_reflect() — Scheduling

if self._reflection_engine.should_reflect():
    self._reflection_engine.reflect(...)

Delegates directly to ThinkingManager.should_reflect():

| Condition | Should reflect? |
|---|---|
| think_level == ThinkLevel.OFF | No |
| A tool error occurred this step (_tool_errors non-empty) | Yes (if reflect_on_tool_error=True) |
| Periodic interval (reflect_every > 0 and turn_count % reflect_every == 0) | Yes |
| Otherwise | No |
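The table's rules can be read as a top-to-bottom decision function. The following is a minimal sketch under stated assumptions, not the actual ThinkingManager code: `SchedulerState`, its field layout, and the `ThinkLevel.ON` member are hypothetical stand-ins, and only the conditions themselves come from the table.

```python
from dataclasses import dataclass, field
from enum import Enum


class ThinkLevel(Enum):
    # Only OFF is named in the table; ON is a placeholder for any other level.
    OFF = "off"
    ON = "on"


@dataclass
class SchedulerState:
    # Hypothetical container; the real ThinkingManager may store these differently.
    think_level: ThinkLevel = ThinkLevel.ON
    reflect_on_tool_error: bool = True
    reflect_every: int = 0
    turn_count: int = 0
    tool_errors: list[str] = field(default_factory=list)


def should_reflect(state: SchedulerState) -> bool:
    """Evaluate the table's conditions in order; the first match wins."""
    if state.think_level is ThinkLevel.OFF:
        return False
    if state.tool_errors and state.reflect_on_tool_error:
        return True
    if state.reflect_every > 0 and state.turn_count % state.reflect_every == 0:
        return True
    return False
```

Because the OFF check comes first, a pending tool error does not trigger reflection when thinking is disabled.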

reflect() — LLM Call

result: ReflectionResult = engine.reflect(
    messages=messages,          # Current conversation message list (read-only)
    tool_results=state.tool_results,  # Lightweight tool call feed
    budget=budget,              # TurnBudget — for claiming a slot and checking expiry
)

Budget Gate

if not budget.claim_reflection():
    return ReflectionResult(reflection="[budget exhausted]", should_continue=True)

if budget.is_expired():
    return ReflectionResult(reflection="[deadline expired]", should_continue=True)

A reflection slot is claimed atomically before the LLM call. If the budget is exhausted (e.g. 4 reflections already ran) or the deadline has passed, a stub result is returned immediately.
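One way to make the slot claim atomic is a check-and-decrement under a lock. This is a hedged sketch only: the class below is a hypothetical stand-in for TurnBudget, and its constructor parameters (`max_reflections`, `deadline_s`) are assumptions; only the three method names appear in this document.

```python
import threading
import time


class TurnBudget:
    """Hypothetical stand-in exposing only the methods used by the budget gate."""

    def __init__(self, max_reflections: int, deadline_s: float) -> None:
        self._lock = threading.Lock()
        self._reflections_left = max_reflections
        self._deadline = time.monotonic() + deadline_s

    def claim_reflection(self) -> bool:
        # Check and decrement under one lock so two callers cannot take the same slot.
        with self._lock:
            if self._reflections_left <= 0:
                return False
            self._reflections_left -= 1
            return True

    def remaining_s(self) -> float:
        return max(0.0, self._deadline - time.monotonic())

    def is_expired(self) -> bool:
        return self.remaining_s() <= 0.0


budget = TurnBudget(max_reflections=4, deadline_s=60.0)
claims = [budget.claim_reflection() for _ in range(5)]  # the fifth claim is refused
```

With the gate order shown above, the slot is claimed before the deadline check, so a reflection attempted after the deadline still consumes a slot.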

LLM Call

reflection_prompt = thinking_manager.get_reflection_prompt(messages, tool_results)

timeout_s = min(30, max(5, int(budget.remaining_s())))

result = provider.generate(
    messages=[
        {"role": "system", "content": _REFLECTION_SYSTEM},
        {"role": "user", "content": reflection_prompt},
    ],
    tools=[],           # No tools during reflection
    timeout_s=timeout_s,
)

The LLM call is:
- Wrapped in an OTEL span (secondbrain.agent.reflect)
- Capped at 30 seconds, or budget.remaining_s() if less (minimum 5s)
- Made with no tools; reflection is text-only
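To make the timeout clamp concrete, the snippet below evaluates the same expression for a few remaining-budget values. The function name `reflection_timeout` is introduced here for illustration only.

```python
def reflection_timeout(remaining_s: float) -> int:
    # Same expression as in the call above, with the budget value passed in directly.
    return min(30, max(5, int(remaining_s)))


# Map each sample remaining budget (seconds) to the timeout that would be used.
samples = {r: reflection_timeout(r) for r in (120.0, 12.7, 5.0, 2.0, 0.0)}
# samples == {120.0: 30, 12.7: 12, 5.0: 5, 2.0: 5, 0.0: 5}
```

Note that the 5-second floor means the timeout can exceed the remaining budget when fewer than 5 seconds are left, and that `int()` truncates fractional seconds.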

Outcome

On success:

thinking_manager.record_reflection(result)  # clears _tool_errors
callbacks.on_reflection(result)             # UI notification
return ReflectionResult(reflection=text, should_continue=True)

On any exception:

return ReflectionResult(reflection="[reflection failed]", should_continue=True)

should_continue is always True — reflection never terminates the turn loop.
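The fail-open behavior can be sketched with two stub providers. Everything below is illustrative: `EchoProvider`, `FailingProvider`, and `reflect_once` are hypothetical names, the provider is assumed to return plain text, and the `record_reflection` step is omitted. Placing the callback inside the `try` block is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ReflectionResult:
    reflection: str
    should_continue: bool = True


class EchoProvider:
    def generate(self, **kwargs) -> str:
        return "Progress is fine; one more search is needed."


class FailingProvider:
    def generate(self, **kwargs) -> str:
        raise TimeoutError("provider timed out")


def reflect_once(provider, on_reflection) -> ReflectionResult:
    try:
        text = provider.generate(messages=[], tools=[], timeout_s=5)
        result = ReflectionResult(reflection=text)
        on_reflection(result)  # UI notification on success only
        return result
    except Exception:
        # Any failure degrades to a stub; the turn loop is never interrupted.
        return ReflectionResult(reflection="[reflection failed]")


seen: list[ReflectionResult] = []
ok = reflect_once(EchoProvider(), seen.append)
bad = reflect_once(FailingProvider(), seen.append)
```

Both `ok` and `bad` carry `should_continue=True`; only the successful call reaches the callback.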


ReflectionResult

@dataclass
class ReflectionResult:
    reflection: str            # LLM's self-assessment text
    should_continue: bool = True
    suggested_action: str | None = None
    confidence: float = 0.7

Reflection System Prompt

_REFLECTION_SYSTEM = (
    "You are a reflective assistant evaluating your own progress. "
    "Be concise and honest about what is working and what needs improvement."
)

The user-role reflection prompt (from ThinkingManager.get_reflection_prompt()) asks the LLM to consider:
1. What has been accomplished so far?
2. Are we on track to answer the user's question?
3. What information might still be missing?
4. Should we adjust the approach?

It also lists the last 3 tool results as context.
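A prompt of this shape could be assembled roughly as follows. This is a sketch under assumptions, not the real get_reflection_prompt(): the function name, the header strings, and the `name`/`summary` keys on each tool result are hypothetical, and the `messages` argument is left out for brevity.

```python
QUESTIONS = (
    "What has been accomplished so far?",
    "Are we on track to answer the user's question?",
    "What information might still be missing?",
    "Should we adjust the approach?",
)


def build_reflection_prompt(tool_results: list[dict]) -> str:
    lines = ["Reflect on your progress:"]
    lines += [f"{i}. {q}" for i, q in enumerate(QUESTIONS, start=1)]
    lines.append("")
    lines.append("Recent tool results:")
    # Only the last three results are included as context.
    for item in tool_results[-3:]:
        lines.append(f"- {item['name']}: {item['summary']}")
    return "\n".join(lines)


results = [{"name": f"tool_{i}", "summary": "ok"} for i in range(5)]
prompt = build_reflection_prompt(results)
```

Slicing with `[-3:]` also handles shorter lists, so a turn with one or two tool calls produces a correspondingly shorter context section.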


Effect on the Turn

The reflection result is not injected into the main message list. It is:
- Passed to the on_reflection callback (for UI display).
- Stored in ThinkingManager._last_reflection for observability.
- Used to clear _tool_errors (so the same error doesn't trigger repeated reflections).

The turn loop always continues regardless of the reflection content.
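The separation between the message list and the reflection output can be illustrated with a toy loop. `run_turn` and its parameters are hypothetical and stand in for the real turn loop, which this page does not show.

```python
def run_turn(steps: int, reflect_every: int, on_reflection) -> list[dict]:
    messages = [{"role": "user", "content": "question"}]
    for step in range(1, steps + 1):
        messages.append({"role": "tool", "content": f"result {step}"})
        if reflect_every > 0 and step % reflect_every == 0:
            # Reflection output goes to the callback only; `messages` is not touched.
            on_reflection(f"reflection after step {step}")
        # No branch here stops the loop based on the reflection.
    return messages


shown: list[str] = []
msgs = run_turn(steps=4, reflect_every=2, on_reflection=shown.append)
```

After the run, `msgs` holds the user message plus four tool results, while the two reflections appear only in `shown`.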