ReflectionEngine
File: brain/agent/reflection_engine.py
Purpose
ReflectionEngine executes an optional mid-turn reflection step. During a
long tool-calling sequence, reflection asks the LLM to evaluate its own
progress and decide whether to continue or adjust its approach.
The engine is stateless: all turn-scoped objects are passed in per call. One instance can be reused across turns (and sessions).
Construction
engine = ReflectionEngine(
    provider=provider,          # LLMProvider: same provider used for main generation
    thinking_manager=thinking,  # ThinkingManager: scheduling and prompt building
    callbacks=callbacks,        # AgentCallbacks: provides the on_reflection() callback
)
should_reflect(): Scheduling
Delegates directly to ThinkingManager.should_reflect():
| Condition | Should reflect? |
|---|---|
| think_level == ThinkLevel.OFF | No |
| A tool error occurred this step (_tool_errors non-empty) | Yes (if reflect_on_tool_error=True) |
| Periodic interval (reflect_every > 0 and turn_count % reflect_every == 0) | Yes |
| Otherwise | No |
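The table can be read as a small decision function. The sketch below assumes the rows are evaluated top to bottom (OFF wins, then tool errors, then the periodic interval). `SchedulerState`, its defaults, and the `ThinkLevel.ON` member are hypothetical stand-ins for whatever ThinkingManager actually stores.

```python
from dataclasses import dataclass, field
from enum import Enum


class ThinkLevel(Enum):
    OFF = "off"
    ON = "on"  # hypothetical: the table only names OFF


@dataclass
class SchedulerState:
    """Hypothetical stand-in for the fields ThinkingManager consults."""
    think_level: ThinkLevel = ThinkLevel.ON
    reflect_on_tool_error: bool = True
    reflect_every: int = 0  # 0 disables periodic reflection
    turn_count: int = 0
    tool_errors: list[str] = field(default_factory=list)


def should_reflect(s: SchedulerState) -> bool:
    # Row 1: reflection is switched off entirely.
    if s.think_level == ThinkLevel.OFF:
        return False
    # Row 2: a tool error occurred and error-triggered reflection is enabled.
    if s.tool_errors and s.reflect_on_tool_error:
        return True
    # Row 3: periodic interval.
    if s.reflect_every > 0 and s.turn_count % s.reflect_every == 0:
        return True
    # Row 4: otherwise.
    return False
```

Read literally, the periodic condition also holds at turn_count == 0; whether the real manager excludes that case is not stated here.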
reflect(): LLM Call
result: ReflectionResult = engine.reflect(
    messages=messages,                # Current conversation message list (read-only)
    tool_results=state.tool_results,  # Lightweight tool call feed
    budget=budget,                    # TurnBudget: used to claim a slot and check expiry
)
Budget Gate
if not budget.claim_reflection():
    return ReflectionResult(reflection="[budget exhausted]", should_continue=True)
if budget.is_expired():
    return ReflectionResult(reflection="[deadline expired]", should_continue=True)
A reflection slot is claimed atomically before the LLM call. If the budget is exhausted (e.g. 4 reflections already ran) or the deadline has passed, a stub result is returned immediately.
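To illustrate what "claimed atomically" can mean, here is a minimal sketch of a budget object exposing the three methods used in this section. The class name, the constructor arguments (`max_reflections`, `deadline_s`), the default of 4, and the lock-based implementation are all assumptions for illustration; the real TurnBudget may work differently.

```python
import threading
import time


class TurnBudgetSketch:
    """Hypothetical stand-in for TurnBudget; not the real class."""

    def __init__(self, max_reflections: int = 4, deadline_s: float = 60.0):
        self._lock = threading.Lock()
        self._reflections_left = max_reflections
        self._deadline = time.monotonic() + deadline_s

    def claim_reflection(self) -> bool:
        # Check and decrement under one lock so two callers
        # cannot both take the last slot.
        with self._lock:
            if self._reflections_left <= 0:
                return False
            self._reflections_left -= 1
            return True

    def remaining_s(self) -> float:
        # Seconds left before the deadline, never negative.
        return max(0.0, self._deadline - time.monotonic())

    def is_expired(self) -> bool:
        return self.remaining_s() <= 0.0
```

With the gate order shown above, the slot is claimed before the deadline is checked, so an expired turn still consumes a slot.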
LLM Call
reflection_prompt = thinking_manager.get_reflection_prompt(messages, tool_results)
timeout_s = min(30, max(5, int(budget.remaining_s())))
result = provider.generate(
    messages=[
        {"role": "system", "content": _REFLECTION_SYSTEM},
        {"role": "user", "content": reflection_prompt},
    ],
    tools=[],  # No tools during reflection
    timeout_s=timeout_s,
)
The LLM call is:
- Wrapped in an OTEL span (secondbrain.agent.reflect)
- Capped at 30 seconds, or budget.remaining_s() if less (minimum 5s)
- Made with no tools — reflection is text-only
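The timeout clamp is easiest to see with a few sample values. The helper name `reflection_timeout` is made up for this example; the expression inside it is the one used for `timeout_s` above.

```python
def reflection_timeout(remaining_s: float) -> int:
    # Same expression as timeout_s: floor at 5 s, ceiling at 30 s.
    return min(30, max(5, int(remaining_s)))


# Print the resulting timeout for a range of remaining budgets.
for remaining in (120.0, 30.0, 12.9, 5.0, 2.3, 0.0):
    print(f"remaining={remaining:>5} -> timeout_s={reflection_timeout(remaining)}")
```

Because of the 5-second floor, the call may be allowed to run longer than the remaining budget when fewer than 5 seconds are left.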
Outcome
On success:
thinking_manager.record_reflection(result)  # clears _tool_errors
callbacks.on_reflection(result) # UI notification
return ReflectionResult(reflection=text, should_continue=True)
On any exception, the engine still returns a ReflectionResult.
should_continue is always True: reflection never terminates the turn loop.
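One way to get this behaviour is to catch every exception around the LLM call and fall back to a stub result. The sketch below assumes that pattern; the `safe_reflect` helper, the `[reflection failed: ...]` text, and the trimmed-down local `ReflectionResult` are illustrative and not taken from reflection_engine.py.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReflectionResult:
    # Trimmed-down copy for this example only.
    reflection: str
    should_continue: bool = True


def safe_reflect(call_llm: Callable[[], str]) -> ReflectionResult:
    """Run call_llm and always return a result that keeps the turn going."""
    try:
        text = call_llm()
    except Exception as exc:  # deliberately broad: reflection is optional
        return ReflectionResult(reflection=f"[reflection failed: {exc}]")
    return ReflectionResult(reflection=text)
```

A provider timeout or a malformed response then degrades to a placeholder string instead of aborting the turn.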
ReflectionResult
from dataclasses import dataclass

@dataclass
class ReflectionResult:
    reflection: str  # LLM's self-assessment text
    should_continue: bool = True
    suggested_action: str | None = None
    confidence: float = 0.7
Reflection System Prompt
_REFLECTION_SYSTEM = (
    "You are a reflective assistant evaluating your own progress. "
    "Be concise and honest about what is working and what needs improvement."
)
The reflection prompt sent as the user message (from ThinkingManager.get_reflection_prompt())
asks the LLM to consider:
1. What has been accomplished so far?
2. Are we on track to answer the user's question?
3. What information might still be missing?
4. Should we adjust the approach?
The prompt also lists the last 3 tool results as context.
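A prompt with this shape could be assembled as below. This is only a sketch: the exact wording, the `build_reflection_prompt` name, and the assumption that each tool result is a dict with `tool` and `summary` keys are hypothetical. Only the four questions and the last-3 window come from the description above, and the `messages` argument of the real method is left out.

```python
QUESTIONS = [
    "What has been accomplished so far?",
    "Are we on track to answer the user's question?",
    "What information might still be missing?",
    "Should we adjust the approach?",
]


def build_reflection_prompt(tool_results: list[dict]) -> str:
    lines = ["Reflect on your progress:"]
    # Number the questions 1 to 4.
    lines += [f"{i}. {q}" for i, q in enumerate(QUESTIONS, start=1)]
    recent = tool_results[-3:]  # only the last 3 results are listed
    if recent:
        lines.append("")
        lines.append("Recent tool results:")
        lines += [f"- {r['tool']}: {r['summary']}" for r in recent]
    return "\n".join(lines)
```

Limiting the context to a short tail of tool results keeps the reflection call small, which matters given its timeout cap.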
Effect on the Turn
The reflection result is not injected into the main message list. It is:
- Passed to the on_reflection callback (for UI display).
- Stored in ThinkingManager._last_reflection for observability.
- Used to clear _tool_errors (so the same error doesn't trigger repeated reflections).
The turn loop always continues regardless of the reflection content.
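The sketch below shows how a turn loop might consume the result under these rules. Everything in it (`run_turn`, the stub callables, the fake tool messages) is hypothetical scaffolding; the point is only that the message list receives no reflection text and the loop does not branch on what the reflection says.

```python
from typing import Callable


def run_turn(
    messages: list[dict],
    steps: int,
    should_reflect: Callable[[int], bool],
    reflect: Callable[[list[dict]], str],
    on_reflection: Callable[[str], None],
) -> int:
    """Run `steps` tool-calling steps; return how many were executed."""
    executed = 0
    for step in range(1, steps + 1):
        # Stand-in for a real tool call appending its result.
        messages.append({"role": "tool", "content": f"result {step}"})
        if should_reflect(step):
            # The reflection goes to the callback only, never into messages.
            on_reflection(reflect(messages))
        executed += 1  # no early exit, whatever the reflection said
    return executed
```

For example, with a reflection every second step and a stub that always answers "stop now!", all steps still run and the callback simply receives the text twice over four steps.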