Reflexion Agent

Reflexion (Shinn et al., 2023) generates an answer, critiques it, and then refines it based on that critique. With early_stop=True (the default), the agent stops as soon as a critique returns PASS.

Usage

from brain.patterns import ReflexionAgent

agent = ReflexionAgent(max_iterations=3, early_stop=True)
result = agent.run("Explain the trade-offs between RAG and fine-tuning")
print(result.answer)
print(f"Critique cycles run: {result.iterations}")

API Reference

brain.patterns.reflexion.ReflexionAgent

ReflexionAgent(provider: LLMProvider | None = None, max_iterations: int = 3, critique_provider: LLMProvider | None = None, early_stop: bool = True)

Bases: BasePattern

Reflexion agent: generate → self-critique → reflect → retry.

Parameters:

Name               Type                 Description                                                      Default
provider           LLMProvider | None   LLMProvider instance; defaults to LocalEchoProvider.             None
max_iterations     int                  Maximum number of generate→critique→reflect cycles.              3
critique_provider  LLMProvider | None   Separate provider for the critique step; defaults to provider.   None
early_stop         bool                 Stop early if the critique returns PASS.                         True

Source code in brain/patterns/reflexion.py
def __init__(
    self,
    provider: LLMProvider | None = None,
    max_iterations: int = 3,
    critique_provider: LLMProvider | None = None,
    early_stop: bool = True,
) -> None:
    self.max_iterations = max_iterations
    self.early_stop = early_stop
    self._provider = provider
    self._critique_provider = critique_provider
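
The critique_provider fallback can be illustrated in isolation. The following is a minimal sketch, not the library's code: TaggedProvider, its complete method, and pick_critique_provider are hypothetical names introduced here, since the real LLMProvider interface is not shown on this page. Only the `critique_provider or provider` rule is taken from run.

```python
from typing import Optional


class TaggedProvider:
    """Hypothetical stand-in for an LLMProvider; the real interface may differ."""

    def __init__(self, tag: str) -> None:
        self.tag = tag

    def complete(self, prompt: str) -> str:
        # Echo the prompt with a tag so it is visible which provider answered.
        return f"[{self.tag}] {prompt}"


def pick_critique_provider(
    provider: TaggedProvider,
    critique_provider: Optional[TaggedProvider] = None,
) -> TaggedProvider:
    # Same rule as `self._critique_provider or provider` in run().
    return critique_provider or provider


generator = TaggedProvider("generator")
critic = TaggedProvider("critic")

# Without a dedicated critic, the generating provider also critiques.
print(pick_critique_provider(generator).complete("critique this"))
# With one, critiques are routed to it instead.
print(pick_critique_provider(generator, critic).complete("critique this"))
```

Passing a separate critic is one way to have a different model review the answer than the one that wrote it; whether that helps depends on the providers involved.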

run

run(task: str, **kwargs: Any) -> PatternResult
Source code in brain/patterns/reflexion.py
def run(self, task: str, **kwargs: Any) -> PatternResult:
    provider = self._get_provider()
    crit_provider = self._critique_provider or provider
    steps: list[Step] = []

    # Initial generation
    try:
        current_answer = self._generate(task, provider)
    except Exception as exc:  # noqa: BLE001  # silent-ok: fail-soft, surface via metric/log elsewhere
        return PatternResult(
            answer="", ok=False, error=f"Generation failed: {exc}", iterations=0
        )

    for i in range(self.max_iterations):
        # Critique
        try:
            critique_text = self._critique(task, current_answer, crit_provider)
        except Exception as exc:  # noqa: BLE001
            logger.warning("ReflexionAgent: critique failed at iter %d: %s", i, exc)
            critique_text = "VERDICT: PASS\nISSUES: none\nSUGGESTION: none"

        passed, issues, suggestion = _parse_critique(critique_text)

        step = Step(
            index=i,
            thought=f"issues={issues} | suggestion={suggestion}",
            action="critique",
            observation=critique_text,
            text=current_answer,
        )
        steps.append(step)

        if passed and self.early_stop:
            break

        if i < self.max_iterations - 1:
            # Reflect and improve
            try:
                current_answer = self._reflect(task, current_answer, critique_text, provider)
            except Exception as exc:  # noqa: BLE001
                logger.warning("ReflexionAgent: reflect failed at iter %d: %s", i, exc)
                break

    return PatternResult(
        answer=current_answer,
        steps=steps,
        iterations=len(steps),
    )
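
One consequence of the loop above is that result.iterations counts critique cycles, not refinements: the answer is never refined after the final critique. The sketch below reproduces only the control flow, with plain callables in place of providers. SketchResult, reflexion_loop, and the reflections counter are illustrative names that do not exist in the library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SketchResult:
    """Hypothetical stand-in for PatternResult, plus a refinement counter."""
    answer: str
    iterations: int   # number of critiques, like len(steps) in run()
    reflections: int  # number of times the answer was rewritten


def reflexion_loop(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], bool],  # returns True for PASS
    reflect: Callable[[str, str], str],
    max_iterations: int = 3,
    early_stop: bool = True,
) -> SketchResult:
    answer = generate(task)
    critiques = 0
    reflections = 0
    for i in range(max_iterations):
        passed = critique(task, answer)
        critiques += 1
        if passed and early_stop:
            break
        # Skip the rewrite on the last cycle: nothing would critique it.
        if i < max_iterations - 1:
            answer = reflect(task, answer)
            reflections += 1
    return SketchResult(answer, critiques, reflections)


# A critic that never passes: three critiques, two refinements.
never_pass = reflexion_loop(
    "demo",
    generate=lambda task: "draft",
    critique=lambda task, answer: False,
    reflect=lambda task, answer: answer + "+",
)
print(never_pass)

# A critic that passes immediately: one critique, no refinement.
first_pass = reflexion_loop(
    "demo",
    generate=lambda task: "draft",
    critique=lambda task, answer: True,
    reflect=lambda task, answer: answer + "+",
)
print(first_pass)
```

Under these assumptions, max_iterations=3 allows at most two rewrites of the initial answer.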

Critique Format

The critique provider is prompted to produce:

VERDICT: PASS|FAIL
ISSUES: <comma-separated list or "none">
SUGGESTION: <improvement hint>

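If the LLM does not follow this format, the critique is treated as PASS (graceful degradation). The same applies when the critique call itself raises an exception: run logs a warning and substitutes a PASS verdict, so with early_stop=True the loop ends and the current answer is returned.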
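
A parser for this format with the PASS fallback might look like the sketch below. The real _parse_critique is not shown on this page, so parse_critique_sketch is a hypothetical name, and its return types (a bool, a list of issues, and a hint string) are assumptions.

```python
import re


def parse_critique_sketch(text: str) -> tuple[bool, list[str], str]:
    """Hypothetical parser for the VERDICT / ISSUES / SUGGESTION format."""
    flags = re.MULTILINE
    verdict = re.search(r"^VERDICT:[ \t]*(PASS|FAIL)[ \t]*$", text, flags | re.IGNORECASE)
    issues = re.search(r"^ISSUES:[ \t]*(.*)$", text, flags)
    suggestion = re.search(r"^SUGGESTION:[ \t]*(.*)$", text, flags)

    # Graceful degradation: only an explicit FAIL counts as a failure.
    passed = verdict is None or verdict.group(1).upper() == "PASS"

    raw_issues = issues.group(1).strip() if issues else "none"
    if raw_issues.lower() in ("", "none"):
        issue_list = []
    else:
        issue_list = [item.strip() for item in raw_issues.split(",") if item.strip()]

    hint = suggestion.group(1).strip() if suggestion else ""
    return passed, issue_list, hint


well_formed = parse_critique_sketch(
    "VERDICT: FAIL\nISSUES: too vague, no sources\nSUGGESTION: cite benchmarks"
)
print(well_formed)

# Free-form text without a verdict line degrades to PASS.
malformed = parse_critique_sketch("Looks good to me.")
print(malformed)
```

Defaulting to PASS keeps a misbehaving critic from forcing extra refinement cycles, at the cost of accepting an answer that was never properly reviewed.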

When to Use

Situation                        Recommendation
Answer quality is critical       Reflexion
Need iterative self-correction   Reflexion
Real-time tool calls needed      ReAct
Multiple data sources            MultiAgentOrchestrator