Testing Agent Patterns¶
brain.patterns.testing provides first-class utilities for writing deterministic, fast, zero-API-key tests against any pattern.
Why test patterns?¶
Real LLM providers are slow, costly, and non-deterministic. The testing module gives you:
MockProvider— replay fixed strings as if they were LLM responsesRecordingProvider— wrap any provider and capture every call for inspectionAgentHarness— a one-liner context manager with built-in assertions- Assertion helpers —
assert_ok,assert_answer_contains,assert_tool_called, etc.
All utilities run offline — no API key required.
MockProvider¶
from brain.patterns import ReActAgent
from brain.patterns.testing import MockProvider
provider = MockProvider(responses=[
"Thought: I should search\nAction: search\nAction Input: RAG",
"Final Answer: RAG stands for Retrieval-Augmented Generation.",
])
agent = ReActAgent(
tools={"search": lambda q: f"[result: {q}]"},
provider=provider,
max_steps=4,
)
result = agent.run("What is RAG?")
assert result.ok
print(provider.call_count) # → 2
print(provider.call_log[0]) # → {"messages": [...], "tools": [...], ...}
Key behaviours:
| Feature | Details |
|---|---|
| Fixed responses | Cycles through responses list |
cycle=True |
Wraps around; cycle=False (default) repeats last response |
call_count |
Number of generate() calls so far |
call_log |
Full argument list for every call |
reset() |
Clears count and log without changing responses |
add_response(text) |
Append a response to the queue at runtime |
RecordingProvider¶
Wraps any real (or mock) provider and records all interactions:
from brain.patterns import ReflexionAgent
from brain.patterns.testing import MockProvider, RecordingProvider
inner = MockProvider(responses=["Draft.", "Refined draft."])
recorder = RecordingProvider(inner)
agent = ReflexionAgent(provider=recorder, max_iterations=2)
result = agent.run("Explain fine-tuning")
print(len(recorder.records)) # → 2 (one per LLM call)
print(recorder.total_input_tokens) # → 0 with MockProvider (real provider: actual counts)
Useful for cost auditing and snapshot testing.
Assertion helpers¶
from brain.patterns.testing import (
assert_ok,
assert_answer_contains,
assert_tool_called,
assert_steps_taken,
assert_iterations,
)
assert_ok(result)
assert_answer_contains(result, "rag", "retrieval") # case-insensitive
assert_tool_called(result, "search") # checks result.steps[*].action
assert_steps_taken(result, min_steps=1, max_steps=5)
assert_iterations(result, min_iters=2)
All functions raise AssertionError with a descriptive message on failure — compatible with pytest, unittest, and any test runner.
AgentHarness¶
The highest-level API — bundle everything into a context manager:
from brain.patterns.testing import AgentHarness
def test_search_agent():
with AgentHarness(responses=["Final Answer: RAG uses retrieval."]) as h:
result = h.react("Explain RAG", tools={"search": lambda q: "[result]"})
h.assert_ok()
h.assert_answer_contains("rag")
Pattern factory methods on AgentHarness:
| Method | Pattern |
|---|---|
h.react(task, tools=..., max_steps=5) |
ReActAgent |
h.rag(task, corpus=..., top_k=3) |
RAGAgent |
h.reflexion(task, max_iterations=2) |
ReflexionAgent |
h.plan(task, max_plan_steps=3) |
PlanAndExecuteAgent |
Assertion methods:
| Method | Checks |
|---|---|
h.assert_ok(result=None) |
result.ok == True |
h.assert_answer_contains(result_or_kw, *kws) |
Keywords in answer |
h.assert_tool_called(tool_name, result=None) |
Tool appears in steps |
h.assert_call_count(n) |
Mock called exactly n times |
When result is omitted, assertion methods use h.last_result.
Streaming capture¶
from brain.patterns import StreamingReActAgent, EventKind
from brain.patterns.testing import MockProvider, capture_events
provider = MockProvider(responses=["Final Answer: Done!"])
agent = StreamingReActAgent(tools={}, provider=provider, max_steps=3)
events = capture_events(agent, "Summarise AI safety")
final = next(e for e in events if e.kind == EventKind.FINAL)
print(final.text) # "Done!"
Complete test example¶
# tests/test_my_agent.py
import pytest
from brain.patterns.testing import AgentHarness, MockProvider, assert_answer_contains
def test_react_finds_answer():
with AgentHarness(responses=[
"Thought: search\nAction: web\nAction Input: transformer",
"Final Answer: Transformers use self-attention mechanisms.",
]) as h:
result = h.react("What is a transformer?", tools={"web": lambda q: "[wiki]"})
h.assert_ok()
h.assert_answer_contains("attention")
def test_rag_cites_corpus():
corpus = ["The API rate limit is 100 req/min.", "Auth uses OAuth2 with JWT."]
with AgentHarness(responses=["Based on context, OAuth2 is used with JWT tokens."]) as h:
result = h.rag("What auth method is used?", corpus=corpus)
h.assert_ok()
h.assert_answer_contains("oauth2", "jwt")
API Reference¶
brain.patterns.testing.MockProvider
¶
Bases: LLMProvider
Deterministic provider that replays a fixed list of responses.
Each generate() call consumes one response from the responses
list (cycling if cycle=True). Responses may contain ReAct-style
Thought/Action/Final Answer strings — patterns parse them exactly
as they would parse real LLM output.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
responses
|
list[str] | None
|
List of strings returned by successive |
None
|
cycle
|
bool
|
If |
False
|
model_name
|
str
|
Provider name reported via |
'mock'
|
Source code in brain/patterns/testing.py
generate
¶
generate(messages: list[dict[str, Any]], tools: list[dict[str, Any]], tool_choice: str = 'auto', stream: bool = False, timeout_s: int = 60) -> ProviderResult
Source code in brain/patterns/testing.py
generate_stream
¶
generate_stream(messages: list[dict[str, Any]], tools: list[dict[str, Any]], tool_choice: str = 'auto', timeout_s: int = 60) -> Iterator[StreamEvent]
Source code in brain/patterns/testing.py
reset
¶
brain.patterns.testing.RecordingProvider
¶
Bases: LLMProvider
Wraps any provider and records every call and result.
Useful for snapshot testing and cost auditing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
provider
|
LLMProvider
|
The real (or mock) provider to delegate to. |
required |
Source code in brain/patterns/testing.py
generate
¶
generate(messages: list[dict[str, Any]], tools: list[dict[str, Any]], tool_choice: str = 'auto', stream: bool = False, timeout_s: int = 60) -> ProviderResult
Source code in brain/patterns/testing.py
brain.patterns.testing.AgentHarness
¶
High-level test harness — batteries included.
Bundles a :class:MockProvider with factory methods for every pattern
and assertion helpers on self.
Usage::
def test_my_agent():
with AgentHarness(responses=["Final Answer: 42"]) as h:
result = h.react("What is 6×7?", tools={"calc": lambda q: "42"})
h.assert_ok(result)
h.assert_answer_contains(result, "42")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
responses
|
list[str] | None
|
Forwarded to :class: |
None
|
cycle
|
bool
|
Forwarded to :class: |
False
|
Source code in brain/patterns/testing.py
last_result
property
¶
The most recent result returned by any factory method.
react
¶
Run a :class:~brain.patterns.ReActAgent and return its result.
Source code in brain/patterns/testing.py
rag
¶
Run a :class:~brain.patterns.RAGAgent over corpus.
Source code in brain/patterns/testing.py
reflexion
¶
Run a :class:~brain.patterns.ReflexionAgent.
Source code in brain/patterns/testing.py
plan
¶
Run a :class:~brain.patterns.PlanAndExecuteAgent.
Source code in brain/patterns/testing.py
assert_ok
¶
Assert the most recent (or specified) result is ok.
Source code in brain/patterns/testing.py
assert_answer_contains
¶
Assert answer contains keywords.
Accepts either (result, kw1, kw2) or (kw1, kw2) (uses last
result).
Source code in brain/patterns/testing.py
assert_tool_called
¶
Assert that tool_name was called in the result.
Source code in brain/patterns/testing.py
assert_call_count
¶
Assert the mock provider was called exactly expected times.
Source code in brain/patterns/testing.py
brain.patterns.testing.assert_ok
¶
Assert that a pattern result completed without error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
Any
|
Any :class: |
required |
Raises:
| Type | Description |
|---|---|
AssertionError
|
if |
Source code in brain/patterns/testing.py
brain.patterns.testing.assert_answer_contains
¶
Assert that the answer contains all given keywords (case-insensitive).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
Any
|
Pattern result with an |
required |
*keywords
|
str
|
One or more strings that must appear in the answer. |
()
|
Raises:
| Type | Description |
|---|---|
AssertionError
|
listing all missing keywords. |
Source code in brain/patterns/testing.py
brain.patterns.testing.assert_tool_called
¶
Assert that a specific tool was invoked during the run.
Checks result.steps for any step whose action matches
tool_name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
Any
|
Pattern result with a |
required |
tool_name
|
str
|
Name of the expected tool. |
required |
Raises:
| Type | Description |
|---|---|
AssertionError
|
if the tool was never called. |
Source code in brain/patterns/testing.py
brain.patterns.testing.assert_steps_taken
¶
Assert that the number of steps is within the expected range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
Any
|
Pattern result with |
required |
min_steps
|
int
|
Minimum number of steps expected (inclusive). |
0
|
max_steps
|
int | None
|
Maximum number of steps expected (inclusive, or |
None
|
Raises:
| Type | Description |
|---|---|
AssertionError
|
if the count is outside the range. |
Source code in brain/patterns/testing.py
brain.patterns.testing.assert_iterations
¶
Assert that result.iterations is within the expected range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
Any
|
Pattern result with an |
required |
min_iters
|
int
|
Minimum iterations expected (inclusive). |
1
|
max_iters
|
int | None
|
Maximum iterations (inclusive, or |
None
|
Raises:
| Type | Description |
|---|---|
AssertionError
|
if iterations is outside the range. |
Source code in brain/patterns/testing.py
brain.patterns.testing.capture_events
¶
Collect all events from a :class:~brain.patterns.StreamingReActAgent.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
agent
|
Any
|
A |
required |
task
|
str
|
The task string to pass to |
required |
Returns:
| Type | Description |
|---|---|
list[Any]
|
List of :class: |