TurnBudget & DeadlineToken¶
File: brain/agent/turn_budget.py
TurnBudget¶
TurnBudget is the explicit resource envelope for one agent turn. It holds:
- An absolute wall-clock deadline (time.perf_counter() + timeout_s)
- Step, tool-call, and reflection allowance counters (thread-safe atomic)
- A context token ceiling
All counters are protected by an internal threading.Lock so the same budget
object can be shared across parallel tool-execution threads without data races.
Factory¶
budget = TurnBudget.create(
    timeout_s=60,
    max_steps=6,                 # LLM generation iterations
    max_tool_calls=None,         # defaults to max_steps
    max_reflections=4,
    max_context_tokens=200_000,
)
create() is the only intended construction path. It sets
deadline = time.perf_counter() + timeout_s.
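To make the construction path concrete, here is a minimal sketch of what a factory like create() might look like. The class name TurnBudgetSketch, its field names, and the default values are assumptions modeled on the call shown above; the real class in brain/agent/turn_budget.py may be organized differently.

```python
import threading
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TurnBudgetSketch:
    """Hypothetical stand-in for TurnBudget; field names are assumptions."""

    deadline: float                 # absolute time.perf_counter() value
    max_steps: int
    max_tool_calls: int
    max_reflections: int
    max_context_tokens: int
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    @classmethod
    def create(
        cls,
        timeout_s: float,
        max_steps: int = 6,
        max_tool_calls: Optional[int] = None,
        max_reflections: int = 4,
        max_context_tokens: int = 200_000,
    ) -> "TurnBudgetSketch":
        return cls(
            # The deadline is fixed once, at construction time.
            deadline=time.perf_counter() + timeout_s,
            max_steps=max_steps,
            # None falls back to max_steps, as in the factory call above.
            max_tool_calls=max_steps if max_tool_calls is None else max_tool_calls,
            max_reflections=max_reflections,
            max_context_tokens=max_context_tokens,
        )


budget = TurnBudgetSketch.create(timeout_s=60)
```

Storing an absolute deadline rather than a start time plus a duration means every later check is a single comparison against the clock.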
Deadline Helpers¶
budget.remaining_s()              # → float: seconds until deadline (0 if expired)
budget.is_expired()               # → bool: True when past deadline
budget.tool_deadline(cap_s=45)    # → float: absolute deadline for one tool call
                                  #   = min(turn_deadline, now + cap_s)
budget.per_tool_remaining_s(45)   # → float: seconds for one tool (min 5.0, capped at cap_s)
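tool_deadline() and per_tool_remaining_s() bound a single tool by both the time
the turn has left and the per-tool cap configured in the tool's metadata.
Because per_tool_remaining_s() never returns less than 5.0, its value can exceed
the turn's remaining time when the turn is nearly out of budget.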
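The arithmetic behind these helpers can be sketched as plain functions. This is an illustration only: the function signatures and the MIN_TOOL_S constant are hypothetical, and the order in which the cap and the 5.0-second minimum are applied is an assumption, since the listing above does not spell it out.

```python
import time

MIN_TOOL_S = 5.0  # minimum from the listing above; how it combines with the cap is assumed


def remaining_s(deadline: float) -> float:
    """Seconds until the absolute deadline, clamped at 0 once expired."""
    return max(0.0, deadline - time.perf_counter())


def tool_deadline(turn_deadline: float, cap_s: float) -> float:
    """Absolute deadline for one tool call: min(turn_deadline, now + cap_s)."""
    return min(turn_deadline, time.perf_counter() + cap_s)


def per_tool_remaining_s(turn_deadline: float, cap_s: float) -> float:
    """Relative timeout for one tool call.

    Assumed order: take the turn's remaining time, cap it at cap_s,
    then raise the result to the minimum if it fell below it.
    """
    return max(MIN_TOOL_S, min(remaining_s(turn_deadline), cap_s))


plenty_left = per_tool_remaining_s(time.perf_counter() + 60, cap_s=45)
nearly_out = per_tool_remaining_s(time.perf_counter() + 1, cap_s=45)
```

With 60 seconds left the cap decides the result; with one second left the minimum does.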
Allowance Claims (Thread-Safe)¶
All claim methods atomically increment and check the counter. They return False
when the allowance is exhausted.
budget.claim_step() # → bool: False when max_steps exhausted
budget.claim_tool_call() # → bool: False when max_tool_calls exhausted
budget.claim_reflection() # → bool: False when max_reflections exhausted
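A claim of this kind is typically a check and an increment performed under one lock. The sketch below shows that pattern with a hypothetical AllowanceCounter class; it is not the TurnBudget implementation, and whether the real code checks before or after incrementing is an assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AllowanceCounter:
    """Hypothetical counter; TurnBudget is described as keeping one per allowance."""

    def __init__(self, maximum: int) -> None:
        self._max = maximum
        self._used = 0
        self._lock = threading.Lock()

    def claim(self) -> bool:
        # Check and increment under one lock so two threads cannot both
        # take the last remaining unit.
        with self._lock:
            if self._used >= self._max:
                return False
            self._used += 1
            return True


counter = AllowanceCounter(maximum=6)

# Twenty concurrent claims against an allowance of six.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: counter.claim(), range(20)))

granted = sum(results)  # number of claims that returned True
```

However the threads interleave, exactly six claims succeed and every later one returns False.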
Usage pattern in the harness:
while budget.claim_step():        # each LLM iteration
    if budget.is_expired():       # wall-clock check
        break
    ...

# Before reflection:
if not budget.claim_reflection():
    return stub_result
Observability¶
budget.snapshot()
# → {
# "remaining_s": 42.7,
# "expired": False,
# "steps_used": 2,
# "steps_max": 6,
# "tool_calls_used": 3,
# "tool_calls_max": 6,
# "reflections_used": 0,
# "reflections_max": 4,
# }
snapshot() returns a copy; it holds the lock briefly and is safe to call
from any thread.
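One way to get a snapshot with these properties is to read the counters under the lock and then build a fresh dict outside it. The sketch below assumes that approach; SnapshotDemo is a hypothetical fragment that tracks only steps and the deadline, and rounding remaining_s to one decimal is a guess based on the sample output.

```python
import threading
import time


class SnapshotDemo:
    """Hypothetical budget fragment tracking only steps and the deadline."""

    def __init__(self, timeout_s: float, max_steps: int) -> None:
        self._deadline = time.perf_counter() + timeout_s
        self._steps_used = 0
        self._steps_max = max_steps
        self._lock = threading.Lock()

    def snapshot(self) -> dict:
        # Read mutable counters under the lock, then release it before
        # building the dict so the critical section stays short.
        with self._lock:
            steps_used = self._steps_used
        remaining = max(0.0, self._deadline - time.perf_counter())
        # A fresh dict on every call: callers may mutate it freely.
        return {
            "remaining_s": round(remaining, 1),
            "expired": remaining <= 0.0,
            "steps_used": steps_used,
            "steps_max": self._steps_max,
        }


demo = SnapshotDemo(timeout_s=60, max_steps=6)
snap = demo.snapshot()
snap["steps_used"] = 99          # mutating the copy...
fresh = demo.snapshot()          # ...does not touch the budget itself
```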
DeadlineToken¶
DeadlineToken is a lightweight cooperative-cancellation handle. It is created
once per tool call and passed into RunContext so well-behaved tools can poll
token.is_expired() and exit early without waiting for the hard thread timeout.
Construction¶
token = DeadlineToken.from_budget(budget, cap_s=45)
# Creates a token that expires at min(turn_deadline, now + 45s)
API¶
token.is_expired() # → bool: True if deadline passed or cancel() called
token.remaining_s() # → float: seconds until expiry
token.cancel() # Signal immediate stop (called on TimeoutError)
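A token with this API could be built from an absolute deadline plus a threading.Event for explicit cancellation. The following is a sketch under that assumption: DeadlineTokenSketch and from_deadlines() are hypothetical names, and returning 0.0 from remaining_s() after cancel() is a guess at the intended behaviour.

```python
import threading
import time


class DeadlineTokenSketch:
    """Hypothetical stand-in for DeadlineToken."""

    def __init__(self, deadline: float) -> None:
        self._deadline = deadline              # absolute time.perf_counter() value
        self._cancelled = threading.Event()    # set by cancel(), visible to all threads

    @classmethod
    def from_deadlines(cls, turn_deadline: float, cap_s: float) -> "DeadlineTokenSketch":
        # Expire at whichever comes first: the turn deadline or now + cap_s.
        return cls(min(turn_deadline, time.perf_counter() + cap_s))

    def is_expired(self) -> bool:
        return self._cancelled.is_set() or time.perf_counter() >= self._deadline

    def remaining_s(self) -> float:
        if self._cancelled.is_set():
            return 0.0
        return max(0.0, self._deadline - time.perf_counter())

    def cancel(self) -> None:
        self._cancelled.set()


token = DeadlineTokenSketch.from_deadlines(time.perf_counter() + 60, cap_s=45)
before_cancel = (token.is_expired(), token.remaining_s())
token.cancel()
after_cancel = (token.is_expired(), token.remaining_s())
```

An Event is used rather than a plain boolean so the cancellation flag has well-defined cross-thread visibility without an extra lock.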
Cooperation Model¶
Tools that are cooperative check token.is_expired() in their inner loop and
return early with a partial result. This allows the turn to continue gracefully
rather than waiting for the hard thread-level timeout.
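As an illustration of the cooperative side, the sketch below polls a token once per loop iteration and returns whatever it has finished so far. FixedDeadlineToken and scan_items are hypothetical; the only thing taken from the API above is the is_expired() method, and the shape of the partial result is an assumption.

```python
import time


class FixedDeadlineToken:
    """Hypothetical minimal token exposing only is_expired()."""

    def __init__(self, timeout_s: float) -> None:
        self._deadline = time.perf_counter() + timeout_s

    def is_expired(self) -> bool:
        return time.perf_counter() >= self._deadline


def scan_items(items, token):
    """Process items until done or the token expires; return partial results."""
    processed = []
    for item in items:
        # Poll before each unit of work so the tool stops promptly.
        if token.is_expired():
            return {"results": processed, "partial": True}
        processed.append(item * 2)   # placeholder for real per-item work
    return {"results": processed, "partial": False}


full = scan_items(range(5), FixedDeadlineToken(timeout_s=10))
cut_short = scan_items(range(5), FixedDeadlineToken(timeout_s=0))
```

The polling interval is set by how long one unit of work takes, so tools with slow inner steps may want to check the token inside those steps as well.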
Tools that are non-cooperative (legacy or third-party) are still bounded:
future.result(timeout=per_tool_s) in BoundedToolExecutor enforces a hard
deadline. The timeout does not stop the tool thread, which may still run to
completion after the TimeoutError has been caught. In that case
state.complete_trace() returns False and all side effects are suppressed.
Timeline:
──────────────────────────────────────────────────────
t=0s Tool execution starts
t=20s DeadlineToken.is_expired() → True (cooperative tools exit here)
t=20s future.result(timeout=20) raises TimeoutError (hard thread timeout)
t=21s Tool thread completes anyway (non-cooperative)
t=21s state.complete_trace() → False (late write suppressed)
──────────────────────────────────────────────────────
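The timeline can be reproduced at small scale with concurrent.futures. The sketch below is an assumption about the mechanism, not the BoundedToolExecutor code: TraceState, its close() method, and slow_tool are hypothetical, and the timings are shortened to fractions of a second.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class TraceState:
    """Hypothetical stand-in for the state object behind complete_trace()."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._closed = False
        self.result = None

    def close(self) -> None:
        # Called by the executor once it has given up on the tool.
        with self._lock:
            self._closed = True

    def complete_trace(self, value) -> bool:
        # Accept the write only if the trace is still open.
        with self._lock:
            if self._closed:
                return False
            self.result = value
            return True


def slow_tool(state: TraceState, outcome: list) -> None:
    time.sleep(0.3)                                   # ignores any deadline
    outcome.append(state.complete_trace("late result"))


state = TraceState()
late_outcome = []
pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_tool, state, late_outcome)
try:
    future.result(timeout=0.05)                       # hard per-tool timeout
    timed_out = False
except FutureTimeout:
    timed_out = True
    state.close()                                     # later writes are rejected
pool.shutdown(wait=True)                              # let the tool thread finish
```

The timeout only stops the caller from waiting; the worker thread keeps running, which is why the late write has to be rejected at the state object.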
Data Flow¶
TurnBudget.create(timeout_s=60)
↓
│ passed to TurnRuntimeState
│ passed to BoundedToolExecutor.execute()
│ passed to ReflectionEngine.reflect()
│
├─ budget.claim_step()            ← harness loop iteration gate
├─ budget.is_expired()            ← wall-clock check after each step
├─ budget.per_tool_remaining_s()  ← per-tool thread timeout
├─ budget.tool_deadline()         ← DeadlineToken.from_budget() uses this
└─ budget.claim_reflection()      ← ReflectionEngine gate
Relation to Legacy Code¶
Before (scattered pattern):
# In old harness
remaining = self.timeout_s - (time.time() - self._turn_start_time)
if remaining < 0:
    break
After:
# Centralised in TurnBudget
if budget.is_expired():
    break
per_tool_s = budget.per_tool_remaining_s(cap_s)
The old pattern recomputed remaining independently in at least 4 places, used
wall-clock time.time() inconsistently, and had no per-tool cap enforcement.