TurnBudget & DeadlineToken

File: brain/agent/turn_budget.py

TurnBudget

TurnBudget is the explicit resource envelope for one agent turn. It holds:

An absolute deadline in elapsed real time, stored as a time.perf_counter() value (time.perf_counter() + timeout_s)
Step, tool-call, and reflection allowance counters (updated atomically under a lock)
A context token ceiling

All counters are protected by an internal threading.Lock so the same budget object can be shared across parallel tool-execution threads without data races.

Factory

budget = TurnBudget.create(
    timeout_s=60,
    max_steps=6,          # LLM generation iterations
    max_tool_calls=None,  # defaults to max_steps
    max_reflections=4,
    max_context_tokens=200_000,
)

create() is the only intended construction path. It sets deadline = time.perf_counter() + timeout_s.
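The factory behaviour described above can be sketched as follows. This is a minimal illustration, not the code in turn_budget.py: the class name TurnBudgetSketch, the field names, and the default values (borrowed from the example call) are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnBudgetSketch:
    """Hypothetical stand-in for TurnBudget; field names are assumptions."""

    deadline: float            # absolute time.perf_counter() value
    max_steps: int
    max_tool_calls: int
    max_reflections: int
    max_context_tokens: int

    @classmethod
    def create(
        cls,
        timeout_s: float,
        max_steps: int = 6,
        max_tool_calls: Optional[int] = None,
        max_reflections: int = 4,
        max_context_tokens: int = 200_000,
    ) -> "TurnBudgetSketch":
        return cls(
            # The deadline is fixed once, at construction time.
            deadline=time.perf_counter() + timeout_s,
            max_steps=max_steps,
            # None falls back to max_steps, as the factory comment describes.
            max_tool_calls=max_steps if max_tool_calls is None else max_tool_calls,
            max_reflections=max_reflections,
            max_context_tokens=max_context_tokens,
        )


budget = TurnBudgetSketch.create(timeout_s=60)
```

Routing all construction through one classmethod keeps the deadline computation in a single place, which is presumably why create() is the only intended path.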

Deadline Helpers

budget.remaining_s()              # → float: seconds until deadline (0 if expired)
budget.is_expired()               # → bool: True when past deadline
budget.tool_deadline(cap_s=45)    # → float: absolute deadline for one tool call
                                  #   = min(turn_deadline, now + cap_s)
budget.per_tool_remaining_s(45)   # → float: seconds for one tool (min 5.0, capped at cap_s)

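tool_deadline() and per_tool_remaining_s() keep a single tool from claiming more time than the turn has left, while also respecting the per-tool cap configured in the tool's metadata. Going by the 5.0-second minimum listed above, per_tool_remaining_s() is the one exception: when the turn has less than 5 seconds left, the value it returns can exceed the remaining turn time.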
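The arithmetic behind the two helpers can be written as pure functions that take the current time explicitly. This is a sketch under stated assumptions: the free-function form, the parameter names, and the order in which the 5.0-second floor and the cap are combined are guesses based on the inline comments, not the actual implementation.

```python
def tool_deadline(turn_deadline: float, now: float, cap_s: float) -> float:
    """Absolute deadline for one tool call: min(turn_deadline, now + cap_s)."""
    return min(turn_deadline, now + cap_s)


def per_tool_remaining_s(
    turn_deadline: float, now: float, cap_s: float, floor_s: float = 5.0
) -> float:
    """Seconds granted to one tool call.

    Assumption: the floor is applied to the remaining turn time first,
    and the per-tool cap is applied last.
    """
    remaining = max(0.0, turn_deadline - now)
    return min(cap_s, max(floor_s, remaining))


# Turn deadline at t=100 on an arbitrary clock, per-tool cap of 45 seconds.
plenty_left = per_tool_remaining_s(100.0, now=40.0, cap_s=45.0)  # cap applies
some_left = per_tool_remaining_s(100.0, now=80.0, cap_s=45.0)    # turn time applies
almost_out = per_tool_remaining_s(100.0, now=98.0, cap_s=45.0)   # floor applies
```

Passing now as an argument makes the three regimes (cap-bound, turn-bound, floor-bound) easy to check without sleeping.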

Allowance Claims (Thread-Safe)

All claim methods atomically increment and check the counter. They return False when the allowance is exhausted.

budget.claim_step()        # → bool: False when max_steps exhausted
budget.claim_tool_call()   # → bool: False when max_tool_calls exhausted
budget.claim_reflection()  # → bool: False when max_reflections exhausted
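A lock-protected claim counter of the kind described here might look like the sketch below. The Allowance class is hypothetical and exists only for illustration; the real TurnBudget presumably keeps three such counters behind its single internal lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Allowance:
    """Hypothetical single counter with an atomic check-and-increment."""

    def __init__(self, maximum: int) -> None:
        self._max = maximum
        self._used = 0
        self._lock = threading.Lock()

    def claim(self) -> bool:
        # The check and the increment happen under one lock acquisition,
        # so two threads can never both take the last unit.
        with self._lock:
            if self._used >= self._max:
                return False
            self._used += 1
            return True

    @property
    def used(self) -> int:
        with self._lock:
            return self._used


steps = Allowance(6)
# 100 concurrent claim attempts against an allowance of 6.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: steps.claim(), range(100)))
granted = sum(results)
```

However the attempts interleave, exactly six claims succeed and every later claim returns False.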

Usage pattern in the harness:

while budget.claim_step():      # each LLM iteration
    if budget.is_expired():     # wall-clock check
        break
    ...
    # Before reflection:
    if not budget.claim_reflection():
        return stub_result

Observability

budget.snapshot()
# → {
#     "remaining_s": 42.7,
#     "expired": False,
#     "steps_used": 2,
#     "steps_max": 6,
#     "tool_calls_used": 3,
#     "tool_calls_max": 6,
#     "reflections_used": 0,
#     "reflections_max": 4,
# }

snapshot() returns a copy; it holds the lock briefly and is safe to call from any thread.
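To illustrate what "returns a copy" means in practice, here is a reduced sketch that tracks steps only. BudgetCounters is a hypothetical stand-in, and its internals are assumptions; only the snapshot key names are taken from the output shown above.

```python
import threading
import time


class BudgetCounters:
    """Hypothetical, reduced stand-in for TurnBudget (steps only)."""

    def __init__(self, timeout_s: float, steps_max: int) -> None:
        self._lock = threading.Lock()
        self._deadline = time.perf_counter() + timeout_s
        self._steps_used = 0
        self._steps_max = steps_max

    def claim_step(self) -> bool:
        with self._lock:
            if self._steps_used >= self._steps_max:
                return False
            self._steps_used += 1
            return True

    def snapshot(self) -> dict:
        # Read every field under the lock so the values are mutually
        # consistent, then return a fresh dict the caller may mutate.
        with self._lock:
            remaining = max(0.0, self._deadline - time.perf_counter())
            return {
                "remaining_s": remaining,
                "expired": remaining <= 0.0,
                "steps_used": self._steps_used,
                "steps_max": self._steps_max,
            }


counters = BudgetCounters(timeout_s=60, steps_max=6)
counters.claim_step()
snap = counters.snapshot()
snap["steps_used"] = 99        # mutating the returned dict...
fresh = counters.snapshot()    # ...does not affect the budget itself
```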

DeadlineToken

DeadlineToken is a lightweight cooperative-cancellation handle. It is created once per tool call and passed into RunContext so well-behaved tools can poll token.is_expired() and exit early without waiting for the hard thread timeout.

Construction

token = DeadlineToken.from_budget(budget, cap_s=45)
# Creates a token that expires at min(turn_deadline, now + 45s)

API

token.is_expired()   # → bool: True if deadline passed or cancel() called
token.remaining_s()  # → float: seconds until expiry
token.cancel()       # Signal immediate stop (called on TimeoutError)
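A token with this three-method surface could be built from a deadline plus a threading.Event, as sketched below. The class name, the from_deadline constructor (standing in for from_budget, which needs a real TurnBudget), and the choice to report 0.0 remaining after cancel() are assumptions made for illustration.

```python
import threading
import time


class DeadlineTokenSketch:
    """Hypothetical cooperative-cancellation handle."""

    def __init__(self, deadline: float) -> None:
        self._deadline = deadline              # absolute perf_counter() value
        self._cancelled = threading.Event()    # set by cancel(), visible to all threads

    @classmethod
    def from_deadline(cls, turn_deadline: float, cap_s: float) -> "DeadlineTokenSketch":
        # Expires at min(turn_deadline, now + cap_s).
        return cls(min(turn_deadline, time.perf_counter() + cap_s))

    def is_expired(self) -> bool:
        return self._cancelled.is_set() or time.perf_counter() >= self._deadline

    def remaining_s(self) -> float:
        if self._cancelled.is_set():
            return 0.0  # assumption: a cancelled token has no time left
        return max(0.0, self._deadline - time.perf_counter())

    def cancel(self) -> None:
        self._cancelled.set()


token = DeadlineTokenSketch.from_deadline(time.perf_counter() + 60, cap_s=45)
expired_before = token.is_expired()
remaining_before = token.remaining_s()
token.cancel()
expired_after = token.is_expired()
```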

Cooperation Model

Cooperative tools check token.is_expired() in their inner loop and return early with a partial result, so the turn can continue gracefully instead of waiting for the hard thread-level timeout.
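A cooperative tool might poll the token like this. The tool function, its result shape, and the ExpiresAfter fake are all hypothetical; the fake expires after a fixed number of polls so the example behaves the same on every run, whereas a real tool would receive the token from RunContext.

```python
from typing import Callable, Dict, List


def uppercase_items(items: List[str], is_expired: Callable[[], bool]) -> Dict[str, object]:
    """Hypothetical tool: process items until done or until the token expires."""
    done: List[str] = []
    for item in items:
        # Poll before each unit of work; bail out with whatever is finished.
        if is_expired():
            return {"partial": True, "results": done}
        done.append(item.upper())
    return {"partial": False, "results": done}


class ExpiresAfter:
    """Test double: reports expiry once it has been polled `polls` times."""

    def __init__(self, polls: int) -> None:
        self._left = polls

    def is_expired(self) -> bool:
        self._left -= 1
        return self._left < 0


token = ExpiresAfter(polls=2)
result = uppercase_items(["a", "b", "c", "d"], token.is_expired)
```

Here the tool finishes two items, sees the expiry on its third poll, and returns a partial result rather than running to completion.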

Non-cooperative tools (legacy or third-party) are still bounded: future.result(timeout=per_tool_s) in BoundedToolExecutor enforces a hard deadline. If the tool thread keeps running and completes after the timeout has been caught, state.complete_trace() returns False and all side effects are suppressed.

Timeline:
──────────────────────────────────────────────────────
t=0s     Tool execution starts
t=20s    DeadlineToken.is_expired() → True  (cooperative tools exit here)
t=20s    future.result(timeout=20) raises TimeoutError  (hard thread timeout)
t=21s    Tool thread completes anyway (non-cooperative)
t=21s    state.complete_trace() → False  (late write suppressed)
──────────────────────────────────────────────────────
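The timeline above can be reproduced at small scale with concurrent.futures. In this sketch TraceState is a hypothetical stand-in for the real state object, and the rule that the harness closes the trace when it catches the timeout (so the late write is rejected) is an assumption about how complete_trace() decides to return False.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class TraceState:
    """Hypothetical state object: the first call to complete_trace() wins."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._closed = False
        self.result = None

    def complete_trace(self, result: object) -> bool:
        with self._lock:
            if self._closed:
                return False       # late write: rejected
            self._closed = True
            self.result = result
            return True


state = TraceState()
late_write = []


def slow_tool() -> None:
    time.sleep(0.3)                # non-cooperative: never checks a token
    late_write.append(state.complete_trace("late result"))


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_tool)
    try:
        future.result(timeout=0.05)            # hard per-tool timeout
        timed_out = False
    except FutureTimeout:
        timed_out = True
        state.complete_trace("timeout")        # harness closes the trace first
# Leaving the with-block waits for the worker, so late_write is now populated.
```

Note that future.result(timeout=...) only stops the caller from waiting; the worker thread keeps running, which is why the late write has to be rejected explicitly.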

Data Flow

TurnBudget.create(timeout_s=60)
    ↓
    │  passed to TurnRuntimeState
    │  passed to BoundedToolExecutor.execute()
    │  passed to ReflectionEngine.reflect()
    │
    ├─ budget.claim_step()          ← harness loop iteration gate
    ├─ budget.is_expired()          ← wall-clock check after each step
    ├─ budget.per_tool_remaining_s()← per-tool thread timeout
    ├─ budget.tool_deadline()       ← DeadlineToken.from_budget() uses this
    └─ budget.claim_reflection()    ← ReflectionEngine gate

Relation to Legacy Code

Before (scattered pattern):

# In old harness
remaining = self.timeout_s - (time.time() - self._turn_start_time)
if remaining < 0:
    break

After:

# Centralised in TurnBudget
if budget.is_expired():
    break
per_tool_s = budget.per_tool_remaining_s(cap_s)

The old pattern recomputed remaining independently in at least 4 places, used wall-clock time.time() inconsistently, and had no per-tool cap enforcement.
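One likely reason to prefer time.perf_counter() over time.time() for deadlines is that the system clock behind time.time() can be adjusted while a turn is running, whereas time.perf_counter() is monotonic. The standard library exposes these properties through time.get_clock_info(); the snippet below only inspects them and is not taken from the harness.

```python
import time

# Print the documented properties of both clocks on this platform.
for name in ("time", "perf_counter"):
    info = time.get_clock_info(name)
    print(f"{name}: monotonic={info.monotonic}, adjustable={info.adjustable}")

# A duration measured on a monotonic clock can never come out negative.
start = time.perf_counter()
elapsed = time.perf_counter() - start
perf_is_monotonic = time.get_clock_info("perf_counter").monotonic
```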