BudgetGuard
BudgetGuard is a context manager that tracks cumulative token and USD usage for every LLM call made inside its with block.
Parameters
| Parameter | Type | Description |
|---|---|---|
| user_id | str | Identifier for the budget owner. Included in BudgetExceededError messages. |
| token_limit | int \| None | Max total tokens (input + output). None means no limit. |
| usd_limit | float \| None | Max total USD cost. None means no limit. |
If both limits are None, actguard still tracks usage but never raises.
Post-context properties
After the with block exits, guard.tokens_used and guard.usd_used are still readable.
Limits
How limits are checked
Limits are checked before and after every LLM call:
- Pre-check: if the accumulated usage already meets or exceeds the limit at the start of a call, BudgetExceededError is raised immediately. The SDK call is never made.
- Post-check: after the response (or stream) is fully read, usage is added and the limit is checked again.
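The two checks can be illustrated with a small stand-in. This is a minimal sketch, not actguard's implementation: ToyGuard, ToyBudgetExceeded, and fake_llm are hypothetical names, only tokens are tracked, and the sketch assumes the post-check uses the same "meets or exceeds" comparison as the pre-check.

```python
class ToyBudgetExceeded(Exception):
    """Hypothetical stand-in for BudgetExceededError."""


class ToyGuard:
    """Toy context manager that tracks tokens only (not actguard's code)."""

    def __init__(self, token_limit=None):
        self.token_limit = token_limit
        self.tokens_used = 0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # let exceptions propagate; totals stay on the object

    def _check(self):
        # Assumption: "meets or exceeds" applies to both checks.
        if self.token_limit is not None and self.tokens_used >= self.token_limit:
            raise ToyBudgetExceeded(
                f"{self.tokens_used} of {self.token_limit} tokens used"
            )

    def call(self, llm, prompt):
        self._check()  # pre-check: runs before the (fake) SDK call
        input_tokens, output_tokens = llm(prompt)
        self.tokens_used += input_tokens + output_tokens
        self._check()  # post-check: runs after usage has been added


calls_made = []


def fake_llm(prompt):
    """Pretend provider call that always reports 40 input and 30 output tokens."""
    calls_made.append(prompt)
    return 40, 30


guard = ToyGuard(token_limit=100)
raised = []
with guard:
    guard.call(fake_llm, "first")  # 70 tokens used, still under the limit
    try:
        guard.call(fake_llm, "second")  # call is made, then the post-check fails
    except ToyBudgetExceeded:
        raised.append("post")
    try:
        guard.call(fake_llm, "third")  # pre-check fails, fake_llm is not invoked
    except ToyBudgetExceeded:
        raised.append("pre")

# The accumulated total is still readable after the block has exited.
final_total = guard.tokens_used
```

In this sketch the third call never reaches fake_llm, which mirrors the "SDK call is never made" behaviour described for the pre-check.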
Token counting
Tokens are counted as input_tokens + output_tokens reported by the provider. The exact fields vary by provider; actguard normalises them internally.
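As an illustration of what such normalisation might look like, the sketch below maps two provider usage shapes onto a single total. The field names are the ones reported by the OpenAI Chat Completions API (prompt_tokens, completion_tokens) and the Anthropic Messages API (input_tokens, output_tokens); the normalise_usage helper itself is hypothetical and is not part of actguard.

```python
# Usage field names per provider: (input field, output field).
USAGE_FIELDS = {
    "openai": ("prompt_tokens", "completion_tokens"),
    "anthropic": ("input_tokens", "output_tokens"),
}


def normalise_usage(provider: str, usage: dict) -> int:
    """Return input + output tokens from a provider-shaped usage dict."""
    input_field, output_field = USAGE_FIELDS[provider]
    return usage[input_field] + usage[output_field]


openai_total = normalise_usage(
    "openai", {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}
)
anthropic_total = normalise_usage(
    "anthropic", {"input_tokens": 15, "output_tokens": 5}
)
```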
USD cost
USD cost is computed from a built-in pricing table keyed by provider and model name. If a model is not in the table, cost is recorded as $0.00 and a UserWarning is emitted. You can open an issue or PR to add missing models.
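The lookup-with-fallback behaviour can be sketched as follows. The table contents, the per-million-token pricing with separate input and output rates, and the usd_cost helper are all assumptions made for illustration; they are not actguard's real table or prices.

```python
import warnings

# Hypothetical prices in USD per 1M tokens: (input, output). Not real pricing.
PRICING = {
    ("example-provider", "example-model"): (1.00, 2.00),
}


def usd_cost(provider, model, input_tokens, output_tokens):
    """Look up the model's price; fall back to 0.0 with a UserWarning."""
    prices = PRICING.get((provider, model))
    if prices is None:
        warnings.warn(
            f"No pricing entry for {provider}/{model}; recording $0.00",
            UserWarning,
        )
        return 0.0
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


known_cost = usd_cost("example-provider", "example-model", 500_000, 250_000)

# Capture the warning emitted for a model that is missing from the table.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unknown_cost = usd_cost("example-provider", "missing-model", 1_000, 1_000)
```

One consequence of the fallback is that a usd_limit cannot constrain calls to an unpriced model, since each such call adds nothing to the running total.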
Context isolation
Budget state lives in a ContextVar. This gives strong isolation guarantees:
- Threads: each thread has its own context; concurrent guards don’t bleed into each other.
- Async tasks: each asyncio task inherits a copy of the context at creation time. A guard started in one task is invisible to a sibling task.
- Nested guards: inner guards don’t inherit the outer guard’s accumulated totals.
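The thread and task behaviour above comes from the standard library's contextvars module and can be observed without actguard. The sketch below uses a hypothetical variable named active_total as a stand-in for the guard state.

```python
import asyncio
import contextvars
import threading

# Stand-in for the guard state; None means "no guard active".
active_total = contextvars.ContextVar("active_total", default=None)


def thread_worker(results, name, amount):
    # Each thread runs in its own context, so a value set here is not
    # visible to the other thread or to the main thread.
    results[name + "_before"] = active_total.get()
    active_total.set(amount)
    results[name + "_after"] = active_total.get()


def run_threads():
    results = {}
    threads = [
        threading.Thread(target=thread_worker, args=(results, name, amount))
        for name, amount in [("a", 10), ("b", 20)]
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


async def task_body(name, amount):
    active_total.set(amount)  # changes only this task's copy of the context
    await asyncio.sleep(0)  # yield so the sibling task runs in between
    return name, active_total.get()


async def run_tasks():
    # gather() wraps each coroutine in a task with its own context copy.
    pairs = await asyncio.gather(task_body("a", 1), task_body("b", 2))
    return dict(pairs), active_total.get()


thread_results = run_threads()
task_results, parent_view = asyncio.run(run_tasks())
main_view = active_total.get()
```

Neither the sibling tasks nor the parent see each other's values, which is why a guard started inside one task does not affect another.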
Tool runtime context
RunContext provides per-run state for tool decorators that need cross-call memory.
What state is stored
- Attempt counters per tool (max_attempts)
- Idempotency records per (tool_id, idempotency_key) (idempotent)
- Run identifier used by related exceptions
Isolation semantics
- Per run: a new RunContext starts with fresh attempt counters and idempotency state.
- Nested contexts: inner RunContext has independent state; on exit, outer state is restored.
- Async support: RunContext supports with and async with.
timeout does not require RunContext, but if one is active it includes run_id in ToolTimeoutError.
Chain-of-custody session
session provides chain-of-custody state for prove and enforce.
What state is stored
- Verified facts minted by @prove
- Session id and scope dimensions used to isolate fact visibility
Isolation semantics
- Per session: facts minted in session A are not visible in session B.
- Per scope: scope dimensions (for example user id) affect fact visibility.
- Async support: session supports both with and async with.
Storage behavior
- State is in-memory and process-local.
- State is ephemeral and cleared on process restart.
- This local store is suitable for single-process agents; gateway reporting is used for global visibility.
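The per-session and per-scope visibility rules can be pictured as a dictionary keyed by session id and scope. This is a minimal sketch under that assumption; ToyFactStore and its methods are hypothetical and do not reflect how actguard stores facts internally.

```python
class ToyFactStore:
    """In-memory, process-local store; contents are lost when the process exits."""

    def __init__(self):
        self._facts = {}  # (session_id, scope) -> set of fact names

    def mint(self, session_id, scope, fact):
        """Record a verified fact for one session and one scope."""
        self._facts.setdefault((session_id, scope), set()).add(fact)

    def is_visible(self, session_id, scope, fact):
        """A fact is visible only under the exact session and scope it was minted in."""
        return fact in self._facts.get((session_id, scope), set())


store = ToyFactStore()
alice_scope = (("user_id", "alice"),)  # scope dimensions as hashable pairs
bob_scope = (("user_id", "bob"),)

store.mint("session-a", alice_scope, "order_exists")

same_session_same_scope = store.is_visible("session-a", alice_scope, "order_exists")
other_session = store.is_visible("session-b", alice_scope, "order_exists")
other_scope = store.is_visible("session-a", bob_scope, "order_exists")
```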
Patching
BudgetGuard.__enter__() calls patch_all() once per process, which monkey-patches the transport layer of each installed LLM SDK. The patch is idempotent — calling it multiple times has no effect.
The patch is transparent: if no BudgetGuard is active in the current context, patched methods behave exactly like the originals.
See the Integrations section for provider-specific details and version requirements.
BudgetExceededError
Raised when a limit is hit. Inherits from Exception.
