Token accounting
Preflight budget checks + reconciled usage. Honest cost accounting that survives sub-agent spawns.
Vadyl tracks LLM token usage at every level — per agent run, per plan step, per model invocation, per sub-agent. Budgets are enforced before tokens are spent (preflight) and reconciled after (actuals). This is how the platform turns "agents are unboundedly expensive" into "agents have predictable costs."
Budget composition
Effective budget per run is the intersection of:
- Agent definition default (declared in
agent.define). - Caller override (passed to
agent.run({ budget: ... })). - Parent residual (for sub-agents, what's left of the parent's budget).
- Project quota (the canonical billing-substrate cap).
The intersection is the smallest of these — never the largest. Sub-agents can never exceed the parent's residual.
Preflight check
Before any model invocation, ITokenAccountingService.PreflightAsyncconsults the resolved budget. If the projected token spend would exceed the budget, the invocation fails closed with BudgetExceededException — the model is never called.
// Inside the agent's invocation path
const preflight = await tokenAccounting.PreflightAsync({
agentRunId,
modelId,
estimatedInputTokens: request.estimatedTokens,
estimatedOutputTokens: request.maxOutputTokens,
});
if (!preflight.allowed) throw new BudgetExceededException(preflight.reason);Reconciliation
After the model returns, RecordUsageAsync records actual input + output token counts and reconciles the run's cumulative usage. The next preflight check sees the updated residual.
Sub-agent residual
When an agent spawns a sub-agent, the parent passes its remaining budget into the sub-agent's preflight chain. The sub-agent cannot widen — only narrow. When the sub-agent returns, its actual usage is added back to the parent's reconciled total.
Per-model cost rates
Each model adapter publishes input + output cost rates. Vadyl converts token counts to currency and emits canonical billing usage events tagged with the model id, agent id, run id, and step id. Reports roll up by any of these dimensions.
Inspect usage
vadyl agents runs show <runId>
# Output includes:
# tokens.input: 12_345
# tokens.output: 8_902
# tokens.budget: 50_000
# tokens.spent: 21_247
# tokens.residual: 28_753
# cost.usd: 0.234
vadyl analytics report TokenSpendByAgent --since 30dQuota integration
Token usage is one dimension of the canonical project quota substrate. Set quotas at the project level, watch them enforce across all agent runs, see overruns flagged in the dashboard.