Token accounting

Preflight budget checks + reconciled usage. Honest cost accounting that survives sub-agent spawns.

Vadyl tracks LLM token usage at every level — per agent run, per plan step, per model invocation, per sub-agent. Budgets are enforced before tokens are spent (preflight) and reconciled after (actuals). This is how the platform turns "agents are unboundedly expensive" into "agents have predictable costs."

Budget composition

Effective budget per run is the intersection of:

Agent definition default (declared in agent.define).
Caller override (passed to agent.run({ budget: ... })).
Parent residual (for sub-agents, what's left of the parent's budget).
Project quota (the canonical billing-substrate cap).

The intersection is the smallest of these — never the largest. Sub-agents can never exceed the parent's residual.
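The composition rule above can be sketched as a minimum over whichever caps are present. This is an illustrative sketch only: `BudgetSources` and `resolveEffectiveBudget` are hypothetical names, not part of the Vadyl API, and throwing when no source is set is an assumption about fail-closed behavior.

```typescript
// Hypothetical shape: each budget source contributes an optional token cap.
interface BudgetSources {
  agentDefault?: number;   // declared in agent.define
  callerOverride?: number; // passed to agent.run({ budget: ... })
  parentResidual?: number; // only present for sub-agents
  projectQuota?: number;   // project-level cap
}

// Returns the smallest cap among the sources that are set.
function resolveEffectiveBudget(sources: BudgetSources): number {
  const caps = [
    sources.agentDefault,
    sources.callerOverride,
    sources.parentResidual,
    sources.projectQuota,
  ].filter((cap): cap is number => cap !== undefined);

  // Assumption: with no cap at all, refuse to run rather than run unbounded.
  if (caps.length === 0) {
    throw new Error("No budget source is set");
  }
  // Clamp at zero so an exhausted source never yields a negative budget.
  return Math.max(0, Math.min(...caps));
}
```

Under this sketch, a caller override larger than the agent default has no effect, because the smaller cap always wins.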

Preflight check

Before any model invocation, ITokenAccountingService.PreflightAsync consults the resolved budget. If the projected token spend would exceed the budget, the invocation fails closed with BudgetExceededException, and the model is never called.

// Inside the agent's invocation path
const preflight = await tokenAccounting.PreflightAsync({
  agentRunId,
  modelId,
  estimatedInputTokens:  request.estimatedTokens,
  estimatedOutputTokens: request.maxOutputTokens,
});
if (!preflight.allowed) throw new BudgetExceededException(preflight.reason);

Reconciliation

After the model returns, RecordUsageAsync records actual input + output token counts and reconciles the run's cumulative usage. The next preflight check sees the updated residual.
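To illustrate how reconciliation feeds the next preflight check, here is a minimal in-memory sketch. `RunLedger` and `ActualUsage` are hypothetical names used for illustration; the real service is ITokenAccountingService, whose internals are not shown in this page.

```typescript
// Hypothetical record of what the model actually consumed.
interface ActualUsage {
  inputTokens: number;
  outputTokens: number;
}

// Tracks cumulative usage for one run against a fixed token budget.
class RunLedger {
  private readonly budget: number;
  private spentTokens = 0;

  constructor(budget: number) {
    this.budget = budget;
  }

  get spent(): number {
    return this.spentTokens;
  }

  get residual(): number {
    return Math.max(0, this.budget - this.spentTokens);
  }

  // Preflight: allow the call only if the projected spend fits the residual.
  preflight(estimatedInputTokens: number, estimatedOutputTokens: number): boolean {
    return estimatedInputTokens + estimatedOutputTokens <= this.residual;
  }

  // Reconciliation: record actuals so the next preflight sees the new residual.
  recordUsage(actual: ActualUsage): void {
    this.spentTokens += actual.inputTokens + actual.outputTokens;
  }
}
```

With a budget of 50,000 tokens, recording 12,345 input and 8,902 output tokens leaves 21,247 spent and 28,753 residual, so a later request projected at 30,000 tokens would be rejected at preflight.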

Sub-agent residual

When an agent spawns a sub-agent, the parent passes its remaining budget into the sub-agent's preflight chain. The sub-agent can narrow that budget but never widen it. When the sub-agent returns, its actual usage is added to the parent's reconciled total, which reduces the parent's residual by the same amount.
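The narrowing rule and the settle-on-return step might look like the following sketch. `RunState`, `residualOf`, `spawnSubAgent`, and `settleSubAgent` are hypothetical helpers, and settling once when the sub-agent returns (rather than streaming usage to the parent) is an assumption.

```typescript
// Hypothetical per-run state: a token budget and the tokens spent so far.
interface RunState {
  budget: number;
  spent: number;
}

function residualOf(run: RunState): number {
  return Math.max(0, run.budget - run.spent);
}

// The sub-agent's budget is capped by the parent's residual.
// A requested budget can narrow that cap but never widen it.
function spawnSubAgent(parent: RunState, requestedBudget?: number): RunState {
  const cap = residualOf(parent);
  const budget = requestedBudget === undefined ? cap : Math.min(requestedBudget, cap);
  return { budget, spent: 0 };
}

// When the sub-agent returns, its actual usage counts against the parent.
function settleSubAgent(parent: RunState, child: RunState): void {
  parent.spent += child.spent;
}
```

For example, a parent with a 50,000 budget and 21,247 spent can hand a sub-agent at most 28,753 tokens, even if the sub-agent asks for 40,000.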

Per-model cost rates

Each model adapter publishes input + output cost rates. Vadyl converts token counts to currency and emits canonical billing usage events tagged with the model id, agent id, run id, and step id. Reports roll up by any of these dimensions.
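As a rough sketch of the conversion and roll-up, assume rates are quoted in USD per million tokens; the actual rate unit is not stated here. `ModelRates`, `BillingUsageEvent`, `costUsd`, and `rollUpCost` are hypothetical names, and any rate values passed in are placeholders rather than published prices.

```typescript
// Assumption: rates are expressed in USD per one million tokens.
interface ModelRates {
  inputUsdPerMillion: number;
  outputUsdPerMillion: number;
}

// Hypothetical event shape carrying the four tagging dimensions.
interface BillingUsageEvent {
  modelId: string;
  agentId: string;
  runId: string;
  stepId: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

type Dimension = "modelId" | "agentId" | "runId" | "stepId";

// Converts token counts to currency using separate input and output rates.
function costUsd(rates: ModelRates, inputTokens: number, outputTokens: number): number {
  const weighted =
    inputTokens * rates.inputUsdPerMillion + outputTokens * rates.outputUsdPerMillion;
  return weighted / 1_000_000;
}

// Sums event cost grouped by one of the tagging dimensions.
function rollUpCost(events: BillingUsageEvent[], by: Dimension): Map<string, number> {
  const totals = new Map<string, number>();
  for (const event of events) {
    const key = event[by];
    totals.set(key, (totals.get(key) ?? 0) + event.costUsd);
  }
  return totals;
}
```

Because every event carries all four ids, the same list of events can be grouped by model, agent, run, or step without re-deriving cost.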

Inspect usage

vadyl agents runs show <runId>
# Output includes:
#   tokens.input:    12_345
#   tokens.output:   8_902
#   tokens.budget:   50_000
#   tokens.spent:    21_247
#   tokens.residual: 28_753
#   cost.usd:        0.234

vadyl analytics report TokenSpendByAgent --since 30d

Quota integration

Token usage is one dimension of the canonical project quota substrate. Set quotas at the project level; they are enforced across all agent runs, and overruns are flagged in the dashboard.
