Limits and quotas
Request limits, batch limits, realtime limits, workflow deadlines, runtime budgets, webhook retry ceilings, token budgets, and quota enforcement.
Limits are not scattered middleware settings. They are quota and policy descriptors tied to project scope, operation class, capability surface, publication version, and usage metering. Enterprise projects can raise many limits through governance-approved quota grants; fail-closed limits remain hard boundaries.
Limit catalog
| Limit | Default | Scope | Enforcement |
|---|---|---|---|
| REST request body | 10 MiB JSON; larger payloads use Storage or SourceAsset upload. | Request | 413 or VALIDATION_FAILED before business execution. |
| Entity list page size | 200 rows by default; enterprise policy may raise it through a quota grant. | Project and route | VALIDATION_FAILED when requested pageSize exceeds policy. |
| Batch mutation size | 1,000 rows or 10 MiB per operation. | Entity operation | VALIDATION_FAILED before transaction opens. |
| Idempotency key retention | 24 hours default; configurable per operation class. | Project | Duplicate keys return prior result or idempotency conflict. |
| Webhook retries | 10 attempts with exponential backoff plus jitter. | Endpoint | Delivery moves to DeadLettered after retry policy exhausts. |
| Realtime subscriptions | 1,000 per project by default; enterprise scopes use quota grants. | Project and actor | 429 QUOTA_EXCEEDED on handshake or subscribe. |
| Runtime desired instances | Per-surface and cumulative project ceilings from governance envelope. | Project, environment, surface, scale group | Compile or mutation fail-closed when desired/max exceeds policy. |
| Runtime vertical resources | CPU, memory, storage, bandwidth, and accelerators capped by project inheritance policy. | Project, environment, surface | Capability satisfaction and governance checks before realization. |
| Autoscale decision cadence | 30 seconds default with cooldown, hysteresis, stale-sample, rollout, and drain gates. | Autoscale target | Decision skipped with typed reason until the gate clears. |
| Workflow run duration | 30 days default; durable workflow policy can extend. | Workflow definition | Cancellation and compensation policy fires at deadline. |
| Edge handler CPU time | 50 ms default CPU slice, 5 s wall-clock ceiling. | Execution unit | Timed-out invocation with typed BridgeError. |
| Agent token budget | Per-agent and per-run budget; preflight plus reconciliation. | Agent run | Preflight deny or reconciled overage usage event. |
| Connector egress timeout | 30 seconds default, max set by connection policy. | Governed connection | UPSTREAM_TIMEOUT with retry classification. |
| OpenAPI/SDL/proto size | Publication descriptor size quota. | Project publication | Publication compile fails closed with descriptor diagnostics. |
| CLI descriptor cache | ETag validated on every CLI startup. | Local CLI installation | 304 reuse or descriptor refetch. |
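A client can avoid VALIDATION_FAILED on batch mutations by splitting its rows before sending them. The following is a minimal sketch, not part of the platform SDK: `chunk_rows` is a hypothetical helper, and it assumes that both default ceilings from the catalog (1,000 rows and 10 MiB) apply to each operation and that size is measured on the compact JSON encoding of the row array.

```python
import json
from typing import Any, Iterable, Iterator

MAX_ROWS = 1_000                 # default row ceiling from the limit catalog
MAX_BYTES = 10 * 1024 * 1024     # default 10 MiB ceiling from the limit catalog


def chunk_rows(
    rows: Iterable[dict[str, Any]],
    max_rows: int = MAX_ROWS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[list[dict[str, Any]]]:
    """Yield batches that stay within both the row and the byte ceiling.

    Assumption: the byte ceiling is checked against the compact JSON
    encoding of the batch as a JSON array.
    """
    batch: list[dict[str, Any]] = []
    batch_bytes = 2  # the enclosing "[" and "]"
    for row in rows:
        row_bytes = len(json.dumps(row, separators=(",", ":")).encode("utf-8"))
        if row_bytes + 2 > max_bytes:
            raise ValueError("a single row exceeds the batch byte ceiling")
        extra = row_bytes + (1 if batch else 0)  # comma between rows
        if batch and (len(batch) >= max_rows or batch_bytes + extra > max_bytes):
            yield batch
            batch, batch_bytes = [], 2
            extra = row_bytes  # first row of a new batch needs no comma
        batch.append(row)
        batch_bytes += extra
    if batch:
        yield batch


if __name__ == "__main__":
    rows = [{"id": i} for i in range(2_500)]
    print([len(b) for b in chunk_rows(rows)])
```

Rows larger than the byte ceiling are rejected outright in this sketch; per the catalog, payloads of that size belong in Storage or a SourceAsset upload instead.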
Quota headers
HTTP/1.1 429 Too Many Requests
X-Vadyl-Quota-Kind: read.monthly
X-Vadyl-Quota-Limit: 1000000
X-Vadyl-Quota-Used: 1000001
X-Vadyl-Quota-Reset: 2026-06-01T00:00:00Z
{
  "error": {
    "code": "QUOTA_EXCEEDED",
    "reasonCode": "Quota.ReadMonthly.Exhausted",
    "retryable": false,
    "correlationId": "01HXZ0J4YV8AJF2GFG2T1F7Y42"
  }
}
Create a quota
POST /api/Usage/{projectId}/quotas
{
  "kind": "agent.tokens.monthly",
  "limit": 50000000,
  "mode": "hard",
  "window": "calendar-month",
  "dimensions": { "agent": "SupportAgent" }
}

HTTP/1.1 201 Created
{
  "id": "quota_123",
  "kind": "agent.tokens.monthly",
  "mode": "hard",
  "state": "active"
}
Enforcement modes
| Mode | Behavior |
|---|---|
| hard | Rejects operation before material consumption. |
| soft | Allows operation, emits overage usage event, and triggers policy notifications. |
| monitor | Records usage and warnings only. |
| reservation | Pre-reserves capacity before execution and reconciles actual usage after completion. |
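The four modes in the table differ only in what happens when an operation would push usage past the limit. The sketch below illustrates that decision as a toy in-memory model. The names `Quota`, `Decision`, and `admit`, the event tuples, and the `RESERVATION_DENIED` reason are hypothetical and chosen for illustration; they are not the platform's enforcement code.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    HARD = "hard"
    SOFT = "soft"
    MONITOR = "monitor"
    RESERVATION = "reservation"


@dataclass
class Quota:
    limit: int
    mode: Mode
    used: int = 0
    reserved: int = 0
    events: list = field(default_factory=list)  # illustrative event log


@dataclass
class Decision:
    allowed: bool
    reason: str


def admit(quota: Quota, amount: int) -> Decision:
    """Decide whether an operation consuming `amount` may proceed."""
    projected = quota.used + quota.reserved + amount
    overage = projected - quota.limit

    if quota.mode is Mode.HARD:
        if overage > 0:
            return Decision(False, "QUOTA_EXCEEDED")  # nothing is consumed
        quota.used += amount
        return Decision(True, "ok")

    if quota.mode is Mode.SOFT:
        quota.used += amount
        if overage > 0:
            quota.events.append(("overage", overage))  # would also notify
            return Decision(True, "overage")
        return Decision(True, "ok")

    if quota.mode is Mode.MONITOR:
        quota.used += amount
        if overage > 0:
            quota.events.append(("warning", overage))
        return Decision(True, "ok")

    # Mode.RESERVATION: hold capacity now; actual usage is reconciled later.
    if overage > 0:
        return Decision(False, "RESERVATION_DENIED")  # hypothetical reason
    quota.reserved += amount
    return Decision(True, "reserved")
```

In this model only hard and reservation can refuse an operation; soft and monitor always admit it and differ in the event they record.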
Budgeted operations
Agent runs, model invocations, workflow runs, distribution materialization, runtime scaling, vertical resource changes, analytics queries, and storage uploads can reserve budget before execution. Reservation failure returns a typed error without partially starting the operation.
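The reserve-then-reconcile lifecycle described above can be sketched as a small ledger. This is an illustrative model under stated assumptions: `BudgetLedger`, `ReservationDenied`, and `run_budgeted` are hypothetical names, the operation is assumed to report its own actual usage, and a failed operation is assumed to release its hold without being charged.

```python
import uuid
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class ReservationDenied(Exception):
    """Typed error raised before the operation starts (hypothetical name)."""


class BudgetLedger:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0
        self._holds: dict[str, int] = {}

    @property
    def available(self) -> int:
        # Capacity not yet consumed and not held by an open reservation.
        return self.limit - self.used - sum(self._holds.values())

    def reserve(self, estimate: int) -> str:
        if estimate > self.available:
            raise ReservationDenied(
                f"requested {estimate}, available {self.available}"
            )
        reservation_id = uuid.uuid4().hex
        self._holds[reservation_id] = estimate
        return reservation_id

    def reconcile(self, reservation_id: str, actual: int) -> int:
        """Release the hold, record actual usage, return the overage.

        A positive result means the operation used more than it reserved.
        """
        estimate = self._holds.pop(reservation_id)
        self.used += actual
        return actual - estimate


def run_budgeted(
    ledger: BudgetLedger,
    estimate: int,
    operation: Callable[[], Tuple[T, int]],
) -> T:
    """Run `operation` only if its estimated cost can be reserved first."""
    reservation_id = ledger.reserve(estimate)  # raises before any work starts
    try:
        result, actual = operation()
    except Exception:
        ledger.reconcile(reservation_id, 0)  # assumption: failures cost nothing
        raise
    ledger.reconcile(reservation_id, actual)
    return result
```

For example, with `BudgetLedger(100)`, a run that reserves 60 and reports 45 leaves 55 available, and a later attempt to reserve 80 raises `ReservationDenied` without invoking the operation.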
Runtime resource budgets
Runtime Fabric enforces both per-surface and cumulative project ceilings: desired instances, max instances, CPU millicores, memory MiB, ephemeral and persistent storage, IOPS, bandwidth, accelerator count, autoscale strategy, load-balancing mode, protocol, and public ingress. Descendant projects inherit the effective envelope and fail closed when a scaling request exceeds it.
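The interaction between per-surface and cumulative ceilings can be illustrated with a small check. The sketch below covers only three of the listed dimensions and is an assumption-laden model, not Runtime Fabric's implementation: `Envelope`, `effective_envelope`, and `check_scale_request` are hypothetical names, and inheritance is modeled as taking the stricter value of parent and child on each dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    max_instances_per_surface: int
    max_instances_total: int
    max_cpu_millicores_total: int


def effective_envelope(parent: Envelope, child: Envelope) -> Envelope:
    """Assumption: a descendant can never exceed its parent's ceilings."""
    return Envelope(
        min(parent.max_instances_per_surface, child.max_instances_per_surface),
        min(parent.max_instances_total, child.max_instances_total),
        min(parent.max_cpu_millicores_total, child.max_cpu_millicores_total),
    )


def check_scale_request(
    envelope: Envelope,
    current: dict[str, tuple[int, int]],  # surface -> (instances, millicores each)
    surface: str,
    desired_instances: int,
    cpu_millicores_each: int,
) -> list[str]:
    """Return the violated ceilings; an empty list means the request fits.

    Failing closed means the caller rejects the request on any violation.
    """
    projected = dict(current)
    projected[surface] = (desired_instances, cpu_millicores_each)
    total_instances = sum(n for n, _ in projected.values())
    total_cpu = sum(n * cpu for n, cpu in projected.values())

    violations: list[str] = []
    if desired_instances > envelope.max_instances_per_surface:
        violations.append("surface.instances")
    if total_instances > envelope.max_instances_total:
        violations.append("project.instances")
    if total_cpu > envelope.max_cpu_millicores_total:
        violations.append("project.cpu_millicores")
    return violations
```

A request can satisfy its own surface ceiling and still be rejected: with a per-surface ceiling of 10 and a project ceiling of 15, scaling one surface to 10 instances fails if another surface already runs 6.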