Limits and quotas

Request limits, batch limits, realtime limits, workflow deadlines, runtime budgets, webhook retry ceilings, token budgets, and quota enforcement.

Limits are not scattered middleware settings. They are quota and policy descriptors tied to project scope, operation class, capability surface, publication version, and usage metering. Enterprise projects can raise many limits through governance-approved quota grants; fail-closed limits remain hard boundaries.

Limit catalog

Limit	Default	Scope	Enforcement
REST request body	10 MiB JSON; larger payloads use Storage or SourceAsset upload.	Request	413 or VALIDATION_FAILED before business execution.
Entity list page size	200 rows by default; final-form enterprise policy may raise this by quota grant.	Project and route	VALIDATION_FAILED when the requested pageSize exceeds policy.
Batch mutation size	1,000 rows or 10 MiB per operation.	Entity operation	VALIDATION_FAILED before transaction opens.
Idempotency key retention	24 hours default; configurable per operation class.	Project	Duplicate keys return prior result or idempotency conflict.
Webhook retries	10 attempts with exponential backoff plus jitter.	Endpoint	Delivery moves to DeadLettered after retry policy exhausts.
Realtime subscriptions	1,000 per project by default; enterprise scopes use quota grants.	Project and actor	429 QUOTA_EXCEEDED on handshake or subscribe.
Runtime desired instances	Per-surface and cumulative project ceilings from governance envelope.	Project, environment, surface, scale group	Compile or mutation fails closed when desired or max instances exceed policy.
Runtime vertical resources	CPU, memory, storage, bandwidth, and accelerators capped by project inheritance policy.	Project, environment, surface	Capability satisfaction and governance checks before realization.
Autoscale decision cadence	30 seconds default with cooldown, hysteresis, stale-sample, rollout, and drain gates.	Autoscale target	Decision skipped with typed reason until the gate clears.
Workflow run duration	30 days default; durable workflow policy can extend.	Workflow definition	Cancellation and compensation policy fires at deadline.
Edge handler CPU time	50 ms default CPU slice, 5 s wall-clock ceiling.	Execution unit	Timed-out invocation with typed BridgeError.
Agent token budget	Per-agent and per-run budget; preflight plus reconciliation.	Agent run	Preflight deny or reconciled overage usage event.
Connector egress timeout	30 seconds default, max set by connection policy.	Governed connection	UPSTREAM_TIMEOUT with retry classification.
OpenAPI/SDL/proto size	Publication descriptor size quota.	Project publication	Publication compile fails closed with descriptor diagnostics.
CLI descriptor cache	ETag validated on every CLI startup.	Local CLI installation	304 reuse or descriptor refetch.
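
The batch mutation row in the catalog (1,000 rows or 10 MiB per operation) can be respected on the client before a request is sent. The following is a minimal sketch, not an official client: `chunk_rows` is a hypothetical helper, and measuring size as the sum of each row's JSON encoding is an assumption that ignores any request envelope the service may add.

```python
import json
from typing import Any, Iterable, Iterator

MAX_ROWS = 1_000                # default row ceiling from the limit catalog
MAX_BYTES = 10 * 1024 * 1024    # 10 MiB default byte ceiling from the limit catalog


def chunk_rows(
    rows: Iterable[Any],
    max_rows: int = MAX_ROWS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[list]:
    """Yield batches that stay within both the row and byte ceilings.

    Size is approximated as the sum of each row's UTF-8 JSON encoding
    (an assumption; the server may measure the payload differently).
    """
    batch: list = []
    size = 0
    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8"))
        if row_bytes > max_bytes:
            # A single oversized row can never fit in any batch.
            raise ValueError("single row exceeds the byte ceiling")
        # Close the current batch if adding this row would break either limit.
        if batch and (len(batch) >= max_rows or size + row_bytes > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_bytes
    if batch:
        yield batch
```

Splitting locally avoids a round trip that would end in VALIDATION_FAILED, but the server-side check remains authoritative, especially where a quota grant has changed the defaults.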

Quota headers

HTTP/1.1 429 Too Many Requests
X-Vadyl-Quota-Kind: read.monthly
X-Vadyl-Quota-Limit: 1000000
X-Vadyl-Quota-Used: 1000001
X-Vadyl-Quota-Reset: 2026-06-01T00:00:00Z

{
  "error": {
    "code": "QUOTA_EXCEEDED",
    "reasonCode": "Quota.ReadMonthly.Exhausted",
    "retryable": false,
    "correlationId": "01HXZ0J4YV8AJF2GFG2T1F7Y42"
  }
}
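
A client can read the quota headers shown above to decide when a retry is worthwhile. The sketch below parses the four `X-Vadyl-Quota-*` headers from a plain dictionary; `QuotaStatus` and `parse_quota_headers` are hypothetical names, and the code assumes the reset value is always an ISO 8601 UTC timestamp like the one in the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping


@dataclass(frozen=True)
class QuotaStatus:
    kind: str
    limit: int
    used: int
    reset: datetime

    @property
    def exhausted(self) -> bool:
        # Usage at or above the limit means no headroom remains.
        return self.used >= self.limit

    def seconds_until_reset(self, now: datetime) -> float:
        # Never negative: a reset in the past means the window already rolled over.
        return max(0.0, (self.reset - now).total_seconds())


def parse_quota_headers(headers: Mapping[str, str]) -> QuotaStatus:
    """Build a QuotaStatus from response headers (header names are case-insensitive)."""
    h = {name.lower(): value for name, value in headers.items()}
    # Replace the trailing "Z" so fromisoformat also works on older Python versions.
    reset = datetime.fromisoformat(h["x-vadyl-quota-reset"].replace("Z", "+00:00"))
    return QuotaStatus(
        kind=h["x-vadyl-quota-kind"],
        limit=int(h["x-vadyl-quota-limit"]),
        used=int(h["x-vadyl-quota-used"]),
        reset=reset,
    )
```

Because the example error body marks the failure as `"retryable": false`, a caller would typically wait until the reset time rather than retry immediately.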

Create a quota

POST /api/Usage/{projectId}/quotas
{
  "kind": "agent.tokens.monthly",
  "limit": 50000000,
  "mode": "hard",
  "window": "calendar-month",
  "dimensions": { "agent": "SupportAgent" }
}

HTTP/1.1 201 Created
{
  "id": "quota_123",
  "kind": "agent.tokens.monthly",
  "mode": "hard",
  "state": "active"
}
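
The request above can be assembled with the Python standard library. This is a sketch under stated assumptions: `build_create_quota_request` is a hypothetical helper, the base URL is a placeholder, and bearer-token authentication is assumed because this page does not describe how requests are authenticated. The function only builds the request; it does not send it.

```python
import json
import urllib.request
from typing import Mapping

# The four modes listed in the enforcement modes table on this page.
MODES = {"hard", "soft", "monitor", "reservation"}


def build_create_quota_request(
    base_url: str,
    project_id: str,
    token: str,
    kind: str,
    limit: int,
    mode: str,
    window: str,
    dimensions: Mapping[str, str],
) -> urllib.request.Request:
    """Return an unsent POST request for /api/Usage/{projectId}/quotas."""
    if mode not in MODES:
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if limit < 0:
        raise ValueError("limit must be non-negative")
    body = {
        "kind": kind,
        "limit": limit,
        "mode": mode,
        "window": window,
        "dimensions": dict(dimensions),
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/Usage/{project_id}/quotas",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer auth; substitute whatever scheme the project uses.
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; a 201 response with `"state": "active"` would then match the example response.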

Enforcement modes

Mode	Behavior
`hard`	Rejects operation before material consumption.
`soft`	Allows operation, emits overage usage event, and triggers policy notifications.
`monitor`	Records usage and warnings only.
`reservation`	Pre-reserves capacity before execution and reconciles actual usage after completion.
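
The difference between the stateless modes in the table can be illustrated with a small decision function. This is an illustrative model only, not the platform's implementation: `evaluate`, `Decision`, and the event names are hypothetical, and the `reservation` mode is left out because it needs state across two phases.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Decision:
    allowed: bool
    events: Tuple[str, ...]  # hypothetical event names, for illustration only


def evaluate(mode: str, used: int, limit: int, amount: int) -> Decision:
    """Decide whether an operation consuming `amount` may proceed."""
    over = used + amount > limit
    if mode == "hard":
        # Rejects before material consumption; nothing is emitted on success.
        return Decision(allowed=not over, events=())
    if mode == "soft":
        # Always allowed; going over emits an overage event and a notification.
        events = ("overage_usage_event", "policy_notification") if over else ()
        return Decision(allowed=True, events=events)
    if mode == "monitor":
        # Records usage and, when over the limit, a warning. Never blocks.
        events = ("usage_recorded", "warning") if over else ("usage_recorded",)
        return Decision(allowed=True, events=events)
    raise ValueError(f"mode not modelled in this sketch: {mode!r}")
```

For example, `evaluate("hard", 990, 1000, 20)` denies the operation, while the same call with `"soft"` allows it and reports an overage.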

Budgeted operations

Agent runs, model invocations, workflow runs, distribution materialization, runtime scaling, vertical resource changes, analytics queries, and storage uploads can reserve budget before execution. Reservation failure returns a typed error without partially starting the operation.
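
The reserve-then-reconcile flow described above can be sketched as a small in-memory ledger. The class and method names are hypothetical, and a real implementation would need durable, concurrent-safe storage; the point here is only that a failed reservation raises before any work starts and that reconciliation replaces the hold with actual usage.

```python
class BudgetExceeded(Exception):
    """Typed error raised when a reservation cannot be satisfied."""


class Budget:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0        # reconciled, actual consumption
        self.reserved = 0    # capacity held by in-flight operations
        self._holds: dict = {}
        self._next_id = 0

    def reserve(self, amount: int) -> int:
        """Hold capacity before execution; raise without changing any state on failure."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        if self.used + self.reserved + amount > self.limit:
            raise BudgetExceeded(f"cannot reserve {amount}")
        self._next_id += 1
        self._holds[self._next_id] = amount
        self.reserved += amount
        return self._next_id

    def reconcile(self, hold_id: int, actual: int) -> int:
        """Release the hold, record actual usage, and return any overage beyond the hold."""
        held = self._holds.pop(hold_id)
        self.reserved -= held
        self.used += actual
        return max(0, actual - held)
```

A caller would reserve an estimate, run the operation only if the reservation succeeds, and then reconcile with the measured usage so that unused capacity returns to the budget.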

Runtime resource budgets

Runtime Fabric enforces both per-surface and cumulative project ceilings: desired instances, max instances, CPU millicores, memory MiB, ephemeral and persistent storage, IOPS, bandwidth, accelerator count, autoscale strategy, load-balancing mode, protocol, and public ingress. Descendant projects inherit the effective envelope and fail closed when a scaling request exceeds it.
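
The two-level check on instance counts can be sketched as follows. This is a simplified model under stated assumptions: it covers only desired instances, not the other dimensions listed above, and `Envelope`, `check_scale`, and `apply_scale` are hypothetical names rather than Runtime Fabric APIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Envelope:
    per_surface_max_instances: int   # ceiling for any single surface
    project_max_instances: int       # cumulative ceiling across all surfaces


def check_scale(env: Envelope, current: Dict[str, int], surface: str, desired: int) -> List[str]:
    """Return the violated ceilings; an empty list means the request fits the envelope."""
    violations = []
    if desired > env.per_surface_max_instances:
        violations.append("per-surface")
    # Cumulative total replaces this surface's current count with the desired one.
    others = sum(count for name, count in current.items() if name != surface)
    if others + desired > env.project_max_instances:
        violations.append("cumulative")
    return violations


def apply_scale(env: Envelope, current: Dict[str, int], surface: str, desired: int) -> Dict[str, int]:
    """Fail closed: raise on any violation and leave `current` untouched."""
    violations = check_scale(env, current, surface, desired)
    if violations:
        raise PermissionError(f"scaling request exceeds envelope: {violations}")
    updated = dict(current)
    updated[surface] = desired
    return updated
```

Note that a request can pass the per-surface check and still be rejected because the project-wide total would exceed the cumulative ceiling.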
