AI as a native plane

Model selection that declares its needs.

An agent author declares what the agent needs: tier (frontier / mid / fast), required capabilities (vision, tool use, JSON mode, structured output), context window, and latency budget. ILlmCapabilityRegistry knows what every adapter offers. ILlmModelRouter resolves through the project's LlmConnectionBinding chain with a DeprecationFallbackPolicy. It is the same declarative shape as DbCapabilities: model selection is not magical; it is matched.
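As a rough illustration of "matched, not magical", the sketch below resolves a requirements declaration against an ordered chain of declared capabilities. It is written in Python for brevity and is not Vadyl's API: the class names, field names, model ids, and the rule that the tier must match exactly are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    FAST = 1
    MID = 2
    FRONTIER = 3


@dataclass(frozen=True)
class ModelCapabilities:
    """What an adapter declares for one model (hypothetical shape)."""
    model_id: str
    tier: Tier
    features: frozenset
    context_window: int
    typical_latency_ms: int


@dataclass(frozen=True)
class ModelRequirements:
    """What an agent author declares (hypothetical shape)."""
    tier: Tier
    features: frozenset
    min_context_window: int
    latency_budget_ms: int


def satisfies(caps: ModelCapabilities, req: ModelRequirements) -> bool:
    # Assumption: tier must match exactly; every other field is a bound.
    return (
        caps.tier == req.tier
        and req.features <= caps.features  # required features are a subset
        and caps.context_window >= req.min_context_window
        and caps.typical_latency_ms <= req.latency_budget_ms
    )


def resolve(chain, req):
    """Return the first model, in binding-chain order, that satisfies req."""
    for caps in chain:
        if satisfies(caps, req):
            return caps
    return None


# Hypothetical chain; the model ids are placeholders, not real models.
chain = [
    ModelCapabilities("example-fast", Tier.FAST,
                      frozenset({"tool-use"}), 32_000, 800),
    ModelCapabilities("example-frontier-small", Tier.FRONTIER,
                      frozenset({"tool-use"}), 200_000, 3_000),
    ModelCapabilities("example-frontier-large", Tier.FRONTIER,
                      frozenset({"vision", "tool-use", "structured-output"}),
                      200_000, 4_000),
]
req = ModelRequirements(Tier.FRONTIER, frozenset({"vision", "tool-use"}),
                        100_000, 5_000)
chosen = resolve(chain, req)
```

Because the chain is walked in a fixed order and the predicate is pure, the same requirements against the same declarations always yield the same model.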


The four built-in adapters

Anthropic. OpenAI. DeepSeek. Moonshot. Peer adapters.

ModelInferenceAdapter = 8

Hosted-LLM access lives canonically as CapabilitySurfaceKind.ModelInferenceAdapter. Built-in adapters live exclusively under Vadyl.Connectors.NativeAdapter/ModelInference/<Provider>/V<N>/. Provider SDKs do not leak past that boundary.

Capability declarations

Each adapter declares per-model capabilities — context window, modalities, tool-use shape, structured output, prompt cache, vision. Routers match requirements against declared caps deterministically.

Typed failure classification

ModelInvocationFailureKind classifies failures by type: AuthenticationFailed, RateLimited, Overloaded, ProviderTimeout, ContextWindowExceeded, ContentFiltered. Callers never match on Message.Contains; anti-pattern #89 is codified.
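To show why a typed kind beats string matching, here is a minimal Python sketch. The enum members mirror the kinds listed above; the exception class, the `is_retryable` helper, and the choice of which kinds count as retryable are assumptions for illustration, not documented Vadyl behaviour.

```python
from enum import Enum, auto


class ModelInvocationFailureKind(Enum):
    AUTHENTICATION_FAILED = auto()
    RATE_LIMITED = auto()
    OVERLOADED = auto()
    PROVIDER_TIMEOUT = auto()
    CONTEXT_WINDOW_EXCEEDED = auto()
    CONTENT_FILTERED = auto()


class ModelInvocationError(Exception):
    """Hypothetical error type that carries a typed kind alongside the message."""

    def __init__(self, kind: ModelInvocationFailureKind, message: str):
        super().__init__(message)
        self.kind = kind


# Assumption: transient provider conditions are worth retrying; the rest are not.
RETRYABLE = frozenset({
    ModelInvocationFailureKind.RATE_LIMITED,
    ModelInvocationFailureKind.OVERLOADED,
    ModelInvocationFailureKind.PROVIDER_TIMEOUT,
})


def is_retryable(error: ModelInvocationError) -> bool:
    # Branch on the typed kind, never on the wording of the message.
    return error.kind in RETRYABLE
```

The decision stays stable even if a provider rewords its error text, because the message is never inspected.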

Deprecation fallback policy

When the requested model is deprecated or returns an error class the policy considers fallback-eligible, the router walks to the next eligible model in the binding chain. Operators see the swap; it is never silent.
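The walk can be sketched as a loop over the binding chain that records every skipped model with its reason. This is a Python illustration under stated assumptions: `BoundModel`, `RouteResult`, the `invoke` callback, and the set of fallback-eligible kinds are hypothetical, and a real policy would decide eligibility itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BoundModel:
    model_id: str
    deprecated: bool = False


@dataclass
class RouteResult:
    model_id: str
    # Operator-visible record of every swap: (skipped model, reason).
    swaps: list = field(default_factory=list)


# Assumption: which failure kinds allow fallback is a policy decision.
FALLBACK_ELIGIBLE = frozenset({"Overloaded", "ProviderTimeout"})


def route(chain, invoke):
    """Walk the chain in order.

    invoke(model_id) is a hypothetical callback that returns None on
    success or the name of a failure kind on error.
    """
    swaps = []
    for model in chain:
        if model.deprecated:
            reason = "deprecated"
        else:
            failure = invoke(model.model_id)
            if failure is None:
                return RouteResult(model.model_id, swaps)
            if failure not in FALLBACK_ELIGIBLE:
                # Not eligible for fallback: surface the failure as-is.
                raise RuntimeError(f"{model.model_id} failed with {failure}")
            reason = failure
        swaps.append((model.model_id, reason))
    raise RuntimeError("binding chain exhausted")
```

Returning the swap list with the result is one way to keep the fallback visible rather than silent.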

Token accounting preflight

ITokenAccountingService.PreflightAsync runs before dispatch. It intersects the budgets from the definition default, the caller override, and, for sub-agents, the parent's residual. The request is refused if the intersection is empty.
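One way to picture the intersection, assuming each budget is a simple token ceiling: the effective budget is the tightest ceiling present, and the intersection is empty when that ceiling is zero or smaller than the estimated request. The Python function below is a sketch of that reading only; the real service may model budgets differently, and every name here is hypothetical.

```python
from typing import Optional


class BudgetRefused(Exception):
    """Raised when no budget remains for the request (hypothetical)."""


def preflight(estimated_tokens: int,
              definition_default: int,
              caller_override: Optional[int] = None,
              parent_residual: Optional[int] = None) -> int:
    """Return the effective token ceiling, or refuse before dispatch.

    Assumption: each budget is a ceiling, so intersecting them means
    taking the minimum of the ceilings that are present.
    """
    ceilings = [c for c in (definition_default, caller_override, parent_residual)
                if c is not None]
    effective = min(ceilings)
    if effective <= 0 or estimated_tokens > effective:
        raise BudgetRefused(
            f"estimated {estimated_tokens} tokens exceeds effective budget {effective}"
        )
    return effective
```

For a sub-agent with a definition default of 8,000, a caller override of 4,000, and 2,500 tokens left from its parent, a request estimated at 1,000 tokens passes with an effective ceiling of 2,500.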

Reconciled usage

RecordUsageAsync reconciles after dispatch. Actual tokens counted from the provider response. Drives billing through the canonical UsageEvent ledger. Rolls up to ProjectQuota.
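A minimal Python sketch of the reconciliation step, assuming an append-only ledger and a per-project token quota. The field names, the `Ledger` class, and the rollup rule (input plus output tokens) are illustrative assumptions, not the actual UsageEvent or ProjectQuota schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UsageEvent:
    project_id: str
    model_id: str
    input_tokens: int   # taken from the provider response, not the estimate
    output_tokens: int


@dataclass
class ProjectQuota:
    limit_tokens: int
    used_tokens: int = 0


@dataclass
class Ledger:
    events: list = field(default_factory=list)
    quotas: dict = field(default_factory=dict)

    def record_usage(self, event: UsageEvent) -> int:
        """Append the event and roll it up; return the remaining quota."""
        self.events.append(event)
        quota = self.quotas[event.project_id]
        quota.used_tokens += event.input_tokens + event.output_tokens
        return quota.limit_tokens - quota.used_tokens
```

Recording the provider-reported counts, rather than the preflight estimate, is what keeps billing tied to real usage.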

GovernedConnection layer

From authored code, hosted-LLM access still flows through a GovernedConnection of type Llm — for project-scoped binding ergonomics. The adapter itself is the canonical UCSA kind.

Self-hosted via Runtime Fabric

vLLM, Ollama, GPU pools, on-prem inference clusters land canonically through CapabilitySurfaceKind.RuntimeSubstrate with the same scaling and vertical resource policy. No separate kind for self-hosted — same shape, different substrate.

Multi-language adapter authoring

New model providers can ship as built-in native, declarative bundle, or authored Wasm component. WIT contracts ensure conformance regardless of source language.

Built-in adapters: Anthropic · OpenAI · DeepSeek · Moonshot

Tier + requirements: Frontier / mid / fast + per-cap

Typed failure classification: Never Message.Contains

Preflight + reconciled: Budget intersection, real usage

Routing that declares its needs.

Author the agent. Declare the requirements. Vadyl resolves the right model through the binding chain — with budget intersection, with deprecation fallback, with typed failure handling.
