Auto-Routing

HAR Commands

HAR commands let you control routing behavior inline with your API requests in three ways: a command in the model field, a suffix on the model slug, or the provider object.

HAR Command Syntax

Send a HAR command as the model field value to control the routing mode per request. No other changes to your request are needed.

Command      Effect
HAR:Auto     Switch to balanced auto-routing (default)
HAR:Quality  Switch to quality-first routing
HAR:Cost     Switch to cost-first routing
HAR:Fast     Switch to latency-first routing
HAR:Off      Disable HAR; use model field as a literal model ID
HAR:?        Query current HAR mode (returns mode in response)

Example
{
  "model": "HAR:Cost",
  "messages": [{"role": "user", "content": "Hello"}]
}
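
The same request body can be built programmatically. The following is a minimal Python sketch; the helper name build_har_request and its exact-match validation are illustrative assumptions, not part of any HAR client library. Only the command strings come from the table above.

```python
import json

# Command strings as listed in the HAR command table.
HAR_COMMANDS = {"HAR:Auto", "HAR:Quality", "HAR:Cost", "HAR:Fast", "HAR:Off", "HAR:?"}


def build_har_request(command: str, prompt: str) -> dict:
    """Return a request body whose model field carries a HAR command.

    Hypothetical helper: it rejects strings that are not in the table,
    assuming commands must match exactly (case included).
    """
    if command not in HAR_COMMANDS:
        raise ValueError(f"unknown HAR command: {command!r}")
    return {
        "model": command,
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_har_request("HAR:Cost", "Hello")
print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice for this sketch: it catches a typo such as "HAR:Cheap" before the request is sent.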

Model Slug Suffixes

Append a suffix to any model slug to apply a routing hint:

Command   Syntax              Effect
:auto     model/slug:auto     Balanced routing: 40% health, 30% reputation, 20% cost, 10% latency (default)
:quality  model/slug:quality  Highest-reputation provider: 60% reputation, 30% health, 10% cost
:cost     model/slug:cost     Cheapest provider: 70% cost, 20% health, 10% reputation
:fast     model/slug:fast     Lowest-latency provider: 60% latency, 30% health, 10% cost
:nitro    model/slug:nitro    Alias for :fast; prefer lowest-latency endpoint
:floor    model/slug:floor    Cheapest model in same capability tier (e.g., gpt-4o:floor → deepseek-chat)
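
To illustrate how the percentage weights above could combine into a single provider ranking, here is a rough Python sketch. It assumes every metric is already normalized to the range 0 to 1 with higher meaning better (so a cost score of 1.0 is the cheapest provider); the source does not say how HAR normalizes or combines metrics, so the scoring function, the helper names, and the sample numbers are all hypothetical.

```python
from __future__ import annotations

# Weights per routing mode, copied from the suffix table.
WEIGHTS = {
    "auto": {"health": 0.4, "reputation": 0.3, "cost": 0.2, "latency": 0.1},
    "quality": {"reputation": 0.6, "health": 0.3, "cost": 0.1},
    "cost": {"cost": 0.7, "health": 0.2, "reputation": 0.1},
    "fast": {"latency": 0.6, "health": 0.3, "cost": 0.1},
}
ALIASES = {"nitro": "fast"}  # :nitro is documented as an alias for :fast


def split_suffix(slug: str) -> tuple[str, str]:
    """Split 'model/slug:mode' into (slug, mode); no weighted suffix means 'auto'.

    :floor selects a different model rather than a weighting, so it is not
    modeled here and such a slug is returned unchanged.
    """
    base, sep, suffix = slug.rpartition(":")
    mode = ALIASES.get(suffix, suffix)
    if sep and mode in WEIGHTS:
        return base, mode
    return slug, "auto"


def score(metrics: dict[str, float], mode: str) -> float:
    """Weighted sum of normalized metrics (assumed: higher is better)."""
    return sum(weight * metrics[name] for name, weight in WEIGHTS[mode].items())


def pick_provider(providers: dict[str, dict[str, float]], mode: str) -> str:
    """Return the name of the provider with the highest weighted score."""
    return max(providers, key=lambda name: score(providers[name], mode))


# Invented sample metrics for three hypothetical providers.
providers = {
    "a": {"health": 0.9, "reputation": 0.9, "cost": 0.3, "latency": 0.5},
    "b": {"health": 0.85, "reputation": 0.6, "cost": 0.9, "latency": 0.6},
    "c": {"health": 0.7, "reputation": 0.5, "cost": 0.5, "latency": 0.95},
}

slug, mode = split_suffix("deepseek/deepseek-chat-v3.2:nitro")
print(slug, mode, pick_provider(providers, mode))
```

With this sample data, each mode favors a different provider, which is the practical point of the suffixes: the same model slug can resolve to different endpoints depending on what the request prioritizes.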

Provider Object Commands

For finer control, use the provider field in the request body:

{
  "model": "deepseek/deepseek-chat-v3.2",
  "messages": [...],
  "provider": {
    "sort": "latency",       // "price" or "latency"
    "only": ["deepseek"],    // restrict to specific providers
    "ignore": ["glm"]        // exclude specific providers
  }
}
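
To make the meaning of sort, only, and ignore concrete, the sketch below applies them to a local list of candidates in Python. The real selection happens on the routing side and may differ; the function name apply_provider_prefs, the candidate record fields (name, price, latency), and the sample numbers are assumptions made for illustration.

```python
from __future__ import annotations


def apply_provider_prefs(candidates: list[dict], prefs: dict) -> list[dict]:
    """Filter and order candidate providers according to a provider object.

    Hypothetical local model of the semantics:
      only   -> keep just the named providers
      ignore -> drop the named providers
      sort   -> order by "price" or "latency", lowest first
    """
    only = prefs.get("only")
    ignore = set(prefs.get("ignore", []))
    kept = [
        c for c in candidates
        if (only is None or c["name"] in only) and c["name"] not in ignore
    ]
    sort_key = prefs.get("sort")
    if sort_key in ("price", "latency"):
        kept.sort(key=lambda c: c[sort_key])
    return kept


# Invented sample data; prices and latencies are not real measurements.
candidates = [
    {"name": "deepseek", "price": 0.27, "latency": 420},
    {"name": "glm", "price": 0.20, "latency": 300},
    {"name": "qwen", "price": 0.35, "latency": 250},
]

ranked = apply_provider_prefs(candidates, {"sort": "latency", "ignore": ["glm"]})
print([c["name"] for c in ranked])
```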

Fallback Chains

Use the models array to define an explicit fallback order. HAR tries each model in sequence until one succeeds:

{
  "models": [
    "deepseek/deepseek-chat-v3.2",
    "qwen/qwen3-235b",
    "glm/glm-4-plus"
  ],
  "messages": [{"role": "user", "content": "Hello"}]
}
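
The try-in-sequence behavior can be pictured with a short Python sketch. HAR performs the fallback itself when it receives a models array, so this loop is only a local illustration of the semantics; complete_with_fallback, call_model, and the simulated outage are hypothetical.

```python
from __future__ import annotations

from typing import Callable


def complete_with_fallback(
    models: list[str], call_model: Callable[[str], str]
) -> tuple[str, str]:
    """Try each model in order and return (model_used, reply) for the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stand-in for a real API call; the first model is made to fail."""
    if model == "deepseek/deepseek-chat-v3.2":
        raise TimeoutError("simulated outage")
    return f"reply from {model}"


used, reply = complete_with_fallback(
    ["deepseek/deepseek-chat-v3.2", "qwen/qwen3-235b", "glm/glm-4-plus"],
    fake_call,
)
print(used, reply)
```

In this run the first model raises, so the second entry in the list handles the request and the third is never tried.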

Suffixes and provider objects can be combined. When both are present, the provider object takes precedence for conflicting settings.
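
As a rough illustration of that precedence rule, the Python sketch below resolves the effective sort order when a request carries both a suffix and a provider object. The mapping from suffixes to sort values (for example, :cost to "price") is an assumption made for this example and is not stated in the source; effective_sort is a hypothetical helper.

```python
from __future__ import annotations

# Assumed equivalence between suffixes and provider.sort values.
SUFFIX_SORT = {"cost": "price", "fast": "latency", "nitro": "latency"}


def effective_sort(model: str, provider: dict | None) -> str | None:
    """Return the sort order in effect; provider.sort wins over the suffix."""
    _, sep, suffix = model.rpartition(":")
    from_suffix = SUFFIX_SORT.get(suffix) if sep else None
    from_provider = (provider or {}).get("sort")
    return from_provider or from_suffix


# The suffix asks for cost-first, the provider object asks for latency.
print(effective_sort("deepseek/deepseek-chat-v3.2:cost", {"sort": "latency"}))
# With no provider object, the suffix alone decides.
print(effective_sort("deepseek/deepseek-chat-v3.2:cost", None))
```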