# Send Chat Completion Request

Create an OpenAI-compatible chat completion.

`POST /v1/chat/completions`
## Request Parameters

| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Conditional | Primary model alias in brand/model format. Required unless models is provided. |
| models | string[] | Conditional | Fallback model list. Use when model is omitted, or combine with model for explicit fallback chains. |
| messages | array | Conditional | Conversation messages. Each item requires role; content may be a string or structured content parts. Required unless input is provided. |
| input | string \| array | Conditional | Responses API input: a string or an array of input items. When provided instead of messages, it is auto-translated to Chat Completions format internally. |
| stream | boolean | No | Enable server-sent event streaming. Defaults to false. |
| temperature | number | No | Sampling temperature. Typical range: 0-2. |
| max_tokens | integer | No | Maximum output tokens. |
| top_p | number | No | Nucleus sampling. Typical range: 0-1. |
| tools | array | No | Tool definitions. |
| tool_choice | object \| string | No | Tool selection mode. |
| response_format | object | No | Structured response constraints. |
| reasoning | object | No | Reasoning controls. |
| logprobs | boolean | No | Include token log probabilities. |
| top_logprobs | integer | No | Number of top log probabilities to return per token. |
| logit_bias | object | No | Token bias map. |
| max_completion_tokens | integer | No | Maximum completion tokens. |
| stream_options | object | No | Streaming options. Set stream_options.include_usage=true to receive a separate final usage chunk. |
| stop | string \| string[] | No | Stop sequences. |
| frequency_penalty | number | No | Frequency penalty. |
| presence_penalty | number | No | Presence penalty. |
| seed | integer | No | Deterministic sampling seed. |
| user | string | No | End-user identifier for analytics and abuse controls. |
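For orientation, here is a minimal Python sketch of assembling a request body from the parameters above. The helper `build_chat_request` is hypothetical, not part of the API; it simply enforces the two core fields and passes everything else through unchanged.

```python
import json

# Hypothetical helper (not part of the API): build a Chat Completions
# request body. model and messages are required in this sketch; any other
# parameter from the table above passes through as-is.
def build_chat_request(model, messages, **options):
    payload = {"model": model, "messages": messages}
    payload.update(options)  # temperature, max_tokens, stream, etc.
    return payload

payload = build_chat_request(
    "anthropic/claude-sonnet-4.5",
    [{"role": "user", "content": "Write a haiku about routing."}],
    temperature=0.7,
    max_tokens=180,
)
body = json.dumps(payload)  # ready to POST to /v1/chat/completions
```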
## OpenRouter Routing Extensions

| Name | Type | Required | Description |
|---|---|---|---|
| models | string[] | No | Model fallback chain. The gateway composes a route and tries models in order. |
| provider | object | No | Provider preferences: allow_fallbacks, order, only, ignore. |
| route | string | No | Routing strategy: fallback or sort. |
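The fallback behavior described above can be sketched as follows. This is an illustrative reconstruction, not the gateway's actual routing algorithm: the primary model comes first, then the models fallbacks, deduplicated in order.

```python
# Illustrative sketch (not the gateway's actual algorithm): compose the
# candidate chain -- primary model first, then fallbacks, deduplicated
# while preserving order.
def compose_route(model=None, models=None):
    chain = []
    for candidate in ([model] if model else []) + (models or []):
        if candidate not in chain:
            chain.append(candidate)
    if not chain:
        raise ValueError("Provide either model or models")
    return chain

compose_route(
    model="anthropic/claude-sonnet-4.5",
    models=["anthropic/claude-sonnet-4.5", "openai/gpt-4o"],
)
# -> ["anthropic/claude-sonnet-4.5", "openai/gpt-4o"]
```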
## Examples (Non-Streaming)

```bash
curl -X POST https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Write a haiku about routing."}],
    "temperature": 0.7,
    "max_tokens": 180,
    "top_p": 1,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "stop": ["\n\n"],
    "seed": 42,
    "models": ["anthropic/claude-sonnet-4.5", "openai/gpt-4o"],
    "provider": {"allow_fallbacks": true, "order": ["openai-api"]},
    "route": "fallback",
    "stream": false
  }'
```

## Examples (Streaming)
```bash
curl --no-buffer -X POST https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Stream a short answer."}],
    "stream": true,
    "stream_options": {"include_usage": true}
  }'
```

## Streaming Behavior
```text
data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":10,"completion_tokens":4,"total_tokens":14}}

data: [DONE]
```

Usage is emitted as a separate final chunk with choices: [] only when stream_options.include_usage is true.
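A minimal Python sketch of consuming this stream: accumulate delta.content from each chunk, and capture the trailing usage chunk, which arrives with an empty choices array only when stream_options.include_usage is true.

```python
import json

# Sketch of consuming the SSE lines above: collect delta.content and
# capture the usage-only trailer chunk (choices == []).
def consume_sse(lines):
    text, usage = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separators and SSE comments
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if not chunk["choices"]:  # usage-only trailer chunk
            usage = chunk.get("usage")
            continue
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text), usage
```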
## Error Shape

```json
{
  "error": {
    "message": "Missing required parameter: model",
    "type": "invalid_request_error",
    "code": 400,
    "param": "model"
  }
}
```

Error bodies do not include an error.metadata object. Request IDs are returned via the x-request-id response header.
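A sketch of surfacing this error shape alongside the request ID. `describe_error` is a hypothetical helper; since error bodies carry no metadata object, the x-request-id header is the only place a trace identifier appears.

```python
import json

# Hypothetical helper: format the error body above together with the
# x-request-id header (the only request identifier the API returns).
def describe_error(body, headers):
    err = json.loads(body)["error"]
    rid = headers.get("x-request-id", "unknown")
    return (f'{err["type"]} ({err["code"]}) on {err["param"]!r}: '
            f'{err["message"]} [request {rid}]')
```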
## Response Headers

| Name | Type | Required | Description |
|---|---|---|---|
| X-Cache | response header | | Cache status: HIT or MISS. |
| X-Cache-Model | response header | | Upstream model used for a cache hit. |
| x-request-id | response header | | Request trace ID (UUIDv4). |
| X-RateLimit-Limit | response header | | Request quota limit for the current window. |
| X-RateLimit-Remaining | response header | | Remaining requests in the current window. |
| X-RateLimit-Reset | response header | | Unix epoch reset timestamp. |
| Retry-After | response header | | Present on 429 responses. Value is seconds. |
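The rate-limit headers above suggest a simple backoff rule, sketched here under the assumption that headers arrive as a plain dict (real HTTP header lookups are case-insensitive): prefer Retry-After (seconds) when present, otherwise wait until X-RateLimit-Reset (Unix epoch seconds).

```python
import time

# Sketch: pick a backoff delay from the response headers. Assumes a
# plain dict with the exact header casing shown in the table above.
def retry_delay(headers, now=None):
    now = time.time() if now is None else now
    if "Retry-After" in headers:           # sent on 429 responses
        return float(headers["Retry-After"])
    if "X-RateLimit-Reset" in headers:     # Unix epoch seconds
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0
```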
## Model Requirement

Provide either model or models. Requests without both are invalid.
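This rule reduces to a one-line client-side check, sketched here as a hypothetical pre-flight validation:

```python
# The rule above as a pre-flight check: a request body is valid only if
# it names at least one model via model or a non-empty models list.
def has_model(payload):
    return bool(payload.get("model") or payload.get("models"))
```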