Bring Your Own Key • GA

Your Keys. Our Infrastructure.

Use your own provider accounts. Get health-aware routing, automatic failover, and full usage visibility without giving up control.

Keep your existing provider relationships and pricing. TheRouter adds reliability and observability on top — your keys, your costs, our network.

OpenAI
Anthropic
Google
Your Keys
75+ Models
4 Providers
2M+ Max Context
<200ms Routing Latency
3 lines To Integrate

Powering AI across leading model brands

Anthropic
OpenAI
Google
xAI

Why TheRouter

Not just an API proxy — a control plane that makes your AI stack more reliable, transparent, and flexible.

1 line
to switch providers

Zero Lock-in

Keep your provider accounts and pricing. Switch models or providers by changing one string — no migration, no rewrite.

  • OpenAI SDK compatible — works with Cursor, Claude Code, and any OpenAI-compatible client
  • Bring your own API keys, keep your existing contracts
  • Add or drop a provider in seconds, not sprints
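The one-string switch can be sketched like this: the request payload stays identical across providers, and only the model identifier changes (the model IDs below are examples):

```typescript
// The request shape never changes across providers;
// switching is literally a one-string edit.
type ChatParams = {
  model: string;
  messages: { role: "user"; content: string }[];
};

function chatParams(model: string): ChatParams {
  return {
    model,
    messages: [{ role: "user", content: "Hello!" }],
  };
}

// Same workload, different provider: swap one string.
const onAnthropic = chatParams("anthropic/claude-sonnet-4.6");
const onOpenAI = chatParams("openai/gpt-4o");
```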
3 controls
fallback, objective, basket

Smart Routing

Automatic failover, provider health tracking, and cost-aware routing across your approved models keep requests moving, with every routing decision recorded so you can see exactly what the router did.

  • Automatic failover when a provider goes down
  • Approved model-basket optimization (Beta)
  • Measured routing evidence in request logs and analytics
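The failover behavior can be pictured as an ordered walk over approved routes, skipping providers currently marked unhealthy. This is an illustration of the observable behavior, not TheRouter's actual implementation:

```typescript
// Illustrative sketch: try approved routes in priority order,
// skipping providers whose health checks are failing.
type Route = { provider: string; healthy: boolean };

function selectRoute(routes: Route[]): Route {
  for (const route of routes) {
    if (route.healthy) return route;
  }
  throw new Error("all approved routes are down");
}

const selected = selectRoute([
  { provider: "anthropic", healthy: false }, // primary is down
  { provider: "openai", healthy: true },     // failover target
  { provider: "google", healthy: true },
]);
// selected.provider is "openai": the first healthy route in priority order
```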
100%
spend transparency

Full Visibility

See exactly where every token goes. Track usage by team, key, or model — and set limits before you get a surprise bill.

  • Real-time token & cost tracking per API key
  • Spending limits and budget alerts
  • Full audit trail for every request
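The per-key budget idea reduces to simple bookkeeping: tally cost per API key and flag keys over their limit. A minimal sketch with hypothetical field names:

```typescript
// Hypothetical usage-event shape: cost attributed to an API key.
type UsageEvent = { apiKey: string; costUsd: number };

// Return the keys whose accumulated spend exceeds their budget.
function overBudget(
  events: UsageEvent[],
  limits: Record<string, number>
): string[] {
  const spend: Record<string, number> = {};
  for (const e of events) spend[e.apiKey] = (spend[e.apiKey] ?? 0) + e.costUsd;
  return Object.keys(spend).filter((k) => spend[k] > (limits[k] ?? Infinity));
}

const flagged = overBudget(
  [
    { apiKey: "team-a", costUsd: 40 },
    { apiKey: "team-a", costUsd: 70 },
    { apiKey: "team-b", costUsd: 5 },
  ],
  { "team-a": 100, "team-b": 100 }
);
// flagged contains only "team-a" (110 > 100)
```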

Cut AI Spend With Guardrails

TheRouter lets you compare baseline cost against the selected route, shadow-test cheaper approved models, and verify the savings in logs and analytics.

Beta

Approved Model Basket

Pick a baseline model, then approve cheaper alternatives for the same workload. Start in shadow mode, then opt into live cost routing only for that basket.

Baseline: Claude Sonnet 4.6 → Approved alt: GPT-4.1 mini
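The shadow-then-live progression can be sketched as a routing decision: in shadow mode the baseline always serves traffic while the approved alternative is only evaluated; going live lets the alternative serve eligible requests. Model names and field names here are illustrative:

```typescript
// Illustrative basket decision: which model serves, and which
// (if any) is evaluated in shadow mode alongside it.
type BasketDecision = { serve: string; shadow?: string };

function routeBasket(opts: {
  baseline: string;
  approved: string[];
  live: boolean;
}): BasketDecision {
  if (!opts.live) {
    // Shadow mode: baseline serves; the candidate is only scored.
    return { serve: opts.baseline, shadow: opts.approved[0] };
  }
  // Live cost routing, opted into for this basket only.
  return { serve: opts.approved[0] };
}

const shadowPhase = routeBasket({
  baseline: "anthropic/claude-sonnet-4.6",
  approved: ["openai/gpt-4.1-mini"],
  live: false,
});
// shadowPhase.serve is still the baseline; the alternative runs in shadow
```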
GA

Prompt Caching

Long repeated instructions can bill at cached rates when the upstream route supports it. This is especially useful for agents and workflows with heavy system prompts.

Cached prompt input can be ~10x cheaper than uncached input
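The ~10x figure compounds quickly for agents that resend a heavy system prompt on every call. A back-of-envelope sketch (the rates and token counts below are illustrative assumptions, not TheRouter pricing):

```typescript
// Illustrative rates: $3.00 per 1M uncached input tokens,
// cached input assumed ~10x cheaper.
const uncachedPerToken = 3.0 / 1_000_000;
const cachedPerToken = uncachedPerToken / 10;

// Agent workload: a 50k-token system prompt resent on 1,000 calls.
const systemTokens = 50_000;
const calls = 1_000;

const withoutCache = systemTokens * calls * uncachedPerToken; // $150
const withCache =
  systemTokens * uncachedPerToken +             // first call populates the cache
  systemTokens * (calls - 1) * cachedPerToken;  // later calls hit cached rates
// withCache ≈ $15.14, roughly 10x less than the $150 uncached cost
```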
GA

Measured Savings Evidence

Request logs and Activity show baseline charge, selected route, realized savings, and shadow-mode recommendations so teams can verify what changed.

Logs: baseline vs selected vs saved, per request
Illustrative example: support triage team
Before: every request stays on one premium baseline model
After: baseline stays protected, a cheaper approved model runs in shadow first, then goes live for eligible flows
Illustrative: 20-35% lower model spend

Illustrative example only. Actual savings depend on your approved model basket, traffic mix, and prompt shape.
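The baseline-vs-selected bookkeeping shown in the logs is simple arithmetic. A sketch of how a per-request savings line could be derived (the field names are hypothetical, not TheRouter's log schema):

```typescript
// Hypothetical log fields: what the baseline model would have charged
// versus what the selected route actually charged.
type RequestLog = { baselineUsd: number; selectedUsd: number };

function savings(log: RequestLog): { savedUsd: number; savedPct: number } {
  const savedUsd = log.baselineUsd - log.selectedUsd;
  return { savedUsd, savedPct: (savedUsd / log.baselineUsd) * 100 };
}

// A request rerouted from a $0.05 baseline to a $0.035 route
// saves $0.015, i.e. 30% — inside the 20-35% illustration above.
const row = savings({ baselineUsd: 0.05, selectedUsd: 0.035 });
```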

Up and Running in Minutes

Change one line. Get access to every major AI model with built-in reliability.

Point Your SDK

Swap your base URL to TheRouter. Works with any OpenAI-compatible client — Cursor, Claude Code, LangChain, your own app.

We Apply Your Policy

TheRouter applies provider health, fallback rules, and your routing objective. If the first route fails, it moves to the next approved path automatically.

You Get Results

Same response format you already use. Plus usage tracking, cost visibility, and team controls — with zero extra code.

Diagram: Your App → TheRouter.ai (Policy + Failover) → Anthropic / OpenAI / Google

Featured Models

Access top-tier models from leading providers through a single unified API.

View all models
Claude Sonnet 4.6 (Anthropic, 1,000,000 ctx)
Input: $3.60 / 1M tokens
Output: $18.00 / 1M tokens
Capabilities: text, image, pdf

GPT-4o (OpenAI, 128,000 ctx)
Input: $3.00 / 1M tokens
Output: $12.00 / 1M tokens
Capabilities: text, image

Gemini 2.5 Pro (Google, 1,048,576 ctx)
Input: $1.50 / 1M tokens
Output: $12.00 / 1M tokens
Capabilities: text, image, pdf
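The listed rates make per-request cost easy to estimate. A small sketch using the prices above (USD per 1M tokens; model IDs follow the provider/model convention used elsewhere on this page):

```typescript
// Prices from the model cards above, in USD per 1M tokens.
const pricing: Record<string, { input: number; output: number }> = {
  "anthropic/claude-sonnet-4.6": { input: 3.6, output: 18.0 },
  "openai/gpt-4o": { input: 3.0, output: 12.0 },
  "google/gemini-2.5-pro": { input: 1.5, output: 12.0 },
};

function costUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const p = pricing[model];
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// The same 2,000-in / 500-out request on two models:
const sonnet = costUsd("anthropic/claude-sonnet-4.6", 2_000, 500); // $0.0162
const gemini = costUsd("google/gemini-2.5-pro", 2_000, 500);       // $0.0090
```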

Start building in 3 lines of code

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.therouter.ai/v1",
  apiKey: "sk-your-key",
});

const response = await client.chat.completions.create({
  model: "anthropic/claude-sonnet-4.6",
  messages: [{ role: "user", content: "Hello!" }],
});