TheRouter v2.0: Unified Async Media Platform for Image, Audio, and Video
TheRouter v2.0 introduces a unified async paradigm across image, audio, and video generation. Long-running upstream calls move off the request thread and into a job-based flow with S3-backed artifacts and a single /v1/jobs/:id polling surface.

Image, audio, and video models that take longer than CDN edge timeouts can now run through a job-based flow: one consistent surface across three modalities, with no behavior change for existing sync clients.
What changed
Add ?async=true to a media request and TheRouter responds with HTTP 202
plus a polling URL:
# Quote the URL so the shell does not treat "?" as a glob character.
curl -X POST "https://api.therouter.ai/v1/images/generations?async=true" \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-image-2","prompt":"a sunset over Kyoto"}'
# 202 Accepted
# {
#   "id": "img_01J...",
#   "polling_url": "https://api.therouter.ai/v1/jobs/img_01J...",
#   "status": "queued",
#   "expires_at": "2026-05-15T12:00:00Z",
#   "model": "openai/gpt-image-2"
# }
Poll /v1/jobs/:id until status: succeeded, then read unsigned_urls[]
for the S3-hosted artifacts. The same flow works for /v1/videos,
/v1/audio/speech, and /v1/audio/transcriptions.
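The polling step can be sketched as a small client loop. This is a minimal sketch, not an official SDK: the function names, the polling interval, the overall timeout, and the "failed" terminal status are assumptions for illustration. Only polling_url, status: succeeded, and unsigned_urls[] come from the flow described above.

```python
import json
import time
import urllib.request
from typing import Callable

# "succeeded" comes from the post; "failed" is an assumed terminal status name.
TERMINAL_STATUSES = {"succeeded", "failed"}


def fetch_job(polling_url: str, api_key: str) -> dict:
    """GET the polling URL and decode the JSON job document."""
    req = urllib.request.Request(
        polling_url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(
    polling_url: str,
    fetch: Callable[[str], dict],
    interval_s: float = 2.0,      # assumed polling interval
    max_wait_s: float = 600.0,    # assumed client-side give-up time
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or max_wait_s elapses."""
    waited = 0.0
    while True:
        job = fetch(polling_url)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if waited >= max_wait_s:
            raise TimeoutError(
                f"job still {job.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval_s)
        waited += interval_s
```

A caller would pass the polling_url from the 202 response, for example wait_for_job(accepted["polling_url"], lambda url: fetch_job(url, api_key)), and read job["unsigned_urls"] once the status is succeeded. Giving up on the client side does not cancel the job, so the same URL can be polled again later.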
Why this matters
The trigger was a real billing incident: a long-running image generation hit CloudFront's 524 timeout while still consuming upstream credit, leaving customers charged for a result they never received. The async flow severs the customer's HTTP connection from the upstream lifecycle entirely.
Three explicit guarantees in the new billing model:
- Reserve at 202: credits are held, not charged, the moment the job is accepted.
- Settle on upstream completion: actual cost is recorded only when the provider returns success.
- Refund only on upstream failure: if the provider fails, the reservation is released. Customer-side disconnects, network blips, and browser closes do not trigger a refund, because the upstream call keeps running and still incurs real cost.
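The three guarantees above can be read as a small state machine. The sketch below is illustrative only and is not TheRouter's billing code: the class and function names are hypothetical, and returning the unused part of the hold at settlement is an assumption about how the reserved estimate relates to the actual cost.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class JobState(Enum):
    RESERVED = "reserved"
    SETTLED = "settled"
    REFUNDED = "refunded"


@dataclass
class Account:
    balance: Decimal             # credits available to spend
    held: Decimal = Decimal(0)   # credits reserved for in-flight jobs


@dataclass
class JobBilling:
    reserved: Decimal
    state: JobState = JobState.RESERVED


def _require_reserved(job: JobBilling) -> None:
    # Settle and refund are each allowed once, and only from RESERVED.
    if job.state is not JobState.RESERVED:
        raise ValueError(f"job already {job.state.value}")


def reserve(account: Account, estimate: Decimal) -> JobBilling:
    """At 202: move the estimate from balance to held, without charging."""
    if estimate > account.balance:
        raise ValueError("insufficient credits")
    account.balance -= estimate
    account.held += estimate
    return JobBilling(reserved=estimate)


def on_upstream_success(account: Account, job: JobBilling, actual_cost: Decimal) -> None:
    """Provider returned success: charge the actual cost, release the hold."""
    _require_reserved(job)
    account.held -= job.reserved
    account.balance += job.reserved - actual_cost  # assumed: unused hold is returned
    job.state = JobState.SETTLED


def on_upstream_failure(account: Account, job: JobBilling) -> None:
    """Provider failed: release the full reservation."""
    _require_reserved(job)
    account.held -= job.reserved
    account.balance += job.reserved
    job.state = JobState.REFUNDED


def on_client_disconnect(account: Account, job: JobBilling) -> None:
    """Deliberately a no-op: the upstream call is still running and billable."""
    return None
```

The design point is that only upstream events move money. A dropped connection leaves the job in RESERVED until the provider reports success or failure.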
Cost and storage controls
Artifacts auto-expire after 30 days in S3. Per-tenant storage quotas
(driven by tenants.tier) and cross-modality concurrency caps prevent
runaway usage from a single account. The v1.x sync surface is unchanged
and continues to serve clients that don't pass ?async=true.
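As a rough illustration of how these controls could combine at job admission, the sketch below checks a per-tier storage quota and a concurrency cap shared across modalities, and computes the 30-day artifact expiry. The tier names, limit values, rejection reasons, and function names are all hypothetical. Only the 30-day expiry, the tier-driven quota, and the cross-modality cap come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple

ARTIFACT_TTL = timedelta(days=30)  # from the post: artifacts expire after 30 days


@dataclass(frozen=True)
class TierLimits:
    storage_bytes: int
    max_concurrent_jobs: int


# Hypothetical tiers and limits; real values would be keyed off tenants.tier.
TIER_LIMITS: Dict[str, TierLimits] = {
    "free": TierLimits(storage_bytes=1 * 1024**3, max_concurrent_jobs=2),
    "pro": TierLimits(storage_bytes=100 * 1024**3, max_concurrent_jobs=20),
}


@dataclass
class TenantUsage:
    tier: str
    stored_bytes: int
    active_jobs: Dict[str, int]  # modality -> in-flight job count


def artifact_expiry(created_at: datetime) -> datetime:
    """Return the time at which an artifact created at created_at expires."""
    return created_at + ARTIFACT_TTL


def admit(usage: TenantUsage) -> Tuple[bool, str]:
    """Decide whether a new async job may start for this tenant."""
    limits = TIER_LIMITS[usage.tier]
    if usage.stored_bytes >= limits.storage_bytes:
        return False, "storage_quota_exceeded"
    # The cap is cross-modality: image, audio, and video jobs share one budget.
    if sum(usage.active_jobs.values()) >= limits.max_concurrent_jobs:
        return False, "concurrency_cap_reached"
    return True, "ok"
```

Summing active jobs across modalities means a tenant cannot get around the cap by spreading work over image, audio, and video endpoints.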
Sora 2 and webhooks
POST /v1/videos is re-enabled. Sora 2 is wired behind a feature flag,
pending OpenAI organization verification. Webhook delivery for job
completion is the next priority — currently job status is poll-only.
Try it
- Guide: Async Media
- Reference: Jobs API
- Reference: Audio API