April 24, 2026 · New Models

DeepSeek V4 Now Available on TheRouter — Direct API Integration

DeepSeek released V4 Flash and V4 Pro today — their most powerful open-source models to date. Both are already live on TheRouter with day-one support.

V4 Flash — Best Value for Everyday Tasks

284B MoE, 13B active: a Mixture of Experts architecture with only 13B of its 284B parameters active per forward pass, keeping inference fast and cost low.
1M context, 384K max output: process entire codebases or long documents in a single request.
Thinking mode on by default: built-in chain-of-thought reasoning for better accuracy.
$0.14 / $0.28 per MTok (input/output): among the most affordable reasoning models available.

V4 Pro — Complex Reasoning Powerhouse

1.6T MoE, 49B active: the largest open-source MoE model, with performance approaching that of Claude Opus 4.6 in non-thinking mode.
1M context, 384K max output: the same context and output limits as V4 Flash.
$1.74 / $3.48 per MTok (input/output): competitive pricing for a model at this capability level.
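
To make the shared limits concrete, here is a minimal sketch of a pre-flight check you could run before sending a large request. It rests on assumptions this post does not confirm: that "1M" and "384K" are decimal token counts, and that prompt and output tokens share one context window. check_budget is a hypothetical helper, not part of any TheRouter SDK.

```python
# Limits quoted above, shared by V4 Flash and V4 Pro.
# Assumption: "1M" and "384K" are read as decimal token counts.
CONTEXT_WINDOW = 1_000_000
MAX_OUTPUT = 384_000


def check_budget(prompt_tokens: int, max_tokens: int) -> list[str]:
    """Return a list of problems with a request; an empty list means it fits."""
    problems = []
    if max_tokens > MAX_OUTPUT:
        problems.append(f"max_tokens {max_tokens} exceeds the {MAX_OUTPUT} output cap")
    # Assumption: prompt and completion tokens share one context window.
    if prompt_tokens + max_tokens > CONTEXT_WINDOW:
        problems.append(
            f"prompt + output = {prompt_tokens + max_tokens} exceeds the "
            f"{CONTEXT_WINDOW} context window"
        )
    return problems
```

Under these assumptions, a 700K-token prompt leaves room for up to 300K output tokens, so asking for the full 384K output alongside it would be flagged.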

Benchmarks

Benchmark	V4 Pro	V4 Flash	Claude Opus 4.6
SWE-bench Verified	80.6%	79.0%	80.8%
LiveCodeBench	93.5	—	—
Codeforces Rating	3206	—	—

V4 Pro leads on LiveCodeBench (93.5) and achieves the highest Codeforces rating (3206) among all models. On SWE-bench Verified, it scores 80.6%, within 0.2 percentage points of Claude Opus 4.6 (80.8%), and V4 Flash is close behind at 79.0%.

Architecture

Hybrid Attention Architecture — combines efficient attention mechanisms for handling both short and ultra-long sequences.
Engram conditional memory — enables efficient processing of 1M context windows without proportional compute scaling.
MoE with low active params — keeps inference costs dramatically lower than dense models of equivalent total parameter count.
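
As a rough illustration of the last point, the sketch below computes what share of each model's parameters is active per forward pass, using only the parameter counts quoted in this post. Treat the ratio as a loose proxy for per-token compute: it ignores attention cost and the memory needed to hold every expert.

```python
# Parameter counts quoted in this post, in billions.
MODELS = {
    "V4 Flash": {"total_b": 284, "active_b": 13},
    "V4 Pro": {"total_b": 1600, "active_b": 49},  # 1.6T = 1600B
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of parameters used in one forward pass."""
    return active_b / total_b


for name, p in MODELS.items():
    print(f"{name}: {active_fraction(**p):.1%} of parameters active")
# V4 Flash: 4.6% of parameters active
# V4 Pro: 3.1% of parameters active
```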

Pricing

Model	Input	Output	Context
V4 Flash	$0.14/MTok	$0.28/MTok	1M
V4 Pro	$1.74/MTok	$3.48/MTok	1M

V4 Flash is one of the most cost-effective reasoning models available. V4 Pro offers frontier-level coding at a fraction of closed-source pricing.
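
To compare the two models on your own workload, a minimal cost estimate might look like the sketch below. The prices are copied from the table above; estimate_cost is a hypothetical helper, and it does not cover anything this post leaves unspecified, such as how thinking tokens are billed or whether any discounts apply.

```python
# USD per million tokens (MTok), copied from the pricing table above.
PRICES = {
    "deepseek/deepseek-v4-flash": (0.14, 0.28),
    "deepseek/deepseek-v4-pro": (1.74, 3.48),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at list prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload: 200K tokens in, 20K tokens out.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 200_000, 20_000):.4f}")
# deepseek/deepseek-v4-flash: $0.0336
# deepseek/deepseek-v4-pro: $0.4176
```

At these list prices V4 Pro costs roughly 12.4 times as much as V4 Flash for the same token counts, so it can make sense to send routine requests to Flash and reserve Pro for the hardest tasks.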

How to Use It

Use the standard model names — TheRouter handles routing automatically:

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THE_ROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Explain the MoE architecture"}],
    "max_tokens": 4096
  }'

For V4 Pro, use deepseek/deepseek-v4-pro. Both models are served from the global endpoint (api.therouter.ai). The legacy China endpoint has been retired, so new integrations should use the global endpoint.
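
If you would rather call the API from Python, the curl command above translates fairly directly. The sketch below uses only the standard library and mirrors the same URL, headers, and JSON body. build_request and send are hypothetical helper names, and the response is returned as parsed JSON without assuming anything about its shape.

```python
import json
import os
import urllib.request

API_URL = "https://api.therouter.ai/v1/chat/completions"


def build_request(model: str, prompt: str, api_key: str, max_tokens: int = 4096) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Usage (makes a network call and needs THE_ROUTER_API_KEY set):
#   req = build_request("deepseek/deepseek-v4-pro", "Explain the MoE architecture",
#                       os.environ["THE_ROUTER_API_KEY"])
#   print(send(req))
```

Keeping request construction separate from sending means you can inspect or log the payload before anything goes over the network.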

Open Source

Both V4 Flash and V4 Pro are released under the Apache 2.0 license, with full model weights available on Hugging Face. You can self-host, fine-tune, or use them commercially, subject to the license's attribution and notice requirements.

Getting Started

Already on TheRouter? Just set the model to deepseek/deepseek-v4-flash or deepseek/deepseek-v4-pro — no other changes needed.


Questions? Reach out on GitHub.