DeepSeek V3.2
API guide
Chat completion
Standard chat through TheRouter's OpenAI-compatible surface. TheRouter normalises tool-calling and response_format on top of the underlying provider, so your client code stays portable across DeepSeek, Anthropic, and OpenAI.
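For Python clients, the request in this section can be assembled with the standard library alone. The following is a minimal sketch, not an official client: the helper name `build_chat_request` is hypothetical, and the endpoint and model id are the ones used on this page.

```python
import json
import urllib.request

API_URL = "https://api.therouter.ai/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str,
                       model: str = "deepseek/deepseek-v3.2") -> urllib.request.Request:
    """Build (but do not send) a chat completion request. Hypothetical helper."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key for illustration; read the real one from THEROUTER_API_KEY.
req = build_chat_request("Prove that the sum of two odd integers is even.", "test-key")
print(req.get_method(), req.full_url)
```

To send it, pass the request to `urllib.request.urlopen` and decode the JSON body. Assuming the response follows the OpenAI shape, the reply text would sit under `choices[0].message.content`.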
curl https://api.therouter.ai/v1/chat/completions \
-H "Authorization: Bearer $THEROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v3.2",
"messages": [{"role": "user", "content": "Prove that the sum of two odd integers is even."}]
}'
Streaming
Stream tokens for chat UIs. DeepSeek Sparse Attention (DSA) does not affect streaming semantics; first-token latency is competitive with V3.1-Terminus and substantially lower than dense-attention baselines on long prompts.
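On the client side, a streamed reply has to be reassembled from individual chunks. The sketch below assumes TheRouter emits OpenAI-style server-sent events, that is, `data: {...}` lines terminated by `data: [DONE]`. That framing is an assumption here, not something this page states. The function name `iter_content_deltas` and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Mixture"}}]}',
    'data: {"choices":[{"delta":{"content":" of experts"}}]}',
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample)))
```

In a real client, `lines` would be the decoded lines of the HTTP response body, and each fragment would be appended to the UI as it arrives rather than joined at the end.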
curl https://api.therouter.ai/v1/chat/completions \
-H "Authorization: Bearer $THEROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v3.2",
"stream": true,
"messages": [{"role": "user", "content": "Explain MoE load balancing in 200 words."}]
}'
Tool use
DeepSeek V3.2 supports OpenAI-shape function calling and tool_choice. Use it as the workhorse model behind tool-using agents where flagship pricing isn't justified.
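Once the model decides to call a tool, the client has to run it and send the result back. The following is a rough sketch of that step, assuming the assistant message carries OpenAI-shape `tool_calls` with JSON-encoded `arguments`. The local `lookup_region` implementation, the `run_tool_calls` helper, and the sample message are all hypothetical.

```python
import json
from typing import Any, Callable, Dict, List


def lookup_region(code: str) -> str:
    """Hypothetical local implementation of the tool declared in the request."""
    names = {"ap-northeast-1": "Asia Pacific (Tokyo)"}
    return names.get(code, "unknown")


TOOLS: Dict[str, Callable[..., Any]] = {"lookup_region": lookup_region}


def run_tool_calls(message: Dict[str, Any],
                   tools: Dict[str, Callable[..., Any]]) -> List[Dict[str, Any]]:
    """Execute each requested tool and build the follow-up `tool` messages."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn.get("arguments") or "{}")
        handler = tools.get(fn["name"])
        if handler is None:
            content = json.dumps({"error": "unknown tool " + fn["name"]})
        else:
            content = json.dumps({"result": handler(**args)})
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results


# Hand-written assistant message in the assumed OpenAI shape.
sample_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "lookup_region",
                     "arguments": "{\"code\": \"ap-northeast-1\"}"},
    }],
}
tool_messages = run_tool_calls(sample_message, TOOLS)
print(tool_messages[0]["content"])
```

The returned messages would be appended to the conversation, after the assistant message that requested them, and sent in a second request so the model can produce its final answer.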
curl https://api.therouter.ai/v1/chat/completions \
-H "Authorization: Bearer $THEROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v3.2",
"messages": [{"role": "user", "content": "What is the AWS S3 region for ap-northeast-1?"}],
"tools": [{
"type": "function",
"function": {
"name": "lookup_region",
"parameters": {"type": "object", "properties": {"code": {"type": "string"}}}
}
}]
}'
Structured JSON
Constrain output to JSON via response_format. For high-stakes extraction, also include a schema in the system prompt; V3.2 follows explicit shape contracts reliably.
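Even with `response_format` set, it is prudent to validate the reply before acting on it. This is a minimal sketch that checks the `{"action": string, "args": object}` contract used in this section's system prompt. The `parse_action` helper and the sample output are hypothetical.

```python
import json
from typing import Any, Dict


def parse_action(content: str) -> Dict[str, Any]:
    """Parse model output and enforce the {"action": str, "args": object} shape."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    if not isinstance(data.get("action"), str):
        raise ValueError('"action" must be a string')
    if not isinstance(data.get("args"), dict):
        raise ValueError('"args" must be an object')
    return data


# Hand-written example of what a conforming reply might look like.
sample_output = '{"action": "restart_container", "args": {"host": "web-03", "container": "nginx"}}'
parsed = parse_action(sample_output)
print(parsed["action"], parsed["args"])
```

A caller would typically catch `ValueError` and either retry the request or reject the output, depending on how costly a wrong action is.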
curl https://api.therouter.ai/v1/chat/completions \
-H "Authorization: Bearer $THEROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v3.2",
"response_format": {"type": "json_object"},
"messages": [
{"role": "system", "content": "Return JSON: {\"action\":string,\"args\":object}"},
{"role": "user", "content": "Restart the nginx container on host web-03."}
]
}'
Self-host
V3.2 weights are downloadable under the DeepSeek Model License (commercial use permitted). Reference inference recipes ship for SGLang, vLLM, LMDeploy, TensorRT-LLM, LightLLM, and DeepSeek-Infer in FP8 and BF16. SGLang is also supported on AMD GPUs; Huawei Ascend NPUs work via MindIE. Below is the canonical SGLang launch command from the model card.
# Reference SGLang launch (FP8). See HF model card for exact tensor-parallel sizing.
python -m sglang.launch_server \
--model deepseek-ai/DeepSeek-V3.2 \
--tp 8 \
--trust-remote-code \
--port 30000
# Then call it through TheRouter as model="deepseek/deepseek-v3.2"
# with TheRouter configured to use your self-hosted endpoint as the upstream.
Fact ledger: every claim on this page traces here
| Claim | Source | Retrieved | Status |
|---|---|---|---|
| Release date (V3.2 full) | arxiv.org | 2026-05-22 | verified |
| V3.2-Exp release | api-docs.deepseek.com | 2026-05-22 | verified |
| Architecture | arxiv.org | 2026-05-22 | verified |
| Pretraining tokens (V3 backbone) | github.com | 2026-05-22 | verified |
| Training cutoff | - | - | unknown |
| License (code) | github.com | 2026-05-22 | verified |
| License (weights) | github.com | 2026-05-22 | verified |
| Supported inference backends | github.com | 2026-05-22 | verified |
| Successor | - | - | verified |
| MMLU (EM, Chat) | github.com | 2026-05-22 | to verify |
| HumanEval-Mul (Pass@1, Chat) | github.com | 2026-05-22 | to verify |
| MATH-500 (EM, Chat) | github.com | 2026-05-22 | to verify |
| GSM8K (8-shot EM, Base) | github.com | 2026-05-22 | to verify |
| GPQA-Diamond (Pass@1, Chat) | github.com | 2026-05-22 | to verify |
| LiveCodeBench (Pass@1-COT) | github.com | 2026-05-22 | to verify |
| AIME 2024 (Pass@1) | github.com | 2026-05-22 | to verify |
| GPT-5 comparison (qualitative) | arxiv.org | 2026-05-22 | to verify |
| DeepSeek publishes V3.2 technical report: same DSA architecture as V3.2-Exp; scaled post-training puts it on par with GPT-5 | arxiv.org/abs/2512.02556 | 2026-05-22 | verified |
| DeepSeek launches V3.2-Exp: debuts Sparse Attention (DSA); API price drops over 50% | api-docs.deepseek.com | 2026-05-22 | verified |
| What's the difference between V3.2 and V3.2-Exp? | arxiv.org | 2026-05-22 | to verify |
| What is DeepSeek Sparse Attention (DSA)? | arxiv.org | 2026-05-22 | to verify |
| Can I self-host DeepSeek V3.2? | github.com | 2026-05-22 | to verify |