DeepSeek V3.2

deepseek/deepseek-v3.2

API guide

Chat completion

Standard chat through TheRouter's OpenAI-compatible API. TheRouter normalises tool calling and response_format on top of the underlying provider, so the same client code stays portable across DeepSeek, Anthropic, and OpenAI models.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "Prove that the sum of two odd integers is even."}]
  }'
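
The same request can be issued from Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_chat_request, send, and first_reply are hypothetical, and the response path choices[0].message.content is assumed from the OpenAI response shape that the endpoint is described as compatible with.

```python
import json
import urllib.request

# Endpoint and model ID are taken from the cURL example above.
API_URL = "https://api.therouter.ai/v1/chat/completions"
MODEL = "deepseek/deepseek-v3.2"


def build_chat_request(prompt, api_key, url=API_URL):
    """Build (but do not send) the POST request for a single-turn chat."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def first_reply(response_body):
    """Extract the assistant text, assuming the OpenAI response shape."""
    return response_body["choices"][0]["message"]["content"]
```

To run it, read the key from the THEROUTER_API_KEY environment variable, pass the result of build_chat_request to send, and hand the decoded body to first_reply.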

Streaming

Stream tokens for chat UIs. DeepSeek Sparse Attention (DSA) does not change streaming semantics. First-token latency is competitive with V3.1-Terminus and substantially lower than dense-attention baselines on long prompts.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "stream": true,
    "messages": [{"role": "user", "content": "Explain MoE load balancing in 200 words."}]
  }'
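
On the client side, a streamed response has to be reassembled from chunks. The sketch below assumes the stream uses the OpenAI-style server-sent-events framing (lines beginning with "data: ", each carrying a JSON chunk with choices[].delta.content, terminated by "data: [DONE]"). That framing is an assumption based on the OpenAI-compatible surface, and iter_deltas is a hypothetical helper name.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and ":" comment/keep-alive lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the assumed framing
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # The first chunk often carries only the role, so content may be absent.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Feed it the decoded lines of the HTTP response body and append each fragment to the UI as it arrives.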

Tool use

DeepSeek V3.2 supports OpenAI-shape function calling and tool_choice. Use it as the workhorse model behind tool-using agents where flagship pricing isn't justified.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "What is the AWS S3 region for ap-northeast-1?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "lookup_region",
        "description": "Look up an AWS region by its region code.",
        "parameters": {"type": "object", "properties": {"code": {"type": "string"}}, "required": ["code"]}
      }
    }]
  }'
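
When the model decides to call the tool, the client must run it and send the result back. The sketch below assumes the OpenAI-shape reply, where the assistant message carries a tool_calls list and each call's function.arguments is a JSON-encoded string. The local lookup_region implementation, its one-entry table, and the run_tool_calls helper are hypothetical and for illustration only.

```python
import json


def lookup_region(code):
    """Hypothetical local tool; the table is illustrative, not complete."""
    regions = {"ap-northeast-1": "Asia Pacific (Tokyo)"}
    return {"code": code, "name": regions.get(code, "unknown")}


# Maps tool names advertised in the request to local callables.
TOOLS = {"lookup_region": lookup_region}


def run_tool_calls(assistant_message):
    """Execute each requested tool and return the role=tool reply messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON string, not as a parsed object.
        args = json.loads(call["function"]["arguments"])
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return replies
```

Append the assistant message and the returned tool messages to the conversation, then call the endpoint again so the model can produce its final answer.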

Structured JSON

Constrain output to JSON via response_format. The json_object type carries no schema, so for high-stakes extraction also state the expected shape in the system prompt; V3.2 follows explicit shape contracts reliably.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "response_format": {"type": "json_object"},
    "messages": [
      {"role": "system", "content": "Return JSON: {\"action\":string,\"args\":object}"},
      {"role": "user", "content": "Restart the nginx container on host web-03."}
    ]
  }'
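
Because the request above states the shape only in the prompt, it is worth checking the reply before acting on it. This is a minimal validation sketch for the {"action": string, "args": object} contract from the system prompt; parse_action is a hypothetical helper, and the content string is assumed to come from the first choice's message.

```python
import json


def parse_action(content):
    """Parse the model's JSON reply and enforce the action/args contract."""
    obj = json.loads(content)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(obj, dict):
        raise ValueError("top-level value must be a JSON object")
    if not isinstance(obj.get("action"), str):
        raise ValueError("'action' must be a string")
    if not isinstance(obj.get("args"), dict):
        raise ValueError("'args' must be an object")
    return obj["action"], obj["args"]
```

Catching ValueError at the call site covers both malformed JSON and a well-formed reply with the wrong shape, which is a convenient place to retry or escalate.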

Self-host

V3.2 weights are downloadable under the DeepSeek Model License (commercial use permitted). Reference inference recipes ship for SGLang, vLLM, LMDeploy, TensorRT-LLM, LightLLM, and DeepSeek-Infer in FP8 and BF16. SGLang is also supported on AMD GPUs; Huawei Ascend NPUs work via MindIE. Below is the canonical SGLang launch command from the model card.

Shell

# Reference SGLang launch (FP8). See HF model card for exact tensor-parallel sizing.
python -m sglang.launch_server \
  --model deepseek-ai/DeepSeek-V3.2 \
  --tp 8 \
  --trust-remote-code \
  --port 30000

# Then call it through TheRouter as model="deepseek/deepseek-v3.2"
# with TheRouter configured to use your self-hosted endpoint as the upstream.

Fact ledger — every claim on this page traces here

claim	source URL	retrieved	status
Release date (V3.2 full)	arxiv.org ↗	2026-05-22	verified
V3.2-Exp release	api-docs.deepseek.com ↗	2026-05-22	verified
Architecture	arxiv.org ↗	2026-05-22	verified
Pretraining tokens (V3 backbone)	github.com ↗	2026-05-22	verified
Training cutoff	—	—	unknown
License — code	github.com ↗	2026-05-22	verified
License — weights	github.com ↗	2026-05-22	verified
Supported inference backends	github.com ↗	2026-05-22	verified
Successor	—	—	verified
MMLU (EM, Chat)	github.com ↗	2026-05-22	to verify
HumanEval-Mul (Pass@1, Chat)	github.com ↗	2026-05-22	to verify
MATH-500 (EM, Chat)	github.com ↗	2026-05-22	to verify
GSM8K (8-shot EM, Base)	github.com ↗	2026-05-22	to verify
GPQA-Diamond (Pass@1, Chat)	github.com ↗	2026-05-22	to verify
LiveCodeBench (Pass@1-COT)	github.com ↗	2026-05-22	to verify
AIME 2024 (Pass@1)	github.com ↗	2026-05-22	to verify
GPT-5 comparison (qualitative)	arxiv.org ↗	2026-05-22	to verify
DeepSeek publishes V3.2 technical report — same DSA architecture as V3.2-Exp, scaled post-training puts it on par with GPT-5	arxiv.org/abs/2512.02556 ↗	2026-05-22	verified
DeepSeek launches V3.2-Exp — debuts Sparse Attention (DSA), API price drops over 50%	api-docs.deepseek.com ↗	2026-05-22	verified
What's the difference between V3.2 and V3.2-Exp?	arxiv.org ↗	2026-05-22	to verify
What is DeepSeek Sparse Attention (DSA)?	arxiv.org ↗	2026-05-22	to verify
Can I self-host DeepSeek V3.2?	github.com ↗	2026-05-22	to verify