DeepSeek V3.2

deepseek · deepseek/deepseek-v3.2

API guide

Chat completion

Standard chat through TheRouter's OpenAI-compatible surface. TheRouter normalises tool-calling and response_format on top of the underlying provider, so your client code stays portable across DeepSeek, Anthropic, and OpenAI.

cURL
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "Prove that the sum of two odd integers is even."}]
  }'
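
For clients that are not shell scripts, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names (build_chat_request, send, extract_text) are hypothetical, and the response is assumed to follow the OpenAI chat-completion shape that the page describes as the compatible surface.

```python
import json
import os
import urllib.request

# Endpoint and model ID are taken from the cURL example above.
API_URL = "https://api.therouter.ai/v1/chat/completions"
MODEL = "deepseek/deepseek-v3.2"


def build_chat_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('THEROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def extract_text(response: dict) -> str:
    """Return the assistant text, assuming the OpenAI response shape."""
    return response["choices"][0]["message"]["content"]
```

A call would look like extract_text(send(build_chat_request("Prove that the sum of two odd integers is even."))). Keeping request construction separate from sending makes the payload easy to inspect or unit-test without network access.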

Streaming

Stream tokens for chat UIs. DeepSeek Sparse Attention (DSA) does not affect streaming semantics. First-token latency is competitive with V3.1-Terminus and substantially lower than dense-attention baselines on long prompts.

cURL
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "stream": true,
    "messages": [{"role": "user", "content": "Explain MoE load balancing in 200 words."}]
  }'
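
With "stream": true, an OpenAI-compatible endpoint normally answers with server-sent events: one "data: {...}" line per chunk and a final "data: [DONE]" sentinel. The sketch below assumes TheRouter follows that convention (the page says OpenAI-compatible but does not spell out the wire format), and iter_text_deltas is a hypothetical helper name. It consumes any iterable of decoded lines, so it can be fed from an HTTP response or from test data.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The first chunk usually carries only the role, so content may be absent.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Illustrative lines in the assumed format (not captured from a real response).
sample_lines = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Mixture"}}]}',
    'data: {"choices":[{"delta":{"content":" of experts"}}]}',
    "data: [DONE]",
]
streamed_text = "".join(iter_text_deltas(sample_lines))
```

In a chat UI, each yielded fragment would be appended to the visible message as it arrives rather than joined at the end.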

Tool use

DeepSeek V3.2 supports OpenAI-style function calling and the tool_choice parameter. Use it as the workhorse model behind tool-using agents where flagship pricing isn't justified.

cURL
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "What is the AWS S3 region for ap-northeast-1?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "lookup_region",
        "description": "Look up the AWS region name for a region code.",
        "parameters": {"type": "object", "properties": {"code": {"type": "string"}}, "required": ["code"]}
      }
    }]
  }'
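
The request above only declares the tool. When the model decides to call it, the client has to run the function and send the result back in a follow-up request. The sketch below shows that second half, assuming the OpenAI message shape (tool_calls on the assistant message, then a "tool" role message keyed by tool_call_id). The local lookup_region implementation, the REGION_NAMES table, and the helper names are hypothetical stand-ins.

```python
import json

# Hypothetical local implementation of the lookup_region tool declared above.
REGION_NAMES = {"ap-northeast-1": "Asia Pacific (Tokyo)"}


def lookup_region(code: str) -> str:
    return REGION_NAMES.get(code, "unknown")


TOOLS = {"lookup_region": lookup_region}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool call and return the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as an object.
        args = json.loads(call["function"]["arguments"])
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": fn(**args)}
        )
    return results


def follow_up_messages(history: list, assistant_message: dict) -> list:
    """Messages for the next request: history, the assistant turn, tool results."""
    return [*history, assistant_message, *run_tool_calls(assistant_message)]
```

The list returned by follow_up_messages goes into the "messages" field of the next request, with the same "tools" array, so the model can turn the tool result into a final answer.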

Structured JSON

Constrain output to JSON via response_format. For high-stakes extraction, also include a schema in the system prompt; V3.2 follows explicit shape contracts reliably.

cURL
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "response_format": {"type": "json_object"},
    "messages": [
      {"role": "system", "content": "Return JSON: {\"action\":string,\"args\":object}"},
      {"role": "user", "content": "Restart the nginx container on host web-03."}
    ]
  }'
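
Even with response_format set, it is prudent to check the returned object against the shape promised in the system prompt before acting on it. The sketch below validates the {"action": string, "args": object} contract from the example. parse_action is a hypothetical helper, and the sample strings are illustrative rather than real model output.

```python
import json


def parse_action(content: str) -> dict:
    """Parse the message content and enforce {"action": str, "args": object}."""
    data = json.loads(content)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("action"), str):
        raise ValueError("'action' must be a string")
    if not isinstance(data.get("args"), dict):
        raise ValueError("'args' must be an object")
    return data


# Illustrative content string in the requested shape.
sample = '{"action": "restart_container", "args": {"host": "web-03", "name": "nginx"}}'
parsed = parse_action(sample)
```

Because every failure surfaces as ValueError, the caller can catch a single exception type and decide whether to retry the request or escalate.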

Self-host

V3.2 weights are downloadable under the DeepSeek Model License (commercial use permitted). Reference inference recipes ship for SGLang, vLLM, LMDeploy, TensorRT-LLM, LightLLM, and DeepSeek-Infer in FP8 and BF16. SGLang is also supported on AMD GPUs; Huawei Ascend NPUs work via MindIE. Below is the canonical SGLang launch command from the model card.

Shell
# Reference SGLang launch (FP8). See HF model card for exact tensor-parallel sizing.
python -m sglang.launch_server \
  --model deepseek-ai/DeepSeek-V3.2 \
  --tp 8 \
  --trust-remote-code \
  --port 30000

# Then call it through TheRouter as model="deepseek/deepseek-v3.2"
# with TheRouter configured to use your self-hosted endpoint as the upstream.
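
Before pointing TheRouter at the self-hosted server, a direct smoke test can confirm that it answers. SGLang exposes an OpenAI-compatible /v1/chat/completions route on the launch port. The sketch below only builds the request; local_chat_request is a hypothetical helper, localhost is an assumption about where the server runs, and the model value is assumed to match the --model argument above.

```python
import json
import urllib.request


def local_chat_request(
    prompt: str, base_url: str = "http://localhost:30000"
) -> urllib.request.Request:
    """Build a small chat request against the self-hosted SGLang server."""
    body = {
        "model": "deepseek-ai/DeepSeek-V3.2",  # assumed to match --model above
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 32,  # keep the smoke test cheap
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with urllib.request.urlopen(local_chat_request("ping"), timeout=120) should return a chat-completion JSON body once the weights have finished loading.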

Fact ledger: every claim on this page traces here

| Claim | Source URL | Retrieved | Status |
|---|---|---|---|
| Release date (V3.2 full) | arxiv.org | 2026-05-22 | verified |
| V3.2-Exp release | api-docs.deepseek.com | 2026-05-22 | verified |
| Architecture | arxiv.org | 2026-05-22 | verified |
| Pretraining tokens (V3 backbone) | github.com | 2026-05-22 | verified |
| Training cutoff | n/a | n/a | unknown |
| License (code) | github.com | 2026-05-22 | verified |
| License (weights) | github.com | 2026-05-22 | verified |
| Supported inference backends | github.com | 2026-05-22 | verified |
| Successor | n/a | n/a | verified |
| MMLU (EM, Chat) | github.com | 2026-05-22 | to verify |
| HumanEval-Mul (Pass@1, Chat) | github.com | 2026-05-22 | to verify |
| MATH-500 (EM, Chat) | github.com | 2026-05-22 | to verify |
| GSM8K (8-shot EM, Base) | github.com | 2026-05-22 | to verify |
| GPQA-Diamond (Pass@1, Chat) | github.com | 2026-05-22 | to verify |
| LiveCodeBench (Pass@1-COT) | github.com | 2026-05-22 | to verify |
| AIME 2024 (Pass@1) | github.com | 2026-05-22 | to verify |
| GPT-5 comparison (qualitative) | arxiv.org | 2026-05-22 | to verify |
| DeepSeek publishes V3.2 technical report: same DSA architecture as V3.2-Exp, scaled post-training puts it on par with GPT-5 | arxiv.org/abs/2512.02556 | 2026-05-22 | verified |
| DeepSeek launches V3.2-Exp: debuts Sparse Attention (DSA), API price drops over 50% | api-docs.deepseek.com | 2026-05-22 | verified |
| What's the difference between V3.2 and V3.2-Exp? | arxiv.org | 2026-05-22 | to verify |
| What is DeepSeek Sparse Attention (DSA)? | arxiv.org | 2026-05-22 | to verify |
| Can I self-host DeepSeek V3.2? | github.com | 2026-05-22 | to verify |