DeepSeek V3.2

Provider: deepseek | Model ID: deepseek/deepseek-v3.2

API guide

Chat completion

Standard chat through TheRouter's OpenAI-compatible surface. TheRouter normalizes tool calling and response_format on top of the provider, so client code stays portable between DeepSeek, Anthropic, and OpenAI.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "Prove that the sum of two odd integers is even."}]
  }'
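
The same request can be issued from Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and `send` assumes the response follows the OpenAI chat-completions shape (`choices[0].message.content`), which this guide implies but does not show.

```python
import json
import os
import urllib.request

API_URL = "https://api.therouter.ai/v1/chat/completions"  # endpoint from the cURL example


def build_chat_request(prompt, model="deepseek/deepseek-v3.2", api_key=None):
    """Build (but do not send) a chat completion request."""
    key = api_key or os.environ.get("THEROUTER_API_KEY", "")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the assistant text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("Prove that the sum of two odd integers is even."))` performs the network call. Keeping request construction separate from sending makes the payload easy to inspect or log first.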

Streaming

Streams tokens for chat UIs. DSA does not change streaming semantics: time to first token is comparable to V3.1-Terminus and substantially lower than that of dense-attention baselines on long prompts.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "stream": true,
    "messages": [{"role": "user", "content": "Explain MoE load balancing in 200 words."}]
  }'
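
With `"stream": true`, OpenAI-compatible endpoints typically reply with server-sent events: lines of the form `data: {json}`, closed by `data: [DONE]`. This guide does not show the wire format, so the sketch below assumes that convention and assumes each chunk carries its text in `choices[0].delta.content`. `iter_stream_text` is a hypothetical helper name, and the sample lines are hand-written rather than captured from the API.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample stream in the assumed format.
sample = [
    b'data: {"choices":[{"delta":{"role":"assistant"}}]}\n',
    b"\n",
    b'data: {"choices":[{"delta":{"content":"Mixture"}}]}\n',
    b'data: {"choices":[{"delta":{"content":" of experts"}}]}\n',
    b"data: [DONE]\n",
]
print("".join(iter_stream_text(sample)))  # prints "Mixture of experts"
```

The response object returned by `urllib.request.urlopen` can be iterated line by line, so it can be passed to `iter_stream_text` directly.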

Tool use

DeepSeek V3.2 supports function calling and tool_choice in the OpenAI format. Use it as a workhorse for tool-using agents when flagship pricing is not justified.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [{"role": "user", "content": "What is the AWS S3 region for ap-northeast-1?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "lookup_region",
        "parameters": {"type": "object", "properties": {"code": {"type": "string"}}}
      }
    }]
  }'
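
When the model decides to call the tool, an OpenAI-format response carries `tool_calls` on the assistant message, each with an `id`, a `function.name`, and `function.arguments` encoded as a JSON string. The sketch below shows one way to dispatch those calls and build the follow-up `tool` messages. The local `lookup_region` implementation, its lookup table, and the sample assistant message are hypothetical stand-ins written for illustration.

```python
import json

# Hypothetical local implementation of the tool declared in the request.
REGION_NAMES = {"ap-northeast-1": "Asia Pacific (Tokyo)"}


def lookup_region(code):
    return {"code": code, "name": REGION_NAMES.get(code, "unknown")}


TOOLS = {"lookup_region": lookup_region}


def run_tool_calls(assistant_message):
    """Execute each requested tool call and return the `tool` role messages
    to append to the conversation before the next request."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"] or "{}")
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results


# Hand-written example of an assistant turn in the OpenAI tool-calling format.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "lookup_region", "arguments": "{\"code\": \"ap-northeast-1\"}"},
    }],
}
print(run_tool_calls(assistant_message))
```

The next request would resend the original messages, followed by the assistant message and the returned `tool` messages, so the model can produce its final answer.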

Structured JSON

Use response_format to constrain the output to JSON. For critical extraction, also put the schema in the system prompt: V3.2 reliably follows an explicit shape contract.

cURL

curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "response_format": {"type": "json_object"},
    "messages": [
      {"role": "system", "content": "Return JSON: {\"action\":string,\"args\":object}"},
      {"role": "user", "content": "Restart the nginx container on host web-03."}
    ]
  }'
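
In OpenAI-style APIs, `json_object` mode generally guarantees syntactically valid JSON but not a particular shape, so it is prudent to check the contract on the client as well. The following sketch validates the `{"action": string, "args": object}` contract from the system prompt above. `parse_action` is a hypothetical helper, and the sample string is hand-written, not actual model output.

```python
import json


def parse_action(content):
    """Parse model output and enforce the {"action": str, "args": object} contract."""
    data = json.loads(content)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    if not isinstance(data.get("action"), str):
        raise ValueError('"action" must be a string')
    if not isinstance(data.get("args"), dict):
        raise ValueError('"args" must be an object')
    return data["action"], data["args"]


# Hand-written sample of what the model might return for the request above.
sample = '{"action": "restart_container", "args": {"host": "web-03", "container": "nginx"}}'
action, args = parse_action(sample)
print(action, args)
```

Because every failure surfaces as a `ValueError`, a caller can catch that one exception type and retry the request or fall back.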

Self-host

The V3.2 weights can be downloaded under the DeepSeek Model License (commercial use is permitted). Reference inference recipes exist for SGLang, vLLM, LMDeploy, TensorRT-LLM, LightLLM, and DeepSeek-Infer in FP8 and BF16. SGLang also runs on AMD GPUs; Huawei Ascend NPUs are supported through MindIE. Below is the canonical SGLang launch command from the model card.

Shell

# Reference SGLang launch (FP8). See HF model card for exact tensor-parallel sizing.
python -m sglang.launch_server \
  --model deepseek-ai/DeepSeek-V3.2 \
  --tp 8 \
  --trust-remote-code \
  --port 30000

# Then call it through TheRouter as model="deepseek/deepseek-v3.2"
# with TheRouter configured to use your self-hosted endpoint as the upstream.

Fact registry: every claimed value has a source

Fact	Source URL	Retrieved	Status
Release date (full V3.2)	arxiv.org	2026-05-22	verified
V3.2-Exp release	api-docs.deepseek.com	2026-05-22	verified
Architecture	arxiv.org	2026-05-22	verified
Pretraining tokens (V3 backbone)	github.com	2026-05-22	verified
Data cutoff date	-	-	unknown
License (code)	github.com	2026-05-22	verified
License (weights)	github.com	2026-05-22	verified
Supported inference backends	github.com	2026-05-22	verified
Successor	-	-	verified
MMLU (EM, Chat)	github.com	2026-05-22	to be verified
HumanEval-Mul (Pass@1, Chat)	github.com	2026-05-22	to be verified
MATH-500 (EM, Chat)	github.com	2026-05-22	to be verified
GSM8K (8-shot EM, Base)	github.com	2026-05-22	to be verified
GPQA-Diamond (Pass@1, Chat)	github.com	2026-05-22	to be verified
LiveCodeBench (Pass@1-COT)	github.com	2026-05-22	to be verified
AIME 2024 (Pass@1)	github.com	2026-05-22	to be verified
GPT-5 comparison (qualitative)	arxiv.org	2026-05-22	to be verified
DeepSeek publishes the V3.2 technical report: the same DSA architecture as V3.2-Exp, with scaled post-training bringing it to GPT-5 level	arxiv.org/abs/2512.02556	2026-05-22	verified
DeepSeek launches V3.2-Exp: sparse attention (DSA) debuts and the API price drops by more than 50%	api-docs.deepseek.com	2026-05-22	verified
How does V3.2 differ from V3.2-Exp?	arxiv.org	2026-05-22	to be verified
What is DeepSeek Sparse Attention (DSA)?	arxiv.org	2026-05-22	to be verified
Can DeepSeek V3.2 run on your own hardware?	github.com	2026-05-22	to be verified