
DeepSeek V3.2

deepseek | deepseek/deepseek-v3.2

Specifications

Release date (V3.2 full) | 2025-12-02 (arXiv tech report) | arxiv.org | verified
V3.2-Exp release | September 2025 (experimental, DSA debut) | api-docs.deepseek.com | verified
Architecture | 671B-total / 37B-active MoE; MLA (Multi-Head Latent Attention) + DSA (DeepSeek Sparse Attention); RoPE; multi-token prediction training objective | arxiv.org | verified
Pretraining tokens (V3 backbone) | 14.8 trillion | github.com | verified
Training cutoff | Not publicly disclosed | - | unknown
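The Architecture row gives two parameter counts, 671B total and 37B active. A minimal sketch of the arithmetic behind that split follows; the function name and constants are illustrative and come from no DeepSeek tooling, and the only inputs are the two figures quoted above.

```python
# Figures taken from the Architecture row above (in billions of parameters).
TOTAL_PARAMS_B = 671   # all experts plus shared weights
ACTIVE_PARAMS_B = 37   # parameters activated for any single token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used in one token's forward pass."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%}")  # 5.5%
```

Roughly one parameter in eighteen takes part in each token's computation, which is why per-token compute tracks the 37B figure while memory footprint tracks the 671B figure.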

Benchmarks

Benchmark | Score | Source
MMLU (EM, Chat) | 88.5 | github.com
  Note: From the V3 README. V3.2 benchmarks are reported as on par with V3.1-Terminus, so this V3 baseline carries over within noise.
HumanEval-Mul (Pass@1, Chat) | 82.6 | github.com
MATH-500 (EM, Chat) | 90.2 | github.com
GSM8K (8-shot EM, Base) | 89.3 | github.com
GPQA-Diamond (Pass@1, Chat) | 59.1 | github.com
LiveCodeBench (Pass@1-COT) | 40.5 | github.com
AIME 2024 (Pass@1) | 39.2 | github.com
GPT-5 comparison (qualitative) | Comparable to GPT-5; Speciale variant exceeds GPT-5 | arxiv.org
  Note: Per the V3.2 tech report, with scaled post-training compute V3.2 performs comparably to GPT-5; the high-compute Speciale variant surpasses GPT-5 on reasoning and earned gold-medal scores at IMO/IOI/ICPC/CMO 2025.
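Several rows above report Pass@1. For readers unfamiliar with the metric, here is a sketch of the commonly used unbiased pass@k estimator, averaged over problems. The page does not say how many samples per problem or which estimator produced the scores above, so the sample counts below are hypothetical and the function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("expected 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: four problems, four samples each, with 2, 4, 0, 1 correct.
hypothetical_results = [(4, 2), (4, 4), (4, 0), (4, 1)]
score = benchmark_pass_at_k(hypothetical_results, k=1)
print(f"{score:.2%}")  # 43.75%
```

For k = 1 the estimator reduces to the fraction of correct samples per problem, so Pass@1 can be read as the expected success rate of a single attempt.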
Fact ledger: every claim on this page traces here
Source | URL | Retrieved | Status
Release date (V3.2 full) | arxiv.org | 2026-05-22 | verified
V3.2-Exp release | api-docs.deepseek.com | 2026-05-22 | verified
Architecture | arxiv.org | 2026-05-22 | verified
Pretraining tokens (V3 backbone) | github.com | 2026-05-22 | verified
Training cutoff | - | - | unknown
License (code) | github.com | 2026-05-22 | verified
License (weights) | github.com | 2026-05-22 | verified
Supported inference backends | github.com | 2026-05-22 | verified
Successor | - | - | verified
MMLU (EM, Chat) | github.com | 2026-05-22 | to verify
HumanEval-Mul (Pass@1, Chat) | github.com | 2026-05-22 | to verify
MATH-500 (EM, Chat) | github.com | 2026-05-22 | to verify
GSM8K (8-shot EM, Base) | github.com | 2026-05-22 | to verify
GPQA-Diamond (Pass@1, Chat) | github.com | 2026-05-22 | to verify
LiveCodeBench (Pass@1-COT) | github.com | 2026-05-22 | to verify
AIME 2024 (Pass@1) | github.com | 2026-05-22 | to verify
GPT-5 comparison (qualitative) | arxiv.org | 2026-05-22 | to verify
DeepSeek publishes V3.2 technical report: same DSA architecture as V3.2-Exp, scaled post-training puts it on par with GPT-5 | arxiv.org/abs/2512.02556 | 2026-05-22 | verified
DeepSeek launches V3.2-Exp: debuts Sparse Attention (DSA), API price drops over 50% | api-docs.deepseek.com | 2026-05-22 | verified
What's the difference between V3.2 and V3.2-Exp? | arxiv.org | 2026-05-22 | to verify
What is DeepSeek Sparse Attention (DSA)? | arxiv.org | 2026-05-22 | to verify
Can I self-host DeepSeek V3.2? | github.com | 2026-05-22 | to verify