DeepSeek V4-Flash is DeepSeek's fast, cost-efficient frontier MoE model, released April 23, 2026 — one day ahead of the heavier-weight [[DeepSeek/DeepSeek V4-Pro|DeepSeek V4-Pro]] preview. It is not a distillation of V4-Pro: V4-Flash is its own pretraining run (32T+ tokens) on the same V4 architecture family, sized for the "intelligence-per-parameter" sweet spot. With 284B total parameters and just 13B activated per token, V4-Flash has the smallest activation footprint among Tier-1 open-weight models.
V4-Flash extends DeepSeek's defining open-weight strategy: MIT-licensed, full weights on Hugging Face, and pricing that further compresses the cost gap with closed frontier models. A DeepSeek-V4-Flash-Max variant (also April 23, 2026) targets workflows that need a larger "thinking budget" while staying within the Flash family.
Weights: available on Hugging Face (deepseek-ai/DeepSeek-V4-Flash), with self-hosted deployment supported.
Best-in-Class Activation Efficiency: 13B active parameters per token is the lowest among Tier-1 open-weight models, meaning lower per-token inference compute and lower hosting cost than peers at a comparable benchmark tier.
1M-Token Context: Full 1M-token context window — at parity with Google's Gemini family and ahead of most open-weight peers.
Reasoning Performance Near Pro: With a larger thinking budget, V4-Flash-Max approaches V4-Pro's reasoning performance; its smaller parameter scale still trails Pro on the most complex agentic workflows and pure knowledge tasks.
Open-Weight Math/STEM/Coding Leadership: The V4 family leads all current open-weight models on math, STEM, and coding benchmarks and rivals top closed-source models. V4-Pro hits LiveCodeBench 93.5 and a Codeforces Elo of 3206 (ahead of GPT-5.5 at 3168); V4-Flash inherits the family's strong coding profile at lower cost.
Fully MIT Licensed: Unlike MiniMax M2.7, V4-Flash is freely usable commercially — making it the most production-friendly open-weight frontier model in this generation.
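The activation-efficiency claim above can be made concrete with a back-of-envelope calculation. Only the 284B-total / 13B-active figures come from this article; the ~2 FLOPs-per-active-parameter rule of thumb and the dense 284B comparison point are common simplifying assumptions, not published numbers:

```python
# Back-of-envelope per-token inference compute for an MoE model.
# Assumption: roughly 2 FLOPs per active parameter per token for the
# forward pass (multiply + add), ignoring attention-length effects.

def forward_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per generated token."""
    return 2 * active_params

moe_flash = forward_flops_per_token(13e9)        # V4-Flash: 13B active
dense_hypothetical = forward_flops_per_token(284e9)  # hypothetical dense 284B

print(f"V4-Flash:   ~{moe_flash:.1e} FLOPs/token")
print(f"Dense 284B: ~{dense_hypothetical:.1e} FLOPs/token")
print(f"Compute ratio: ~{dense_hypothetical / moe_flash:.0f}x")
```

Under these assumptions a dense model of the same total size would cost roughly 22x more compute per token, which is the mechanism behind the "lower per-token inference compute" claim.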
V4-Flash trails V4-Pro on the most complex agentic workflows and the hardest knowledge benchmarks; this is the explicit tradeoff for its lower activation footprint and lower cost. As with prior DeepSeek releases, U.S. enterprise deployment in regulated industries continues to face additional scrutiny around Chinese-origin model provenance and hosted-API data handling. Self-hosting still requires loading all 284B parameters into GPU memory: the 13B activation figure lowers per-token compute, not the weight footprint, since MoE routing can select different experts for each token.
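A rough sketch of why the self-hosting memory requirement stays high despite the small activation footprint. The 284B total-parameter count comes from this article; the bytes-per-parameter values are standard precision sizes, and KV cache, activations, and runtime overhead are deliberately excluded, so real requirements are higher:

```python
# Weight-memory estimate for self-hosting an MoE model: all experts
# must be resident even though only a subset is activated per token.

TOTAL_PARAMS = 284e9  # V4-Flash total parameter count (from the article)

# Standard storage sizes per parameter for common inference precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}

for precision, bytes_per in BYTES_PER_PARAM.items():
    gib = TOTAL_PARAMS * bytes_per / 2**30
    print(f"{precision}: ~{gib:,.0f} GiB of weights")
```

Even aggressive 4-bit quantization leaves well over 100 GiB of weights, which is why multi-GPU nodes remain the practical floor for self-hosting despite the 13B-active design.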
May 9, 2026