Mistral Large 3 is Mistral AI's flagship open-weight frontier model, released December 2, 2025 under the Apache 2.0 license. With 41 billion active parameters in a "granular Mixture of Experts" architecture (675 billion total parameters) and a 256K context window, it is the most capable model in the Mistral 3 family and the first open-weight model to combine frontier-level performance with full multimodal and multilingual capability in a single model. At $0.50/$1.50 per million tokens, it is also one of the most cost-effective frontier-class models available from any major lab.
Mistral Large 3 is Europe's strongest answer to the US-dominated frontier AI landscape. Its Apache 2.0 license means it can be used commercially without restrictions, fine-tuned freely, and deployed on-premises — advantages that distinguish it sharply from proprietary models.
API model identifiers: mistral-large-latest / mistral-3-large (check Mistral docs)

Open-Weight Frontier Performance: Competitive with GPT-4o and Gemini 2.0 Flash on key capabilities at launch — matching closed models from US labs while being fully open weight and self-hostable.
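As a concrete sketch, a request to an OpenAI-style chat endpoint using the model identifier above might be assembled like this. The field names follow the common chat-completions convention; verify both the endpoint shape and the exact model name against Mistral's current documentation.

```python
# Sketch of a chat request body for Mistral Large 3. The model name
# "mistral-large-latest" comes from the text above; the payload shape
# is the conventional chat-completions format (an assumption — check
# Mistral's API docs before use).

def build_chat_request(prompt: str, model: str = "mistral-large-latest") -> dict:
    """Assemble the JSON body for a POST to a chat-completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }

req = build_chat_request("Summarize the Apache 2.0 license in one sentence.")
```

The same body works for a self-hosted deployment, since most open-weight serving stacks expose a compatible chat-completions interface.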
Multimodal: First Mistral model with native vision capability — handles image understanding, document analysis, and visual reasoning alongside text.
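For vision input, the user message's content typically becomes a list mixing text and image parts rather than a plain string. The sketch below is illustrative; the exact part types and field names ("image_url" here is an assumption) should be confirmed against Mistral's vision documentation.

```python
# Illustrative multimodal message: content is a list of typed parts.
# The "image_url" part shape is an assumption modeled on common
# chat-API conventions, not a confirmed Mistral schema.

def build_vision_message(question: str, image_url: str) -> dict:
    """Build a single user message pairing a text question with an image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": image_url},
        ],
    }

msg = build_vision_message("What does this chart show?",
                           "https://example.com/chart.png")
```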
Multilingual: Strong performance across 40+ languages, positioning it as particularly valuable for non-English deployments where US models may underperform.
Granular MoE Architecture: The "granular" MoE design provides more fine-grained expert routing than standard MoE approaches, contributing to both efficiency and quality.
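The idea behind granular routing can be shown with a toy example: rather than a handful of large experts, the router scores many small experts per token and activates only the top-k, so compute scales with k rather than total parameter count. The expert count and k below are illustrative placeholders, not Mistral's actual configuration.

```python
# Toy top-k expert routing, illustrating fine-grained MoE selection.
# NUM_EXPERTS and TOP_K are illustrative values, not Mistral Large 3's
# real architecture parameters.
import random

random.seed(0)

NUM_EXPERTS = 64   # granular: many small experts (illustrative)
TOP_K = 4          # experts activated per token (illustrative)

def route(token_scores: list[float], k: int = TOP_K) -> list[int]:
    """Return indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(token_scores)),
                    key=lambda i: token_scores[i], reverse=True)
    return ranked[:k]

scores = [random.random() for _ in range(NUM_EXPERTS)]
active = route(scores)
# Only TOP_K of NUM_EXPERTS experts run for this token, which is why a
# 675B-total model can have only 41B active parameters per forward pass.
```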
Cost Efficiency: At $0.50/$1.50, one of the best price/performance ratios for a frontier-class model. Combined with Apache 2.0 licensing, total cost of ownership for self-hosted deployments is extremely low.
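The quoted per-million-token prices make per-request cost easy to work out; the token counts below are just an example request.

```python
# Worked cost example at the quoted $0.50 input / $1.50 output
# per million tokens.

INPUT_PRICE = 0.50 / 1_000_000   # dollars per input token
OUTPUT_PRICE = 1.50 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the API cost in dollars for one request."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# A 10,000-token prompt with a 2,000-token reply:
cost = request_cost(10_000, 2_000)  # 0.005 + 0.003 = $0.008
```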
Code: Complemented by Codestral for coding-specific tasks, but Mistral Large 3 itself has strong general coding capability.
Mistral has not published direct benchmark comparisons against the Gemini 3 and Claude 4-series models released after Mistral Large 3, so its standing against those newer flagships remains unclear. The 256K context window, while generous, is smaller than the 1M+ windows now standard among Anthropic, OpenAI, and Google flagship models.
February 26, 2026