Gemma 4 is Google DeepMind's most capable open-weight model family to date, released April 2, 2026 under Apache 2.0. It spans on-device E2B/E4B variants up to a 26B A4B MoE, with native vision and audio, 256K context windows, and fluency in 140+ languages.
Purpose-built for advanced reasoning and agentic workflows, the release pairs frontier capability with efficient on-device deployment: Gemma 4 is the first Gemma generation to bring native vision and audio to its smaller variants, and its 140+ language fluency and 256K-token context windows extend across the entire family.
The Gemma 4 family targets the increasingly competitive open-weight tier where DeepSeek V4, Qwen 3.5, and Mistral Medium 3.5 set the pace. Apache 2.0 licensing makes Gemma 4 the most commercially permissive of the major open-weight families, and Google has invested in broad hardware support — NVIDIA Jetson Orin Nano through Blackwell, AMD GPUs via the open-source ROCm stack, and Google's own Trillium and Ironwood TPUs — to make on-device and on-prem deployment practical.
On-Device Multimodal: E2B and E4B variants are designed for mobile and edge deployment with effective 2B/4B parameter footprints during inference, native audio and vision support, and 256K context windows (see the inference sketch after this list).
Reasoning and Agentic Workflows: Tuned specifically for advanced reasoning and agentic use cases — Google's positioning explicitly targets enterprise stacks that increasingly mix proprietary and open-weight models in production.
Frontier Open-Weight Performance: At the top of the family, the 26B A4B variant is positioned to compete with frontier open-weight models from Meta (the Llama 4 family, with an open Muse Spark variant forthcoming), Mistral (Medium 3.5), DeepSeek (V4-Pro / V4-Flash), and Alibaba (the Qwen 3.5 family).
Multilingual Reach: 140+ languages — meaningful expansion over prior Gemma generations.
Apache 2.0 License: Most commercially permissive of the major open-weight families (Llama uses a custom license, Mistral uses Modified MIT for Medium 3.5, DeepSeek uses MIT, Qwen uses a mix). For enterprise and product builders, Apache 2.0 removes most use-case restrictions and includes an explicit patent grant.
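To make the on-device positioning concrete, here is a minimal inference sketch using the Hugging Face transformers library. The model id google/gemma-4-e4b-it is an assumption based on the naming pattern of earlier Gemma checkpoints, not a confirmed identifier; check the official Gemma 4 model cards for the real ids and for the multimodal (vision/audio) entry points, which are omitted here.

```python
# Minimal sketch: running a small Gemma 4 variant locally with Hugging Face transformers.
# NOTE: "google/gemma-4-e4b-it" is a hypothetical model id inferred from earlier Gemma
# naming conventions; consult the official Gemma 4 model cards for the real identifiers.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-4-e4b-it",  # hypothetical E4B instruction-tuned checkpoint
    torch_dtype=torch.bfloat16,     # bf16 keeps the effective 4B footprint practical on edge GPUs
    device_map="auto",              # falls back to CPU when no accelerator is available
)

messages = [
    {"role": "user", "content": "Summarize the trade-offs of running a 4B-class model on-device."},
]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])
```

The same pattern should apply to the larger variants by swapping the model id; on memory-constrained devices, a quantized checkpoint (for example 4-bit via bitsandbytes) is the usual next step, at some cost in output quality.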
Gemma 4 is positioned for on-device and reasoning/agentic workflows — at the smallest sizes (E2B, E4B), it is not a substitute for frontier proprietary models like Gemini 3.1 Ultra, Claude Opus 4.7, or GPT-5.5 on the hardest tasks. The 26B A4B MoE variant is the family's high-capability play, but does not match closed-frontier performance on the most demanding benchmarks. As with any multilingual model claiming 140+ language support, performance varies meaningfully by language and task — for production use cases beyond English, validation on the specific target language is required.
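As a starting point for that validation, the sketch below is one way to run a handful of language-tagged smoke tests against any generation callable. The prompts, expected substrings, and the generate_fn parameter are illustrative placeholders, not part of the Gemma 4 release; a production evaluation would rely on a proper per-language benchmark or human review.

```python
# Illustrative per-language smoke test. The test cases and generate_fn are placeholders;
# swap in real target-language prompts and a real evaluation protocol for production use.
from typing import Callable, Dict, List, Tuple

# language code -> list of (prompt, substring expected in a correct answer)
CASES: Dict[str, List[Tuple[str, str]]] = {
    "de": [("Wie viele Tage hat eine Woche?", "sieben")],
    "sw": [("Wiki moja ina siku ngapi?", "saba")],
    "ja": [("1週間は何日ありますか？", "7")],
}

def evaluate(generate_fn: Callable[[str], str]) -> Dict[str, float]:
    """Return the fraction of passed checks per language for a given generation function."""
    scores: Dict[str, float] = {}
    for lang, cases in CASES.items():
        passed = sum(
            1 for prompt, expected in cases
            if expected.lower() in generate_fn(prompt).lower()
        )
        scores[lang] = passed / len(cases)
    return scores

# Usage: wrap the pipeline from the earlier sketch in a prompt -> reply callable.
# ask = lambda p: generator([{"role": "user", "content": p}])[0]["generated_text"][-1]["content"]
# print(evaluate(ask))
```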
May 7, 2026