Gemma 4

Summary

Gemma 4 is Google DeepMind's most capable open-weight model family to date, released April 2, 2026 under Apache 2.0. It spans on-device E2B/E4B variants up to 31B and 26B A4B (MoE) models, with native vision and audio, 256K-token context windows, and fluency in 140+ languages.

Overview

Gemma 4 is Google DeepMind's open-weight model family, released April 2, 2026 under the Apache 2.0 license. Positioned as Google's most capable open models to date, it is purpose-built for advanced reasoning and agentic workflows. The release pairs frontier capability with efficient on-device deployment: Gemma 4 is the first Gemma generation with native vision and audio in its smaller variants, fluency in 140+ languages, and 256K-token context windows across the family.

The Gemma 4 family targets the increasingly competitive open-weight tier where DeepSeek V4, Qwen 3.5, and Mistral Medium 3.5 set the pace. Apache 2.0 licensing makes Gemma 4 the most commercially permissive of the major open-weight families, and Google has invested in broad hardware support — NVIDIA Jetson Orin Nano through Blackwell, AMD GPUs via the open-source ROCm stack, and Google's own Trillium and Ironwood TPUs — to make on-device and on-prem deployment practical.

Specifications

  • Developer: Google DeepMind
  • Release Date: April 2, 2026
  • Type: Open-weight multimodal LLM family
  • License: Apache 2.0 (commercially permissive)
  • Variants: E2B, E4B, 31B, 26B A4B
    • E2B and E4B activate ~2B and ~4B parameters during inference (effective parameter footprint optimized to preserve RAM/battery on-device)
    • 26B A4B is a mixture-of-experts variant with ~4B activated parameters
  • Context Window: Up to 256K tokens
  • Modalities: Native vision and audio processing in addition to text
  • Languages: 140+
  • Hardware Support: NVIDIA Jetson Orin Nano → Blackwell; AMD GPUs via ROCm; Google Trillium and Ironwood TPUs
  • Access: Kaggle, Hugging Face, Google AI Studio, Google Cloud
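The effective-parameter figures above translate into rough inference footprints. The sketch below is back-of-the-envelope arithmetic under stated assumptions (per-parameter byte costs for common quantization levels), not published hardware requirements:

```python
def inference_weight_memory_gb(active_params_b: float, bytes_per_param: float) -> float:
    """Approximate weight memory touched per forward pass, in GB.

    active_params_b: activated parameters in billions (e.g. ~4 for the 26B A4B
    MoE, which routes each token through ~4B of its 26B total parameters).
    bytes_per_param: 2.0 for bf16/fp16, 1.0 for int8, 0.5 for int4 (assumed costs).
    """
    # billions of params * 1e9 * bytes, divided by 1e9 bytes/GB, simplifies to:
    return active_params_b * bytes_per_param

# Weights touched per token for each variant, assuming int4 quantization:
for name, active_b in [("E2B", 2.0), ("E4B", 4.0), ("26B A4B", 4.0)]:
    print(f"{name}: ~{inference_weight_memory_gb(active_b, 0.5):.1f} GB (int4)")
```

Note that for the 26B A4B MoE only per-token compute and memory bandwidth scale with the ~4B activated parameters; all 26B weights must still be stored, so disk and resident-memory needs follow the total parameter count.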

Capabilities

On-Device Multimodal: E2B and E4B variants are designed for mobile and edge deployment with effective 2B/4B parameter footprints during inference, native audio and vision support, and 256K context windows.
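A long-context deployment typically budget-checks inputs before dispatch. A minimal sketch, assuming "256K" means 256,000 tokens (the exact figure and the chars/4 token heuristic are assumptions; use the model's own tokenizer and published window in practice):

```python
def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = 256_000) -> bool:
    """True if the prompt plus generation budget fits in the context window.

    window defaults to 256,000 tokens; whether '256K' denotes 256,000 or
    262,144 is an assumption -- check the model card.
    """
    return prompt_tokens + max_new_tokens <= window

# Rough chars/4 heuristic for English text, standing in for real tokenization:
doc = "x" * 1_000_000                # ~1M characters
approx_tokens = len(doc) // 4        # ~250,000 tokens
print(fits_in_context(approx_tokens, max_new_tokens=8192))  # False: over budget
```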

Reasoning and Agentic Workflows: Tuned specifically for advanced reasoning and agentic use cases — Google's positioning explicitly targets enterprise stacks that increasingly mix proprietary and open-weight models in production.

Frontier Open-Weight Performance: At the 31B and 26B A4B sizes, Gemma 4 is positioned to compete with frontier open-weight models from Meta (the Llama 4 family and the Muse Spark open variant), Mistral (Medium 3.5), DeepSeek (V4-Pro / V4-Flash), and Alibaba (Qwen 3.5 family).

Multilingual Reach: 140+ languages — meaningful expansion over prior Gemma generations.

Apache 2.0 License: Most commercially permissive of the major open-weight families (Llama uses a custom license, Mistral uses Modified MIT for Medium 3.5, DeepSeek uses MIT, Qwen uses a mix). For enterprise and product builders, Apache 2.0 removes most use-case restrictions and is patent-grant-friendly.

Limitations

Gemma 4 is positioned for on-device and reasoning/agentic workflows — at the smallest sizes (E2B, E4B), it is not a substitute for frontier proprietary models like Gemini 3.1 Ultra, Claude Opus 4.7, or GPT-5.5 on the hardest tasks. The 26B A4B MoE variant is the family's high-capability play, but does not match closed-frontier performance on the most demanding benchmarks. As with any multilingual model claiming 140+ language support, performance varies meaningfully by language and task — for production use cases beyond English, validation on the specific target language is required.
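The per-language validation advice above can be operationalized with a minimal spot-check harness. A sketch only: `generate` stands in for whatever inference API wraps the deployed model, and exact-match scoring is a deliberately crude stand-in for task-appropriate metrics:

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

def per_language_accuracy(cases: Iterable[Tuple[str, str, str]],
                          generate: Callable[[str], str]) -> Dict[str, float]:
    """Tally exact-match accuracy per language.

    cases: (language_code, prompt, expected_answer) triples.
    generate: any callable wrapping the deployed model endpoint (assumed).
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for lang, prompt, expected in cases:
        totals[lang] += 1
        if generate(prompt).strip() == expected.strip():
            hits[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}

# Demo with a stub "model" that only answers English-tagged prompts:
stub = lambda p: "42" if p.startswith("EN:") else "?"
cases = [("en", "EN: 6*7?", "42"), ("de", "DE: 6*7?", "42")]
print(per_language_accuracy(cases, stub))  # {'en': 1.0, 'de': 0.0}
```

Even a small per-language table like this surfaces the uneven coverage the Limitations section warns about before it reaches production.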

Recent Developments

  • April 2, 2026 Launch: Released across Kaggle, Hugging Face, and Google Cloud with Apache 2.0 license. Optimized for NVIDIA Jetson Orin Nano through Blackwell, AMD ROCm, and Google Trillium/Ironwood TPUs from day one.
  • Strategic Positioning: Google explicitly framed Gemma 4 as competing with open-weight models from Meta, Mistral, and DeepSeek in enterprise stacks that mix proprietary and open-weight tiers.
  • On-Device Emphasis: E2B and E4B variants represent Google's most aggressive on-device positioning to date — sub-5B effective parameter footprint with native multimodal and 256K context.
  • Industry Context: Released in the same window as Mistral Medium 3.5 (April 29), DeepSeek V4 (April 24), Meta Muse Spark (April 8), and OpenAI GPT-5.5 (April 24) — making April 2026 the busiest frontier release window of the year.

Last Updated

May 7, 2026