Gemini 2.5 Flash-Lite is Google DeepMind's lowest-cost, lowest-latency model in the 2.5 family, released September 25, 2025. At $0.10 per million input tokens and $0.40 per million output tokens, it is one of the cheapest frontier-adjacent models available from any major lab: nearly 50x cheaper per input token than Anthropic's Claude Opus 4.6 and 17x cheaper than Claude Haiku 4.5. It is positioned as a cost-effective upgrade from the older Gemini 1.5 and 2.0 Flash models for high-volume, latency-sensitive applications where cost is the primary constraint.
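To make the per-token prices concrete, here is a minimal sketch of how they translate into a workload cost. The prices are the ones quoted above; the function name and the example workload (one million calls of 500 input and 20 output tokens each) are illustrative assumptions, not measured figures.

```python
# Prices in USD per one million tokens, as quoted in the text above.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int, requests: int = 1) -> float:
    """Estimate the cost of `requests` identical calls at the listed prices."""
    per_request = (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )
    return per_request * requests


# Hypothetical workload: one million classification calls,
# each with 500 input tokens and 20 output tokens.
print(f"${estimate_cost_usd(500, 20, requests=1_000_000):.2f}")  # prints $58.00
```

Because output tokens cost four times as much as input tokens, workloads with short outputs (labels, scores, yes/no answers) benefit most from this pricing.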
Flash-Lite is the right choice for applications that process enormous volumes of requests: content classification, bulk document processing, real-time chat at massive scale, or sub-agent tasks in large multi-agent pipelines where the individual subtask does not warrant a stronger model.
Model string: gemini-2.5-flash-lite (check Google AI docs for the versioned string)
Exceptional Cost Efficiency: At $0.10/$0.40 per million tokens with a 1M-token context window, Flash-Lite offers a combination of context size and price that no other major lab currently matches.
Low Latency: Designed for the fastest possible response times, making it suitable for real-time, interactive applications and high-throughput pipelines.
1M Token Context: Despite being the cheapest model in the lineup, Flash-Lite retains the full 1M-token context window of its siblings, making it practical for large-document tasks at minimal cost.
Multimodal: Handles text, image, and video inputs at the same price point.
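As an illustration of the high-volume classification use case, the sketch below sends a single request using the model string listed above. It assumes the google-genai Python SDK is installed and an API key is available in the environment. The label set, prompt wording, and helper names are hypothetical, and the unversioned model string may need to be replaced with a versioned one from the Google AI docs.

```python
MODEL_ID = "gemini-2.5-flash-lite"  # from the text above; a versioned string may differ

LABELS = ("spam", "support", "sales")  # hypothetical label set for illustration


def build_prompt(text: str, labels: tuple[str, ...] = LABELS) -> str:
    """Build a label-only classification prompt (hypothetical wording)."""
    return (
        "Classify the message into exactly one of these labels: "
        + ", ".join(labels)
        + ".\nReply with the label only.\n\nMessage: "
        + text
    )


def classify(text: str) -> str:
    """Send one classification request; needs the google-genai package and an API key."""
    # Imported lazily so build_prompt can be used without the SDK installed.
    from google import genai

    client = genai.Client()  # picks up the API key from the environment
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=build_prompt(text),
    )
    return (response.text or "").strip()
```

Calling classify("My invoice is wrong, who can I talk to?") would return whatever label the model chooses. Keeping the requested output to a single word keeps output-token cost and latency low.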
As the most cost-efficient model, Flash-Lite trades capability depth for price and speed. For complex reasoning, nuanced writing, or tasks requiring sustained analytical quality, Gemini 3 Flash or Gemini 3 Pro will produce better results. It is not a reasoning model. Benchmark performance is below the Gemini 3 series on all major evaluations.
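One way to act on this trade-off in a multi-agent pipeline is to route only lightweight subtasks to Flash-Lite and send everything else to a stronger model. The sketch below is a hypothetical routing rule: the task categories are invented for illustration, and the stronger model's ID is a placeholder to be filled in from the Google AI docs.

```python
FLASH_LITE = "gemini-2.5-flash-lite"
STRONGER_MODEL = "<stronger-model-id>"  # placeholder; look up the real ID

# Hypothetical task categories considered simple enough for Flash-Lite.
LIGHTWEIGHT_TASKS = {"classification", "extraction", "routing", "short_chat"}


def pick_model(task_type: str) -> str:
    """Return Flash-Lite for lightweight tasks, the stronger model otherwise."""
    return FLASH_LITE if task_type in LIGHTWEIGHT_TASKS else STRONGER_MODEL


print(pick_model("classification"))  # prints gemini-2.5-flash-lite
```

Unknown task types fall through to the stronger model, so a misclassified task costs more rather than producing a weaker answer.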
February 26, 2026