AI Models

Compare ChatGPT, Claude, Gemini, and other foundation models — filter by license, provider, or modality.

Claude 3.5 Sonnet

Claude 3.5 Sonnet is Anthropic's June 2024 frontier large language model, which at release represented a significant leap in AI capabilities at production-friendly pricing; it has since been superseded by the Claude 4 line.

Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic's fastest and most cost-efficient model, designed for high-volume deployments where speed and low cost matter more than maximum capability.

Claude Mythos Preview

Claude Mythos Preview is Anthropic's frontier-tier model, announced April 7, 2026, and made available via restricted partner access through Project Glasswing rather than general availability.

Claude Opus 4.5

Claude Opus 4.5 is Anthropic's most capable model at the time of its November 2025 release — the first AI model to break the 80% threshold on SWE-bench Verified, scoring 80.9% on one of the field's most demanding real-world coding benchmarks.

Claude Opus 4.6

Claude Opus 4.6 is Anthropic's most capable model as of early 2026, released February 5, 2026.

Claude Opus 4.7

Claude Opus 4.7 is Anthropic's flagship generally-available model as of mid-2026, released April 16, 2026. It builds on Opus 4.6's 1M-context and Agent Teams foundation by delivering substantial improvements in two specific dimensions: advanced software engineering and vision.

Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic's balanced flagship model, released September 29, 2025. It sits in the sweet spot between Haiku's speed and Opus's capability — delivering state-of-the-art performance on real-world agentic tasks, coding, and computer use at a price point accessible for production deployment.

Claude Sonnet 4.6

Claude Sonnet 4.6 is Anthropic's current recommended default model for most use cases, released February 17, 2026. It builds directly on Sonnet 4.5's strengths in agentic tasks and coding, adding improved performance on workplace tasks such as spreadsheets, documents, and structured data work, along with access to a 1M token context window via beta header.
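Opting into the 1M-token window goes through Anthropic's `anthropic-beta` request header. A minimal sketch of the request shape, with a hypothetical flag name and model ID standing in for whatever Anthropic's documentation specifies for Sonnet 4.6:

```python
# Sketch: requesting the long-context beta on the Anthropic Messages API.
# Both the beta flag and the model ID below are placeholders, not
# documented values.
import json

BETA_FLAG = "context-1m-2026-02-17"  # hypothetical flag name

def build_request(prompt: str) -> dict:
    """Assemble headers and body for a Messages API call with the beta flag set."""
    return {
        "headers": {
            "x-api-key": "YOUR_API_KEY",
            "anthropic-version": "2023-06-01",
            "anthropic-beta": BETA_FLAG,  # opts this request into the 1M window
            "content-type": "application/json",
        },
        "body": json.dumps({
            "model": "claude-sonnet-4-6",  # hypothetical model ID
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```

Anthropic gates its other beta features through the same header mechanism, so only the flag value changes per feature.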

Codestral

Codestral is Mistral AI's code-specialized model — purpose-built for software engineering tasks including code generation, completion, debugging, and explanation. Unlike most of Mistral's open releases, it ships under a non-Apache license (the Mistral Non-Production License). Codestral is available via Mistral's API and has been integrated into popular developer tools including VS Code and JetBrains IDEs.

Cohere Command A

Cohere Command A is Cohere's flagship enterprise model, an open-weight 111B-parameter model with a 256K-token context window optimized for agentic AI, multilingual workloads, and coding use cases. It succeeds Command R+, which established Cohere as a leading enterprise RAG and tool-use provider.

DeepSeek R1

DeepSeek R1 is DeepSeek's reasoning-specialized model, released January 20, 2025 — part of the same release event that introduced the world to DeepSeek's extraordinarily cost-efficient approach to training frontier AI.

DeepSeek V3.1

DeepSeek V3.1 is DeepSeek's current general-purpose flagship model, released August 2025 by the Chinese AI lab (backed by High-Flyer hedge fund) that shocked the AI world in January 2025 with its original V3 release.

DeepSeek V4-Flash

DeepSeek V4-Flash is DeepSeek's fast, cost-efficient frontier MoE model, released April 23, 2026. With 284B total parameters and just 13B activated per token, V4-Flash has the smallest activation footprint among Tier-1 open-weight models.

DeepSeek V4-Pro

DeepSeek V4-Pro is the flagship variant of DeepSeek's V4 generation, released April 24, 2026, alongside the smaller V4-Flash variant.

Doubao 2.0 Pro

Doubao 2.0 Pro is the flagship variant of ByteDance's Doubao-Seed-2.0 model family, released February 14, 2026. Doubao 2.0 represents ByteDance's strategic pivot from passive chatbot to autonomous agent, with the Doubao-Seed-2.0 architecture explicitly designed for autonomous multi-step task execution.

FLUX 1.2 Pro Ultra

FLUX 1.2 Pro Ultra is Black Forest Labs' flagship image generation model, released February 2026. It produces 4MP images (up to 2752×2752) at 10x the speed of FLUX.1 Pro, with Ultra and Raw generation modes and distribution via the BFL API and AWS Bedrock Frankfurt.

Gemini 2.5 Flash

Gemini 2.5 Flash is Google DeepMind's previous-generation fast multimodal model, released June 17, 2025 — offering a 1M+ context window at $0.30/$2.50 per million tokens with an optional thinking mode. Now superseded by Gemini 3 Flash but still widely used in production.

Gemini 2.5 Flash-Lite

Gemini 2.5 Flash-Lite is Google DeepMind's lowest-cost, lowest-latency model, released September 25, 2025. At $0.10 input / $0.40 output per million tokens with a 1M+ context window, it targets high-volume, latency-sensitive multimodal workloads.

Gemini 3 Flash

Gemini 3 Flash is Google DeepMind's fast, balanced mid-tier model, released December 17, 2025 — delivering frontier-level reasoning (90.4% on GPQA Diamond, 33.7% on Humanity's Last Exam without tools) at Flash-tier speed and cost with a 1M+ token context window.

Gemini 3 Pro

Gemini 3 Pro is Google DeepMind's first Gemini 3 series flagship, released November 18, 2025 with native multimodality across text, audio, images, video, and code in a 1M+ context window. Since superseded on reasoning benchmarks by Gemini 3.1 Pro, but still widely used.

Gemini 3.1 Pro

Gemini 3.1 Pro is Google DeepMind's flagship model as of February 2026, released February 19, 2026 with a 1M+ token context. It scores a publicly leading 77.1% on ARC-AGI-2 — more than double Gemini 3 Pro's result — and reasons natively across text, audio, images, video, and entire code repositories.

Gemini 3.1 Ultra

Gemini 3.1 Ultra is Google DeepMind's top-of-line frontier model, scoring a record 94.3% on GPQA Diamond. Its architectural centerpiece — Chain-of-Verification (CoVe) — generates and tests sub-hypotheses at inference time, reducing hallucinations by 60%+ on technical and scientific output versus Gemini 2.0.

Gemini Embedding 2

Gemini Embedding 2 is Google DeepMind's first natively multimodal embedding model, mapping text, images, video, audio, and documents into a single 3072-dimensional unified space across 100+ languages. Generally available via the Gemini API and Vertex AI as of late April / early May 2026.
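Because every modality lands in the same 3072-dimensional space, cross-modal retrieval reduces to plain vector similarity. A sketch of that ranking step, using stand-in vectors rather than real API output (the helper names here are illustrative):

```python
# Sketch: cross-modal nearest-neighbor search in one shared embedding space.
import math

DIM = 3072  # advertised dimensionality of the unified space
            # (the functions below are dimension-agnostic)

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embeddings from the shared space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank(query: list[float], corpus: dict[str, list[float]]) -> list[str]:
    """Rank corpus items (text, image, audio, or video embeddings alike)."""
    return sorted(corpus, key=lambda k: cosine_similarity(query, corpus[k]),
                  reverse=True)
```

Since text, image, and video items all live in one space, a single index serves every query modality.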

Gemma 4

Gemma 4 is Google DeepMind's most capable open-weight model family to date, released April 2, 2026 under Apache 2.0. It spans on-device E2B/E4B variants up to a 26B A4B MoE, with native vision and audio, 256K context windows, and fluency in 140+ languages.

GLM-5

GLM-5 is Z.ai's MIT-licensed 744B-parameter MoE frontier model with 40B active per token, released February 11, 2026 — the first frontier-tier model trained end-to-end on Huawei Ascend silicon. It scores 77.8% on SWE-bench Verified at API pricing roughly 5–8x cheaper than Claude Opus 4.6.

GLM-5.1

GLM-5.1 is Z.ai's spring 2026 iterative upgrade to GLM-5, open-sourced under MIT license on April 8, 2026. It retains the 744B/40B-active MoE architecture and Huawei Ascend training stack while improving reasoning, agentic workflows, and multilingual coverage.

GPT-4.1

GPT-4.1 was OpenAI's best non-reasoning model at its April 14, 2025 release — featuring a 1M token context window with best-in-class instruction following and tool calling. It remains a preferred workhorse for latency-sensitive production workloads.

GPT-4.1 mini

GPT-4.1 mini is OpenAI's cost-efficient workhorse, released April 14, 2025 — matching or exceeding GPT-4o's intelligence while cutting latency nearly in half and reducing cost by roughly 83%. It retains GPT-4.1's 1M token context window and strong instruction-following.

GPT-5.2

GPT-5.2 was OpenAI's flagship model at its December 10, 2025 release — available in Instant, Thinking, and Pro modes with a 400K-token context. GPT-5.2 Thinking beats top professionals on 70.9% of GDPval tasks, and the model crossed 90% on ARC-AGI-1.

GPT-5.2 Codex

GPT-5.2 Codex is OpenAI's coding-specialized model, released January 14, 2026, with a 400K token context window and strong tool-calling support. It is OpenAI's recommended model for software engineering tasks including architectural reasoning, debugging, code review, and coding agents.

GPT-5.5

GPT-5.5 is OpenAI's flagship agentic model, released April 24, 2026, with a 1-million-token context window — a 4x expansion over GPT-5.4. It is the first model to cross the OSWorld-V agentic-desktop human baseline at 75% (vs. 72.4% human).

GPT-5.5 Instant

GPT-5.5 Instant is OpenAI's new default ChatGPT model, released May 5, 2026, replacing GPT-5.3 Instant with 52.5% fewer hallucinations and ~30% more concise responses. It introduces personalization across past chats, files, and connected Gmail at roughly 80–85% of standard GPT-5.5's quality.

GPT-5.5-Cyber

GPT-5.5-Cyber is a specialized variant of GPT-5.5 tuned for authorized cybersecurity workflows, released May 7, 2026 in limited preview to a vetted group of cybersecurity teams responsible for securing critical infrastructure.

GPT-Realtime-2

GPT-Realtime-2 is OpenAI's first realtime voice model with GPT-5-class reasoning, released May 7, 2026 in the OpenAI Realtime API.

GPT-Realtime-Translate

GPT-Realtime-Translate is OpenAI's purpose-built live speech translation model in the Realtime API, released May 7, 2026 alongside GPT-Realtime-2 and GPT-Realtime-Whisper.

Grok 3

Grok 3 was xAI's flagship multimodal LLM at its February 19, 2025 release, with real-time X platform integration and pricing of $3/$15 per million tokens. It powers the Grok assistant for X Premium+ subscribers and xAI's military and enterprise deployments.

Grok 3 Mini

Grok 3 Mini is xAI's cost-efficient reasoning-capable model, released June 10, 2025 at $0.30/$0.50 per million tokens (~90% cheaper than Grok 3). It scores 57 on the Artificial Analysis Intelligence Index — beating the full-size Grok 3's 45 — and retains real-time X platform access.

Grok 4.3

Grok 4.3 is xAI's frontier reasoning model, released May 1, 2026, ranking #1 on CaseLaw v2 (79.3%) and CorpFin benchmarks while scoring 98% on τ²-Bench Telecom. At $1.25/$2.50 per million tokens, it undercuts Claude Sonnet 4.6's input cost by roughly 5x.

Grok Imagine

Grok Imagine is xAI's flagship text-to-video and image-to-video model with synchronized audio — ranked #1 on public text-to-video leaderboards as of May 2026. v1.0 (Feb 2026) delivers 10-second 720p clips, and Extend from Frame (Mar 2026) chains 15-second continuous sequences.

Hailuo 2.3

Hailuo 2.3 is MiniMax's flagship video generation model, released Spring 2026 in four variants — Standard, Pro, Fast, and Fast Pro — with improved physical action, stylization, and character micro-expressions. The Fast variant cuts batch creation costs by up to 50%.

Hunyuan T1

Hunyuan T1 is Tencent's flagship reasoning model, officially released mid-February 2026 — billed as the world's first ultra-large-scale Hybrid-Mamba-Transformer MoE. It scores 87.2 on MMLU-PRO and 96.2 on MATH-500 and is API-only via Tencent Cloud.

Kimi K2.6

Kimi K2.6 is Moonshot AI's flagship open-weight agentic large language model, released April 20, 2026 under a Modified MIT license. It is a native multimodal model built on a 1-trillion parameter Mixture-of-Experts (MoE) architecture, with 32 billion parameters activated per token. K2.6's distinguishing technical contribution is "Agent Swarm" — a multi-agent orchestration architecture built directly into the model that scales to 300 domain-specialized sub-agents executing up to 4,000 coordinated steps in a single autonomous run, up from 100 sub-agents and 1,500 steps in K2.5.

Kling 2.6

Kling 2.6 is Kuaishou Technology's flagship video generation model, released December 3, 2025, and the first model from Kuaishou's Kling AI division to generate synchronized audio and video in a single forward pass. Where prior video generation models (Veo 3, Runway Gen-4, earlier Kling versions) require either silent video or post-hoc audio integration, Kling 2.6 produces character dialogue, singing, ambient sound effects, and music together with the video output.

Llama 4 Behemoth

Llama 4 Behemoth is Meta's largest disclosed model — 288B active / ~2T total parameters — announced April 2025 and serving primarily as a teacher model used to distill knowledge into Scout and Maverick. Available in limited preview as of early 2026.

Llama 4 Maverick

Llama 4 Maverick is Meta's flagship open-weight MoE model, released April 5, 2025 — 17B active / 400B total parameters across 128 experts, with a 1M token context window and native multimodality. The first open-weight model competitive with GPT-4o and Gemini 2.0 Flash at launch.

Llama 4 Scout

Llama 4 Scout is Meta's context-efficient open-weight model, released April 5, 2025 — a 17B active / 109B total MoE across 16 experts with a 10M token context window, the largest of any publicly available model at launch, fitting on a single NVIDIA H100.

Midjourney V8.1

Midjourney V8.1 is Midjourney's fastest text-to-image model to date, released April 30, 2026 on midjourney.com. It extends Midjourney's signature aesthetic frontier while inheriting V7's Draft Mode (10x faster generation) and Omni Reference for improved character and object consistency.

MiniMax M2.7

MiniMax-M2.7 is MiniMax's open-weight agentic frontier text model, released March 18, 2026 — a 230B-parameter MoE with only ~10B active per token. It sustains 97% skill adherence across 40 complex skills and scores SOTA among open-weight models on SWE-Pro (56.22%) and GDPval-AA (ELO 1495).

Mistral Large 3

Mistral Large 3 is Mistral AI's Apache 2.0 open-weight flagship, released December 2, 2025 with a 41B-active / 675B-total granular MoE architecture and 256K context. At $0.50/$1.50 per million tokens, it's among the most cost-effective frontier-class open models from any major lab.

Mistral Medium 3

Mistral Medium 3 is Mistral AI's Apache 2.0 mid-tier model, released May 7, 2025 with a 131K context window at $0.40/$2.00 per million tokens. It sits between the Small 3-series and the frontier-level Large 3 for balanced production workloads.

Mistral Medium 3.5

Mistral Medium 3.5 is Mistral AI's frontier-class mid-tier multimodal model, released April 29, 2026 as open weights under a Modified MIT license. It scores 77.6% on SWE-Bench Verified — ahead of Devstral 2 and Qwen3.5 397B A17B — at aggressive $1.50/$7.50 per million-token pricing.

Muse Spark

Muse Spark is the first flagship LLM from Meta Superintelligence Labs, released April 8, 2026 under CAIO Alexandr Wang's leadership. It accepts voice, text, and image inputs (text-only output) and marks Meta's strategic shift from Llama's open-weight default to a closed-source flagship with a separate open variant planned.

o3-mini

o3-mini is OpenAI's previous-generation small reasoning model — chain-of-thought reasoning for math, science, and coding at $1.10/$4.40 per million tokens. Released early 2025, still available though superseded by o4-mini.

o4-mini

o4-mini is OpenAI's compact reasoning model — successor to o3-mini, optimized for coding and visual reasoning. With a 200K context window, 100K max output, and chain-of-thought reasoning, it brings cost-efficient inference to production agentic workflows.

Perplexity Sonar

Perplexity Sonar is the search-grounded LLM family powering Perplexity's AI-native search, built on Llama 3.3 with proprietary retrieval orchestration. The lineup spans Sonar, Sonar Pro, Sonar Reasoning, and Online Models — optimized for citation-first synthesis at $1–$3/$15 per million tokens.
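Sonar is served over an OpenAI-style chat-completions interface, with source URLs returned alongside the synthesized answer. A sketch of the request payload and citation extraction; treat the model name and the `citations` field layout as assumptions to check against Perplexity's current API reference:

```python
# Sketch: building a search-grounded chat request and reading citations back.
def sonar_request(question: str, model: str = "sonar-pro") -> dict:
    """OpenAI-style chat payload for Perplexity's search-grounded models."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer concisely and cite sources."},
            {"role": "user", "content": question},
        ],
    }

def extract_citations(response: dict) -> list[str]:
    """Pull the source URLs returned alongside the synthesized answer."""
    return response.get("citations", [])
```

Because the payload shape matches OpenAI's chat API, existing OpenAI client code typically needs only a base-URL and model-name swap.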

pi-zero

π0 (pi-zero) is Physical Intelligence's open-source vision-language-action robot foundation model — a VLM backbone with a flow-matching action head producing 50 Hz trajectories. Trained on 10,000+ hours across 7 robotic platforms and 68 tasks, it has seeded a major open-source robotics ecosystem.
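A flow-matching action head can be pictured as integrating a learned velocity field from noise at t=0 to an action at t=1, then emitting the chunk at the 50 Hz control rate. A toy sketch with a hand-written straight-line field standing in for the learned network (nothing here is π0's actual code):

```python
# Toy flow-matching rollout: Euler-integrate a velocity field from a noisy
# start at t=0 toward a target action at t=1, then timestamp the chunk at 50 Hz.
CONTROL_HZ = 50  # trajectory emission rate

def velocity(x: float, t: float, target: float) -> float:
    # Stand-in for the learned field: the straight-line conditional flow.
    return (target - x) / (1.0 - t)

def sample_action(x0: float, target: float, steps: int = 10) -> float:
    """Integrate dx/dt = velocity(x, t) with Euler steps over t in [0, 1)."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        x += dt * velocity(x, i * dt, target)
    return x

def trajectory(actions: list[float]) -> list[tuple[float, float]]:
    """Attach 50 Hz timestamps to a sampled action chunk."""
    return [(i / CONTROL_HZ, a) for i, a in enumerate(actions)]
```

With the straight-line field the Euler rollout lands on the target at t=1; the real model replaces `velocity` with a network conditioned on camera images and language instructions.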

Pika 2.5

Pika 2.5 is Pika Labs' consumer-first video generation model — adding more reliable styles and stronger character consistency over Pika 2.0 while retaining Scene Ingredients (user-uploaded reference content) and a TikTok-style mobile app for 500,000+ Gen Z creators.

Qwen 3.5

Qwen 3.5 is Alibaba Cloud's open-weight multimodal model family, first released February 16, 2026 with a 397B-A17B MoE flagship and 201-language coverage. Nine variants shipped in 16 days, plus a proprietary Qwen 3.5-Omni adding audio and real-time interaction in April 2026.

Qwen 3.6

Qwen 3.6 is Alibaba Cloud's April 2026 multi-variant open-weight family — Plus, 35B-A3B, Max-Preview (the line's first proprietary flagship), and 27B — pushing harder into agentic coding, with the 27B reportedly outperforming the 397B-A17B Qwen 3.5 MoE on agentic coding benchmarks.

Ray3

Ray3 is Luma AI's flagship video generation model, released March 2026 — billed as the world's first reasoning video model with native studio-grade HDR and 16-bit EXR export. The Ray3.14 update brought native 1080p at 4x faster speed and 3x lower cost.

Runway Gen-4.5

Runway Gen-4.5 is Runway's studio-grade video generation flagship, released December 2025 — top-rated on Video Arena ahead of Veo 3.1 and Kling 3.0. It emphasizes world-consistency, controllable action, and tight integration with Runway's creative studio (Frames, Act-One, editorial tools).

SAM Audio

SAM Audio is Meta AI's first unified foundation model for audio source separation, released December 16, 2025. It isolates any sound from a complex mixture using text, visual, or temporal prompts (or combinations) and ships open-source across Small, Base, and Large variants.

Stable Diffusion 3.5

Stable Diffusion 3.5 is Stability AI's flagship open-weight image generation family, released October 2024 — Large (8B), Large Turbo (8B/4-step), and Medium (2.5B) variants under the Stability AI Community License. Remains the foundational reference for the open-weight ecosystem of fine-tunes and tooling.

Veo 4

Veo 4 is Google DeepMind's flagship video generation model, released April 2026 as the successor to Veo 3.1. It generates 15-30 second clips at native 4K resolution and 120 fps with storyboard-based prompting, character consistency, native audio integration, and personalized avatars.

Voxtral TTS

Voxtral TTS is Mistral AI's first dedicated text-to-speech model, open-sourced March 23, 2026. It covers 9 languages with 3-second zero-shot voice cloning, using a hybrid autoregressive plus flow-matching architecture designed to close the prosody gap with ElevenLabs and OpenAI's voice stack.

Wan 2.7

Wan 2.7 is Alibaba Tongyi Lab's flagship open-source video generation model, released late March 2026. It produces 15-second 1080p clips with industry-first first/last-frame control, native synchronized audio, multi-reference conditioning (up to 5 videos), and instruction-based editing.

Yi-Lightning

Yi-Lightning is 01.AI's most-publicized model, launched October 2024 under Kai-Fu Lee's framing of "GPT-4 quality at 500x lower cost." It was an early articulation of the Chinese cost-efficiency thesis that DeepSeek and others would amplify through 2025–2026.