GPT-5.5

Summary

GPT-5.5 is OpenAI's flagship agentic model, released April 24, 2026, with a 1-million-token context window, a 4x expansion over GPT-5.4. It is the first model to exceed the cited human baseline on the OSWorld-V agentic-desktop benchmark, scoring 75% against 72.4% for humans.

Overview

GPT-5.5 is OpenAI's flagship model, released April 24, 2026. Industry coverage widely cites it as the moment frontier LLMs credibly crossed from "chat tools" to "autonomous digital coworkers." The signature result is 75% on OSWorld-V, a benchmark that requires the model to autonomously execute multi-step workflows across desktop applications. That score is modestly above the cited 72.4% human baseline. Combined with a 1-million-token context window, a 4x expansion over GPT-5.4, the result positions GPT-5.5 as OpenAI's bid to dominate agentic productivity workflows.

On May 5, 2026, OpenAI made GPT-5.5 Instant the default ChatGPT model, putting the long-context, agentic-workflow capabilities in front of free and Plus tiers. A higher-capability variant, GPT-5.5 Pro, is available at substantially higher pricing for the most demanding workloads.

Specifications

  • Developer: OpenAI
  • Release Date: April 24, 2026 (API GA); GPT-5.5 Instant became the default ChatGPT model on May 5, 2026
  • Type: Large Language Model (LLM); agentic; multimodal
  • Context Window: 1,000,000 tokens (4x GPT-5.4's 256K)
  • Variants: GPT-5.5 (standard), GPT-5.5 Pro (highest capability), GPT-5.5 Instant (ChatGPT default)
  • Pricing:
    • GPT-5.5: $5 per million input tokens / $30 per million output tokens
    • GPT-5.5 Pro: $30 per million input tokens / $180 per million output tokens
  • Access: OpenAI Responses and Chat Completions APIs; ChatGPT (Instant variant as default)
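The per-million-token prices listed above translate into a per-request cost with simple arithmetic. The following is a minimal sketch of that calculation, not an official tool: the function name `estimate_cost_usd` and the dictionary keys are hypothetical labels chosen for illustration and are not confirmed API model identifiers.

```python
# Per-million-token list prices quoted in the Specifications list (USD).
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES_USD_PER_MILLION = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "gpt-5.5-pro": {"input": 30.00, "output": 180.00},
}


def estimate_cost_usd(variant: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price = PRICES_USD_PER_MILLION[variant]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    # Prices are quoted per one million tokens.
    return total / 1_000_000


if __name__ == "__main__":
    # A request that fills the full 1M-token window and returns 10K tokens.
    for variant in PRICES_USD_PER_MILLION:
        cost = estimate_cost_usd(variant, 1_000_000, 10_000)
        print(f"{variant}: ${cost:.2f}")
```

Under these list prices, a request that fills the whole window costs several dollars on the standard variant and roughly six times that on Pro, before any caching or batch discounts that may or may not apply.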

Capabilities

1M Token Context (headline feature): Roughly 750,000 words, equivalent to 6–8 full-length novels or codebases of tens of thousands of lines, held in a single context window.
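The 750,000-words-per-million-tokens figure implies a rule of thumb of about 0.75 words per token. The sketch below uses that ratio to guess whether a document fits in the window. It is only a rough heuristic under that assumption: real token counts depend on the tokenizer and the content (code and non-English text usually tokenize less efficiently), and the helper names are hypothetical.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000
# Derived from the ~750,000 words per 1M tokens figure; a rough heuristic only.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate a token count from a whitespace word count (hypothetical helper)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_output_tokens: int = 0) -> bool:
    """Check whether the estimated input plus reserved output fits in the window."""
    return estimate_tokens(text) + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "word " * 90_000  # stand-in for a novel-length text
    print(estimate_tokens(sample), fits_in_context(sample, reserved_output_tokens=8_000))
```

For anything close to the limit, an exact count from the provider's tokenizer is the safer check.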

Long-Context Retrieval: At 512K–1M tokens, GPT-5.5 scores 74.0% on long-context retrieval evaluations vs. 36.6% for GPT-5.4. The score roughly doubles (about 2.02x), which suggests the model can actually use the long window rather than merely accept it.

Autonomous Desktop Workflows: Scores 75% on OSWorld-V, above the cited 72.4% human baseline. The benchmark measures autonomous multi-step execution across desktop software: file management, application use, and structured data manipulation. This is the result that drives the "frontier models are now digital coworkers" framing.

Agentic Reasoning: OpenAI describes GPT-5.5 as its smartest and most intuitive model, one that understands what a user is trying to do faster and carries more of the work itself rather than waiting for step-by-step instructions.

Multimodal: Full multimodal support for text, image, and structured input.

Limitations

At $30/$180 per million input/output tokens, GPT-5.5 Pro is one of the most expensive frontier models on the market. The 1M context window is real but expensive to use at scale: at list prices, filling it once costs about $5 in input tokens on the standard variant and about $30 on Pro. OpenAI's reported revisions to its compute spending plans (cut from $1.4T to $600B through 2030) reflect the operational difficulty of serving long-context inference economically. OSWorld-V is a useful benchmark but not a full proxy for real-world desktop automation reliability; production agent deployments still require careful orchestration, human-in-the-loop fallbacks, and domain-specific tuning. Cybersecurity safeguards and refusal behaviors are still evolving in response to the broader regulatory environment around frontier-model pre-release review.
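One common shape for the human-in-the-loop fallback mentioned above is a retry-then-escalate loop. The sketch below is a generic illustration, not OpenAI's orchestration design. `run_with_fallback`, `execute`, and `ask_human` are hypothetical names, and `execute` stands in for whatever call drives the agent and reports success or failure.

```python
from typing import Callable, List, Tuple


def run_with_fallback(
    steps: List[str],
    execute: Callable[[str], bool],
    ask_human: Callable[[str], None],
    max_retries: int = 1,
) -> List[Tuple[str, str]]:
    """Run each step with the agent; escalate to a human once retries are exhausted.

    Returns a log of (step, handler) pairs, where handler is "agent" or "human".
    """
    log: List[Tuple[str, str]] = []
    for step in steps:
        failures = 0
        while True:
            if execute(step):  # hypothetical agent call; True means success
                log.append((step, "agent"))
                break
            failures += 1
            if failures > max_retries:
                ask_human(step)  # hand the step to a person instead of looping
                log.append((step, "human"))
                break
    return log


if __name__ == "__main__":
    escalated: List[str] = []
    result = run_with_fallback(
        ["open spreadsheet", "reconcile totals", "export report"],
        execute=lambda step: step != "reconcile totals",  # simulate one failing step
        ask_human=escalated.append,
    )
    print(result)
    print("escalated:", escalated)
```

Bounding the retries keeps a failing step from consuming tokens indefinitely, which matters at the prices discussed above.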

Recent Developments

  • GPT-5.5 Instant Becomes ChatGPT Default (May 5, 2026): Brought 1M context and agentic-workflow capabilities to free and Plus tiers as the default ChatGPT model.
  • GPT-5.5 / GPT-5.5 Pro API Launch (April 24, 2026): Available via OpenAI Responses and Chat Completions APIs at the published pricing.
  • OSWorld-V Result: 75%, modestly above the 72.4% human baseline. Coverage frames this as the clearest sign yet of frontier models crossing into autonomous digital coworker territory.
  • Long-Context Retrieval Doubling: 74.0% vs. GPT-5.4's 36.6% at 512K–1M tokens.
  • Industry Context: Released in the same window as Anthropic's Mythos Preview (April 7, 2026), Meta's Muse Spark (April 8, 2026), Anthropic's Claude Opus 4.7 (April 16, 2026), and DeepSeek V4 (April 24, 2026), making April 2026 the most concentrated frontier release window of the year to date.

Last Updated

May 7, 2026