Claude 3.5 Sonnet

Summary

Claude 3.5 Sonnet is a frontier large language model from Anthropic, first released in June 2024, that delivered a significant jump in capability at production-friendly pricing.

Overview

Anthropic first released Claude 3.5 Sonnet in June 2024 as a surprise launch. It delivers Opus-level performance, rivaling or exceeding Claude 3 Opus (the company's most capable model at the time), while keeping Sonnet-tier pricing. An upgraded version followed in October 2024, adding computer use in public beta and markedly better coding performance. For much of its lifetime, Claude 3.5 Sonnet was a standard choice for developers and organizations seeking state-of-the-art reasoning, vision, and coding abilities without the expense of premium models.

The model stands out for its versatility. It handles complex reasoning tasks, its upgraded version outscored all publicly available models on SWE-bench Verified at the time of release, it was Anthropic's strongest vision model at launch, and it runs at twice the speed of Claude 3 Opus. With a 200K token context window and native support for tool use and vision, Claude 3.5 Sonnet supports agentic workflows ranging from document analysis to autonomous computer interaction.

Specifications

  • Developer: Anthropic
  • Release Date: June 2024 (initial release); October 2024 (upgraded version, often called v2, with computer use)
  • Type: Large Language Model (LLM), Multimodal (text + vision)
  • Parameters: Undisclosed (unofficial estimates of 175+ billion are unverified)
  • Context Window: 200,000 tokens (approximately 150,000 words or 300 pages)
  • Max Output: 8,192 tokens per response
  • Knowledge Cutoff: April 1, 2024
  • Access: API (Anthropic), Amazon Bedrock, Google Cloud Vertex AI, Claude.ai (free and paid), Claude iOS app
  • Pricing: $3 per million input tokens, $15 per million output tokens
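
The pricing and size limits listed above translate directly into per-request arithmetic. The following is a minimal sketch in Python, using only the figures from the specifications; the function names are illustrative, and the assumption that input and output tokens share the 200K window is stated in the code.

```python
# Figures taken from the specifications above.
INPUT_USD_PER_MTOK = 3.00      # dollars per million input tokens
OUTPUT_USD_PER_MTOK = 15.00    # dollars per million output tokens
CONTEXT_WINDOW_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 8_192


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost of one request in US dollars."""
    total = (input_tokens * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


def fits_in_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the output cap and the context window.

    Assumption: input and output tokens share the same 200K window.
    """
    return (output_tokens <= MAX_OUTPUT_TOKENS
            and input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS)


if __name__ == "__main__":
    # A 150,000-token document summarized into a 2,000-token answer.
    print(f"${estimate_cost_usd(150_000, 2_000):.2f}")  # $0.48
    print(fits_in_context(150_000, 2_000))              # True
```

Because output tokens cost five times as much as input tokens, long generated answers dominate the bill faster than long prompts do.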

Capabilities

Claude 3.5 Sonnet excels across a remarkably broad range of tasks:

Reasoning & Knowledge

The model achieves 93.1% on BIG-Bench-Hard and demonstrates strong performance on graduate-level reasoning benchmarks (GPQA) and undergraduate knowledge (MMLU), positioning it among the top reasoning models available.

Coding

This is where Claude 3.5 Sonnet demonstrates exceptional strength. It achieves 92.0% accuracy on HumanEval Python function tests (versus GPT-4o's 90.2%) and, most impressively, the upgraded version solves 49% of real-world coding tasks on SWE-bench Verified, surpassing OpenAI's o1 preview (45%) and well ahead of the original June 2024 release (33%). In internal agentic coding evaluations, it solved 64% of problems compared to Claude 3 Opus's 38%, roughly 1.7 times as many on complex, multi-step coding challenges.

Vision

Claude 3.5 Sonnet was Anthropic's strongest vision model at its release, surpassing Claude 3 Opus on standard vision benchmarks. It achieves 90.8% on Chart Q&A (versus GPT-4o's 85.7%) and excels at visual reasoning tasks including chart interpretation, graph analysis, and accurate text transcription from imperfect or low-quality images.

Computer Use

As of October 2024, Claude 3.5 Sonnet became the first frontier AI model to offer computer use in public beta: a feature that lets the model generate computer actions (keystrokes, mouse clicks, window navigation) to accomplish tasks directly on user interfaces. This enables agents that operate software through its graphical interface, much as a person would.
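
To make the idea of model-generated actions concrete, here is a minimal sketch in Python of how an application might dispatch such actions. Everything in it is an assumption made for illustration: the action schema, the `FakeDesktop` class, and `apply_action` are hypothetical and are not Anthropic's actual computer use interface. The stand-in desktop only records what it is asked to do, so the example runs without controlling a real machine.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop; it only records requested actions."""
    log: list = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click at ({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type {text!r}")

    def press_key(self, key: str) -> None:
        self.log.append(f"press {key}")


def apply_action(desktop: FakeDesktop, action: dict) -> None:
    """Dispatch one model-proposed action (hypothetical schema)."""
    kind = action.get("action")
    if kind == "click":
        x, y = action["coordinate"]
        desktop.click(x, y)
    elif kind == "type":
        desktop.type_text(action["text"])
    elif kind == "key":
        desktop.press_key(action["key"])
    else:
        # Refuse anything outside the allow-list instead of guessing.
        raise ValueError(f"unsupported action: {kind!r}")


if __name__ == "__main__":
    desktop = FakeDesktop()
    # In a real agent loop these steps would come from the model one at
    # a time; here they are hard-coded for illustration.
    steps = [
        {"action": "click", "coordinate": [640, 360]},
        {"action": "type", "text": "quarterly report"},
        {"action": "key", "key": "Return"},
    ]
    for step in steps:
        apply_action(desktop, step)
    print("\n".join(desktop.log))
```

Rejecting unknown action types, rather than ignoring them, is a deliberate choice in this sketch: an agent that drives a user interface should fail loudly when it receives something it was not built to execute.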

Tool Use & Function Calling

The model natively supports tool use (function calling): developers describe external functions, and the model can request calls to them, which lets it plug into larger workflows and automation pipelines.
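
The round trip behind tool use can be sketched without any network call. In the Python example below, the `get_weather` tool, its canned return value, and the helper names are made up for illustration. The field names (`input_schema`, `tool_use`, `tool_result`, `tool_use_id`) follow the general shape of Anthropic's Messages API as commonly documented, but they should be treated as an assumption and checked against the current API reference.

```python
import json

# Tool description in JSON Schema style. The tool is a made-up example.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Placeholder; a real tool would query a weather service."""
    return json.dumps({"city": city, "temperature_c": 21})


# Maps tool names the model may request to local Python functions.
TOOL_IMPLEMENTATIONS = {"get_weather": get_weather}


def handle_tool_request(request: dict) -> dict:
    """Run the requested tool and package its output for the next turn."""
    func = TOOL_IMPLEMENTATIONS[request["name"]]
    output = func(**request["input"])
    return {
        "type": "tool_result",
        "tool_use_id": request["id"],
        "content": output,
    }


if __name__ == "__main__":
    # Simulated request, shaped like what the model might emit.
    simulated = {
        "type": "tool_use",
        "id": "call_1",
        "name": "get_weather",
        "input": {"city": "Lisbon"},
    }
    print(handle_tool_request(simulated))
```

In a real integration, the tool description would be sent with the request, and the result dictionary would be returned to the model in a follow-up message so it can compose its final answer.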

Limitations

Despite its strengths, Claude 3.5 Sonnet has notable limitations:

Mathematical Reasoning

Pure math performance shows a clear weakness. The model scores 71.1% on the MATH benchmark compared to GPT-4o's 76.6%. Tasks requiring formal proofs, symbolic manipulation, or precise numerical reasoning reveal this gap—the model sometimes arrives at approximately correct answers but struggles with exact solutions.

Character-Level Tasks

The model exhibits difficulty with precise string handling and character-level manipulations, sometimes producing off-by-one errors or other subtle mistakes that break otherwise correct approaches.

Knowledge Cutoff

Training data stops at April 1, 2024. Questions about recent frameworks, regulatory changes, current events, or developments after this date often result in confident-sounding but inaccurate or hallucinated responses rather than honest uncertainty. Users must supplement with external information for tasks requiring up-to-date knowledge.

Hallucinations

While generally reliable, the model occasionally generates plausible-sounding but false information, particularly when asked about topics outside its training data or when handling ambiguous queries. The model tends to answer quickly at the cost of verification, sometimes skipping intermediate reasoning steps.

Message Rate Limits

Some users report frustration with message rate limiting on Claude.ai Pro accounts, particularly an 8-hour rolling window limit that constrains heavy usage during intensive work sessions.

Task Complexity Inconsistencies

Performance can vary unexpectedly based on task complexity. Straightforward queries typically work well, but more complex multi-step requests sometimes encounter reasoning gaps or incomplete solutions.

Recent Developments

October 2024: Computer Use Introduction

Anthropic announced Claude 3.5 Sonnet v2 with computer use capabilities enabled in public beta. This experimental feature allows the model to directly control computers—clicking, typing, and navigating interfaces—marking a significant milestone for autonomous agent development. The upgrade included additional general improvements across reasoning and coding tasks.

October 2024: Coding Improvements

The October 2024 upgrade of Claude 3.5 Sonnet brought across-the-board improvements, particularly in coding performance. It raised the SWE-bench Verified score from 33.4% to 49.0%, a gain of 15.6 percentage points or about 47% in relative terms, which was the highest score among publicly available models at the time. This update strengthened the model's position as a first choice for production coding tasks.
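
The relative gain can be checked directly from the two SWE-bench Verified scores quoted above: divide the absolute gain by the starting score. A small Python check follows, with the internal agentic coding figures from the Coding section (38% and 64%) included for comparison; the function name is illustrative.

```python
def relative_improvement(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100


if __name__ == "__main__":
    # SWE-bench Verified: 33.4% before the upgrade, 49.0% after.
    print(f"{relative_improvement(33.4, 49.0):.1f}%")  # 46.7%
    # Internal agentic coding: Claude 3 Opus 38%, Claude 3.5 Sonnet 64%.
    print(f"{64 / 38:.2f}x")                           # 1.68x
```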

February 2026: Model Succession

Anthropic released Claude Sonnet 4.6 (priced identically to Sonnet 4.5 at $3/$15 per million tokens) on February 17, 2026. As of this release, Claude 3.5 Sonnet models have been retired, with all requests to these models returning an error. Developers are expected to migrate to Sonnet 4.6 or other available Claude models.

Platform Expansion

Claude 3.5 Sonnet became available across multiple major cloud platforms:

  • Google Cloud Vertex AI: Includes the upgraded version with computer use
  • Amazon Bedrock: Full access to Claude 3.5 Sonnet via AWS infrastructure
  • Claude.ai: Free tier access to Claude 3.5 Sonnet; higher rate limits for Pro and Team subscribers
  • Claude iOS App: Mobile access to the full model

Artifacts Feature

Anthropic introduced "Artifacts" on Claude.ai, allowing Claude to generate and display code, text, website designs, and other content in a dedicated side panel alongside conversations, so that generated work can be viewed and iterated on without leaving the chat interface.

Last Updated

February 26, 2026