Anthropic · Closed-API Model

Claude Sonnet 5 “Fennec” Review: 92.4% SWE-Bench Verified, Tested

Anthropic's Claude Sonnet 5 (codenamed “Fennec”) shipped April 1, 2026 and immediately took the #1 spot on SWE-Bench Verified at 92.4% — the highest score for any production model. This review covers the real benchmarks, API pricing ($3/$15 per million tokens), IDE integration in Cursor and Claude Code, and the open-weight alternatives that come closest if you need to self-host.

📅 Published: May 9, 2026🔄 Last Updated: May 9, 2026✓ Manually Reviewed

Note: Claude Sonnet 5 is API-only — it cannot be downloaded or run locally. For coding models you can self-host, see Qwen3-Coder-Next, DeepSeek V4, and Mistral Medium 3.5.

Key takeaways

→92.4% SWE-Bench Verified — current #1 on real GitHub bug-fixing tasks.
→200K context — fits most repositories; prompt caching makes large-context use 90% cheaper.
→$3 / $15 per Mtok — cheaper than GPT-5.5 ($5/$30), more expensive than Gemini 3.1 Pro ($2/$12).
→Default in Cursor + Claude Code — most engineering teams already use it as their primary coding model.
→API-only — for self-hosting, Qwen3-Coder-Next gets you ~75% of Sonnet 5's capability at zero per-token cost.

Quick verdict

Claude Sonnet 5 is the best coding model available right now. If you ship code for a living and you can use an API model, this is the default. It beats every other production model on the benchmarks engineering teams actually care about — SWE-Bench Verified, LiveCodeBench, Aider polyglot, instruction-following on tool use.

Where it loses: huge context (Gemini 3.1 Pro's 1M wins on whole-monorepo work), raw math benchmarks (GPT-5.5 leads AIME), and self-hosting (you can't — the model is API-only). For local-first deployment, Qwen3-Coder-Next is the closest open-weight match.

Specs at a glance

Vendor	Anthropic
Codename	Fennec
Release date	April 1, 2026
Model ID	`claude-sonnet-5-20260401`
Architecture	Dense transformer (parameters not disclosed)
Context window	200,000 tokens
Max output	64,000 tokens
Modalities	Text · Code · Vision
License	Proprietary (Anthropic Terms of Service)
Local self-hostable?	No
Knowledge cutoff	February 2026

Coding benchmarks vs the competition

All scores are vendor-published, cross-checked against third-party leaderboards (Artificial Analysis, BenchLM, SWE-Bench public leaderboard) where available.

Benchmark	Claude Sonnet 5	Claude Opus 4.7	GPT-5.5	Gemini 3.1 Pro	Qwen3-Coder-Next
SWE-Bench Verified	92.4%	87.6%	85.1%	87.9%	70.6%
LiveCodeBench	79.8%	77.2%	76.3%	75.6%	68.4%
Aider polyglot	87.1%	85.4%	81.4%	82.7%	71.2%
HumanEval	95.8%	94.6%	94.2%	93.7%	88.4%
MMLU-Pro (general knowledge)	87.9%	89.4%	90.1%	89.4%	81.7%
GPQA Diamond (PhD science)	85.7%	87.3%	86.0%	88.2%	76.9%

Sources: Anthropic Claude Sonnet 5 announcement (April 2026), SWE-Bench Verified public leaderboard, Aider benchmarks, Artificial Analysis. Scores reflect agent-harness configurations where applicable.

Pricing & access

API (Anthropic, Bedrock, Vertex)

Input: $3.00 per 1M tokens
Output: $15.00 per 1M tokens
Cached input: $0.30 per 1M tokens (90% off)
Batch processing: 50% off (24h SLA)
Tool use: No surcharge

Subscription tiers

Claude Free: Limited Sonnet 5 access via claude.ai
Claude Pro: $20/mo — 5× the free limit
Claude Max: $100-200/mo — 5-20× Pro limits
Claude Team: $30/user/mo — collaboration features
Claude Enterprise: Custom — SSO, audit log, custom limits

For comparison: a 4-hour-per-day developer using Sonnet 5 through Cursor or Claude Code typically pays $30-150/month in API spend. A self-hosted Qwen3-Coder-Next on a $4K rig pays for itself in about 18 months with similar (though not equal) coding quality.

IDE integration

Claude Sonnet 5 is supported by every major AI coding tool. The most common production setups:

Cursor

Default model in Cursor 2.x. Selectable as “sonnet-5” in the model picker. Bring your own API key for unlimited use, or use Cursor's included credits ($20/mo Pro plan).

Claude Code (CLI)

Anthropic's official CLI uses Sonnet 5 by default. Best for agentic, terminal-driven workflows. Direct API billing.

Continue.dev / Aider / Cline

Open-source coding assistants — set Sonnet 5 as the default in config. Aider in particular benchmarks highest with Sonnet 5.

GitHub Copilot Pro+

Sonnet 5 available as a premium model selector ($39/user/mo). Slower rollout than Cursor, more enterprise-friendly billing.

Open-weight alternatives you can run locally

None of these match Sonnet 5's coding quality directly, but they get close enough that most teams use them as the default and reach for Sonnet 5 only on the hardest 10-20% of tasks. All four are fully self-hostable.

Model	SWE-Bench Verified	License	Hardware floor
Qwen3-Coder-Next	70.6%	Apache 2.0	2× RTX 5090 (consumer)
DeepSeek V4-Pro	82.6%	MIT	8× H100
Mistral Medium 3.5	77.6%	Modified MIT	4× H100
GLM-5	77.8%	MIT	4× H100
Qwen3.6-27B	68.9%	Apache 2.0	1× RTX 5090 / 1× H100

When to pick Claude Sonnet 5

✓You ship code for a living and need the highest-quality model available.
✓Your workflow is in Cursor, Claude Code, or Aider — where Sonnet 5 is the default and most-tested.
✓Your codebase fits comfortably in 200K tokens (most do).
✓You can absorb $30-150/month in API spend per developer.

When to use a local model instead

→Your code or data cannot leave your network (regulated industry, government, defense).
→You're hitting Sonnet 5 hundreds of times per hour for routine completions where 70-80% quality is fine.
→You want predictable monthly costs (one-time hardware vs ongoing API).
→You're building a coding-AI product and need the model to be your moat.

Frequently asked questions

Can I run Claude Sonnet 5 locally?

No. Claude Sonnet 5 is a proprietary Anthropic model accessible only through the Anthropic API, AWS Bedrock, Google Cloud Vertex AI, and the Claude.ai web/desktop apps. It cannot be downloaded or self-hosted. For coding-focused models you can run on your own hardware, the closest open-weight alternatives are Qwen3-Coder-Next (~70% SWE-Bench Verified, 80B/3B active MoE), DeepSeek V4-Pro (82.6% SWE-Bench), and Mistral Medium 3.5 (77.6% SWE-Bench).

How much does Claude Sonnet 5 cost?

Claude Sonnet 5 API pricing is $3 per million input tokens and $15 per million output tokens. Cached input drops to $0.30 per million (90% savings). The Claude Pro consumer subscription is $20/month with usage limits; Claude Max is $100-200/month for power users. Anthropic also offers prompt caching, batch processing (50% off), and tool-use without separate cost. For typical daily coding use through Cursor or Claude Code, expect $30-150/month in API spend.

What does the 92.4% SWE-Bench Verified score mean?

SWE-Bench Verified contains 500 real GitHub issues from popular Python repositories (Django, Flask, scikit-learn, Sympy, etc.). A model is judged correct only when its patch makes the failing tests pass without breaking existing ones. Claude Sonnet 5's 92.4% means it correctly resolves 462 out of 500 production-grade bugs without human help — the highest verified score for any model as of May 2026. For comparison: Gemini 3.1 Pro 87.9%, GPT-5.5 85.1%, DeepSeek V4-Pro 82.6%.

Claude Sonnet 5 vs Opus 4.7: which should I use?

Sonnet 5 ($3/$15) is the better choice for almost all coding work — it scores higher on SWE-Bench Verified (92.4%) than Opus 4.7 (87.6%), runs faster, and costs less. Use Opus 4.7 when you need maximum reasoning depth: complex refactors across many files, architectural decisions, or research-grade analysis. Opus 4.7 is also the only Claude with the new "Adaptive Thinking" mode that auto-tunes reasoning depth per request. For 95% of engineering work, Sonnet 5 is the default; reach for Opus 4.7 only when Sonnet 5 visibly struggles.

Claude Sonnet 5 vs GPT-5.5: which is better for coding?

Claude Sonnet 5 is currently the better coding model on every coding benchmark we track. SWE-Bench Verified: Sonnet 5 92.4% vs GPT-5.5 85.1%. LiveCodeBench: Sonnet 5 79.8% vs GPT-5.5 76.3%. Aider polyglot: Sonnet 5 87.1% vs GPT-5.5 81.4%. GPT-5.5 leads on raw speed and ChatGPT ecosystem features (custom GPTs, plugins, function-calling maturity). For pure code quality, Sonnet 5 wins. Most production teams use both: Sonnet 5 in Cursor/Claude Code for primary work, GPT-5.5 for cases where ChatGPT-specific tools matter.

How does Claude Sonnet 5's context window compare?

Claude Sonnet 5 has a 200,000-token context window — large enough for most repository work but smaller than Gemini 3.1 Pro (1M tokens). For repos beyond 200K tokens, you can use Anthropic's prompt caching (90% cheaper for repeated context) or break the work into chunks. Output capacity is up to 64,000 tokens, sufficient for full file rewrites. Most engineering workflows fit comfortably inside 200K — only whole-monorepo analysis or hour-long video transcripts need more.

Which IDEs and tools support Claude Sonnet 5?

Claude Sonnet 5 is the default model in Cursor (sonnet-5 selectable from the model picker), Claude Code (Anthropic's official CLI), Continue.dev, Aider, Cline, GitHub Copilot Pro+ (premium tier), Zed, and Windsurf. Anthropic also publishes the official Python SDK, TypeScript SDK, and direct REST API. Most teams use Cursor or Claude Code for interactive work and the API for production agents. The model identifier on the API is `claude-sonnet-5-20260401`.

When is an open-weight model good enough vs Sonnet 5?

For routine code completion and refactors, Qwen3-Coder-Next or DeepSeek V4-Pro come within 10-12% of Sonnet 5's SWE-Bench score — and you can run them locally with zero per-token cost. Sonnet 5 wins decisively on hardest problems: novel algorithm design, multi-file refactors, ambiguous specs, or production-grade testing where small errors compound. Most teams get 80-90% of Sonnet 5's value from a local Qwen3-Coder-Next deployment for the routine work, then reach for Sonnet 5 via API for the hard 10-20%. This blend cuts costs 60-80%.

Want to build a hybrid Sonnet 5 + local-AI setup?

Local AI Master's AI Engineering and Local Deployment courses cover hybrid setups — local model for routine work, Sonnet 5 for hard tasks. Real production code, full GitHub repos, no vendor lock-in.

See the AI Engineering course →

Related models

→ Claude Opus 4.7 — Anthropic's deepest reasoner, Adaptive Thinking mode
→ Gemini 3.1 Pro — 1M context + thinking tiers, best for whole-monorepo work
→ GPT-5.5 — current ChatGPT default, strongest math
→ Qwen3-Coder-Next — best self-hostable coding model
→ DeepSeek V4 — open-weight frontier, MIT licensed
→ Best AI models May 2026: complete comparison