Anthropic · Closed-API Model
Claude Opus 4.7: Adaptive Thinking, Tested
Claude Opus 4.7 (April 16, 2026) is Anthropic's flagship reasoning model and the first with Adaptive Thinking — a mode that auto-tunes thinking compute per request based on task difficulty. It scores 87.6% on the SWE-Bench Verified Adaptive harness, offers a 200K context window, and costs $15/$75 per million tokens. This review covers when Opus 4.7 is worth the 5× premium over Sonnet 5, how Adaptive Thinking actually works, and which open-weight models come closest if you need local deep reasoning.
Note: Opus 4.7 is API-only — it cannot be downloaded or run locally. For deep-reasoning models you can self-host, see DeepSeek V4-Pro, Kimi K2.6, and GLM-5.
Key takeaways
- → Adaptive Thinking — model auto-tunes reasoning compute per request; no manual tier selection.
- → 87.6% SWE-Bench Verified Adaptive — strong, but Sonnet 5 still leads at 92.4% on the same harness.
- → $15/$75 per Mtok — 5× more than Sonnet 5; reserve for the hardest 10% of tasks.
- → 200K context — same as Sonnet 5; works with prompt caching.
- → Best for: research-grade analysis, novel algorithm design, complex multi-step reasoning.
Quick verdict
Use Claude Sonnet 5 for ~90% of coding work — it's 5× cheaper, faster, and actually scores higher on the SWE-Bench Verified Adaptive harness. Reach for Opus 4.7 only when the task visibly defeats Sonnet 5: complex multi-file refactors, novel algorithm design, deep research analysis, or anywhere Adaptive Thinking earns its keep.
For privacy-required deep-reasoning workloads where you need local hosting, DeepSeek V4-Pro (8× H100) or Kimi K2.6 (also 8× H100) are the realistic alternatives.
Specs at a glance
| Spec | Detail |
|---|---|
| Vendor | Anthropic |
| Release date | April 16, 2026 |
| Model ID | claude-opus-4-7-20260416 |
| Architecture | Dense transformer with Adaptive Thinking |
| Context window | 200,000 tokens |
| Max output | 64,000 tokens (excluding thinking trace) |
| Modalities | Text · Code · Vision |
| License | Proprietary |
| Local self-hostable? | No |
Adaptive Thinking explained
Most reasoning models (OpenAI o-series, Gemini 3.1 Pro thinking tiers) make you choose how much thinking to apply per request. Opus 4.7 changes this — the model decides for itself based on task difficulty.
Mechanism: a lightweight pre-pass classifies the input difficulty, then the model spends correspondingly more compute thinking before answering. Easy questions (e.g., “what's 2+2”) get fast responses with no thinking. Medium complexity gets a brief thinking pass. Hard problems trigger extended reasoning over seconds to minutes.
Why it matters: removes the cognitive overhead of picking a thinking tier. For dev tools and agents, Adaptive Thinking means you set the model once and it auto-routes per-task complexity. Cost implication: output tokens include the thinking trace, so a hard problem can multiply effective cost by 2-4× — but you only pay it when needed.
Compare to Gemini 3.1 Pro's explicit Tier 1/2/3 system or GPT-5.5's Instant/Standard/Pro split. Opus 4.7's approach is more elegant but less predictable for cost forecasting.
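The control flow described above can be sketched as a toy router. This is an illustrative simulation, not Anthropic's actual classifier: the `estimate_difficulty` heuristic, the tier names, and the token budgets are all invented for the example — the real pre-pass is internal to the model.

```python
from enum import Enum

class Difficulty(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"

# Hypothetical thinking-token budgets per tier (numbers invented for illustration).
THINKING_BUDGET = {
    Difficulty.EASY: 0,        # answer directly, no thinking trace
    Difficulty.MEDIUM: 2_000,  # brief thinking pass
    Difficulty.HARD: 32_000,   # extended reasoning, seconds to minutes
}

def estimate_difficulty(prompt: str) -> Difficulty:
    """Stand-in for the lightweight pre-pass: a crude keyword/length heuristic."""
    hard_markers = ("prove", "refactor", "design an algorithm", "multi-step")
    if any(m in prompt.lower() for m in hard_markers):
        return Difficulty.HARD
    if len(prompt.split()) > 50:
        return Difficulty.MEDIUM
    return Difficulty.EASY

def thinking_budget(prompt: str) -> int:
    """Auto-tuned compute: the caller never picks a tier."""
    return THINKING_BUDGET[estimate_difficulty(prompt)]

print(thinking_budget("what's 2+2"))                 # trivial question, no thinking
print(thinking_budget("design an algorithm for X"))  # hard marker, extended budget
```

The point of the sketch is the interface: callers pass only the prompt, and the budget falls out of the classification — which is exactly what removes the tier-selection overhead the paragraph describes.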
Benchmarks
| Benchmark | Opus 4.7 | Sonnet 5 | GPT-5.5 Pro | Gemini 3.1 Pro |
|---|---|---|---|---|
| SWE-Bench Verified Adaptive | 87.6% | 92.4% | 88.4% | 87.9% |
| GPQA Diamond (PhD science) | 87.3% | 85.7% | 87.6% | 88.2% |
| AIME 2025 (math) | 92.8% | 91.5% | 96.4% | 94.0% |
| ARC-AGI-2 (reasoning) | 71.8% | 68.4% | 73.5% | 77.1% |
| MMLU-Pro | 89.4% | 87.9% | 90.6% | 89.4% |
Pricing & access
API
- Input: $15.00 per 1M tokens
- Output (incl. thinking): $75.00 per 1M tokens
- Cached input: $1.50 per 1M (90% off)
- Batch: 50% off (24h SLA)
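A rough cost model for the rates above, since the thinking trace bills at the output rate. The token counts are invented for illustration; the formula just applies the listed prices and shows how a hard request's thinking trace lands in the 2–4× range mentioned earlier.

```python
# Published rates, dollars per million tokens.
INPUT_PER_MTOK = 15.00
OUTPUT_PER_MTOK = 75.00
CACHED_INPUT_PER_MTOK = 1.50  # 90% off input

def request_cost(input_toks, output_toks, thinking_toks=0, cached_input_toks=0):
    """Thinking tokens bill at the output rate; cached input at the discounted rate."""
    uncached = input_toks - cached_input_toks
    return (uncached * INPUT_PER_MTOK
            + cached_input_toks * CACHED_INPUT_PER_MTOK
            + (output_toks + thinking_toks) * OUTPUT_PER_MTOK) / 1_000_000

# Easy request: Adaptive Thinking emits no trace.
easy = request_cost(input_toks=2_000, output_toks=5_000)
# Hard request: same visible output, plus a 10K-token thinking trace.
hard = request_cost(input_toks=2_000, output_toks=5_000, thinking_toks=10_000)
print(f"easy ${easy:.3f}  hard ${hard:.3f}  ratio {hard / easy:.1f}x")
```

With these example counts the hard request costs roughly 2.9× the easy one — paid only when the model decides the task warrants it.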
Subscription
- Claude Pro: $20/mo — limited Opus 4.7
- Claude Max: $100-200/mo — much higher Opus quota
- Bedrock / Vertex: Same per-token pricing
When to pick Opus 4.7
- ✓Hardest 5-10% of tasks where Sonnet 5 visibly struggles.
- ✓Research-grade analysis, novel algorithm design, deep multi-step reasoning.
- ✓Workloads where Adaptive Thinking's auto-routing eliminates manual tier selection.
- ✓Anthropic-stack teams already using Sonnet 5; Opus 4.7 is the natural escalation.
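The escalation pattern implied by this list — Sonnet 5 by default, Opus 4.7 only when the cheaper model's answer fails — can be sketched model-agnostically. The callables and the acceptance check here are placeholder stubs; in practice they would wrap real API calls and your own validation (tests passing, schema conformance, etc.).

```python
from typing import Callable

def solve(task: str,
          default_model: Callable[[str], str],
          escalation_model: Callable[[str], str],
          is_acceptable: Callable[[str], bool]) -> tuple[str, str]:
    """Try the cheap model first; escalate only when its answer fails validation."""
    answer = default_model(task)
    if is_acceptable(answer):
        return answer, "sonnet-5"
    return escalation_model(task), "opus-4.7"

# Stub models for demonstration; real callables would hit the Messages API.
sonnet = lambda t: "" if "hard" in t else f"sonnet answer to {t!r}"
opus = lambda t: f"opus answer to {t!r}"
non_empty = lambda a: bool(a.strip())

print(solve("rename a variable", sonnet, opus, non_empty)[1])         # stays on default
print(solve("hard multi-file refactor", sonnet, opus, non_empty)[1])  # escalates
```

The design choice worth noting: escalation is driven by a validation function, not by guessing difficulty up front, so you pay the 5× premium only on the requests that actually defeated the default model.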
FAQ
What is Claude Opus 4.7?
Opus 4.7 vs Sonnet 5: when do I use each?
What is Adaptive Thinking?
How much does Claude Opus 4.7 cost?
Can I run Claude Opus 4.7 locally?
Opus 4.7 vs GPT-5.5 Pro: which is better for hard reasoning?
Related models
- → Claude Sonnet 5 — your default for ~90% of coding work
- → Claude Opus 4.1 — predecessor; lower benchmark scores
- → GPT-5.5 Pro — closest closed alternative; tied on most reasoning
- → Gemini 3.1 Pro — explicit thinking tiers + 1M context
- → DeepSeek V4-Pro — open-weight deep-reasoning alternative
- → Best AI models May 2026 — pillar comparison