
OpenAI · Closed-API Model

GPT-5.5 Review: ChatGPT's New Default, Tested

OpenAI shipped GPT-5.5 between April 23 and May 5, 2026 across three tiers — Instant, Standard, and Pro — and made it the default model for all ChatGPT users. This review covers what changed from GPT-5, real benchmark performance (95.2% AIME math, 85.1% SWE-Bench), the new 400K context window, pricing across all three tiers, and which open-weight alternatives come closest if you need to self-host.

📅 Published: May 9, 2026 · 🔄 Last Updated: May 9, 2026 · ✓ Manually Reviewed

Note: GPT-5.5 is API-only — it cannot be downloaded or run locally. For self-hostable models with comparable performance, see DeepSeek V4, GLM-5, and Qwen3-Coder-Next.

Key takeaways

  • 3 tiers: Instant ($1.50/$6), Standard ($5/$30), Pro ($15/$60) per million tokens.
  • 400K context — more than 3× GPT-5's 128K and 2× Claude Sonnet 5's 200K.
  • 95.2% AIME 2025 — top math benchmark, beats Sonnet 5 (91.5%) and Gemini 3.1 (94.0%).
  • 85.1% SWE-Bench Verified — solid coding but well behind Claude Sonnet 5 (92.4%).
  • API-only — for self-hosting, DeepSeek V4-Pro is the closest frontier open-weight match.

Quick verdict

GPT-5.5 is the default ChatGPT model and the best general-purpose closed model for math-heavy and ChatGPT-ecosystem workloads. If you live in ChatGPT, use custom GPTs, or rely on OpenAI's function-calling and tools APIs, this is the model you should use.

Where it loses: coding (Claude Sonnet 5 is 7+ points ahead on SWE-Bench), context (Gemini 3.1 Pro's 1M wins for whole-monorepo analysis), and pricing ($5/$30 is the most expensive of the three frontier closed models). For local-first deployment, DeepSeek V4-Pro gets you most of GPT-5.5's capabilities at zero per-token cost.

Specs at a glance

Vendor: OpenAI
Release date: April 23, 2026 (Standard) · May 5, 2026 (Instant)
Tiers: Instant · Standard · Pro
Context window: 400,000 tokens
Max output: 32,000 tokens
Modalities: Text · Code · Image · Audio (via Realtime API)
License: Proprietary (OpenAI Terms of Service)
Local self-hostable? No
API model IDs: gpt-5.5 · gpt-5.5-instant · gpt-5.5-pro
Knowledge cutoff: March 2026

Three tiers explained

GPT-5.5 ships in three tiers. Picking the right one for each request controls cost and latency.

GPT-5.5 Instant — $1.50 / $6 per Mtok

Sub-200ms latency. Fewer activated parameters, same knowledge cutoff. Use for chat, autocomplete, classification, and simple Q&A. Quality is ~80-85% of Standard.

GPT-5.5 Standard — $5 / $30 per Mtok

The default ChatGPT model. Best balance of speed (~1-3 sec first token) and quality. Use for coding, content, analysis, function calling. Most production work belongs here.

GPT-5.5 Pro (Extended Thinking) — $15 / $60 per Mtok

Extended reasoning that runs from 30 seconds to several minutes per response. Use it for the hardest math, novel algorithms, ambiguous specs, and research-grade analysis. Available via ChatGPT Pro ($200/mo) and the API at higher rate limits.
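Because all three tiers sit behind one API, per-request routing can be as simple as a model-ID lookup. A minimal sketch, using the model IDs from the specs section; the task categories and routing heuristics are illustrative assumptions, not an OpenAI recommendation:

```python
# Illustrative tier router: pick a GPT-5.5 model ID per request.
# The model IDs come from the specs section; the task buckets are
# assumptions for this sketch.

def pick_tier(task: str, latency_sensitive: bool = False) -> str:
    hard_tasks = {"competition-math", "algorithm-design", "research"}
    if task in hard_tasks:
        return "gpt-5.5-pro"       # extended thinking, $15/$60 per Mtok
    if latency_sensitive or task in {"autocomplete", "classification"}:
        return "gpt-5.5-instant"   # sub-200ms, $1.50/$6 per Mtok
    return "gpt-5.5"               # Standard, $5/$30 per Mtok

print(pick_tier("autocomplete"))      # gpt-5.5-instant
print(pick_tier("competition-math"))  # gpt-5.5-pro
print(pick_tier("coding"))            # gpt-5.5
```

Routing by task class like this is what makes the three-tier pricing useful in practice: only the requests that need extended thinking pay Pro rates.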

Benchmarks vs the competition

| Benchmark | GPT-5.5 | GPT-5 | Claude Sonnet 5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| AIME 2025 (math) | 95.2% | 87.4% | 91.5% | 94.0% |
| MMLU-Pro (knowledge) | 90.1% | 86.2% | 87.9% | 89.4% |
| SWE-Bench Verified | 85.1% | 74.9% | 92.4% | 87.9% |
| ARC-AGI-2 (reasoning) | 71.3% | 58.4% | 68.4% | 77.1% |
| GPQA Diamond | 86.0% | 83.1% | 85.7% | 88.2% |
| HumanEval | 94.2% | 92.6% | 95.8% | 93.7% |

Sources: OpenAI GPT-5.5 release notes (April-May 2026), Anthropic Sonnet 5 announcement, Google Gemini 3.1 Pro model card, Artificial Analysis leaderboard.

Pricing & access

API pricing

  • Instant: $1.50 / $6.00 per 1M tokens
  • Standard: $5.00 / $30.00 per 1M tokens
  • Pro: $15.00 / $60.00 per 1M tokens
  • Cached input: 50% off
  • Batch API: 50% off (24h SLA)
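These rates make per-request cost easy to estimate. A sketch that applies the listed per-Mtok prices, treating the cached-input and Batch API discounts as flat 50% multipliers (actual billing may differ in detail):

```python
# Estimate a GPT-5.5 API request's cost from the published per-Mtok rates.
# Rates are from the pricing list above; the 50% cached-input and 50%
# batch discounts are applied as simple multipliers for this sketch.

PRICES = {  # tier: (input $/Mtok, output $/Mtok)
    "instant":  (1.50, 6.00),
    "standard": (5.00, 30.00),
    "pro":      (15.00, 60.00),
}

def request_cost(tier, in_tok, out_tok, cached_in_tok=0, batch=False):
    in_rate, out_rate = PRICES[tier]
    cost = ((in_tok - cached_in_tok) * in_rate    # uncached input
            + cached_in_tok * in_rate * 0.5       # cached input at 50% off
            + out_tok * out_rate) / 1_000_000
    return cost * 0.5 if batch else cost          # Batch API at 50% off

# 50K-token prompt (30K of it cached) with a 2K-token answer on Standard:
print(round(request_cost("standard", 50_000, 2_000, cached_in_tok=30_000), 4))  # 0.235
```

At these numbers a large cached prompt costs about $0.24 per call on Standard, which is why the cached-input discount matters so much for long-context workloads.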

Subscription tiers

  • ChatGPT Free: Limited GPT-5.5 Instant
  • ChatGPT Plus: $20/mo — Standard with daily caps
  • ChatGPT Pro: $200/mo — unlimited Standard + Pro
  • Team: $30/user/mo — collaboration
  • Enterprise: Custom — SSO, audit, custom limits

Heavy GPT-5.5 API users typically pay $50-400/month. For comparison, a self-hosted DeepSeek V4 on a $10-15K multi-GPU rig pays for itself in 2-3 years and gives you unlimited inference + full data privacy.
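The payback claim above is straightforward arithmetic: months until a one-time hardware spend equals the cumulative API bill. The dollar figures are the article's ranges, not measurements:

```python
# Break-even check for the self-hosting comparison: how many months of a
# recurring API bill equal a one-time hardware purchase. Figures are the
# article's illustrative ranges.

def breakeven_months(hardware_cost: float, monthly_bill: float) -> float:
    return hardware_cost / monthly_bill

# A $12K rig vs. a heavy $400/mo API bill:
print(breakeven_months(12_000, 400))  # 30.0 months, i.e. 2.5 years
```

At the low end ($50/mo), the same rig takes 20 years to pay off, so the self-hosting case really only holds for heavy API users.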

When to pick GPT-5.5 vs alternatives

| Workload | Best pick | Why |
|---|---|---|
| Math / scientific computation | GPT-5.5 Pro | 95.2% AIME 2025; extended thinking handles the hardest problems. |
| Production coding | Claude Sonnet 5 | 7+ points ahead on SWE-Bench Verified. |
| Whole-codebase analysis | Gemini 3.1 Pro | 1M context vs GPT-5.5's 400K. |
| ChatGPT ecosystem (custom GPTs, plugins) | GPT-5.5 Standard | Native integration; nothing else comes close. |
| High-volume routine tasks | GPT-5.5 Instant | $1.50/$6 per Mtok at 80-85% of Standard quality. |
| Privacy-required workloads | DeepSeek V4 | Self-hostable, MIT licensed, 1M context. |

Open-weight alternatives

GPT-5.5 is API-only. If you need a frontier-class model that you can self-host — for privacy, cost, or offline operation — these come closest:

| Model | License | Active params | Strength |
|---|---|---|---|
| DeepSeek V4-Pro | MIT | 49B (1.6T total) | Closest general-purpose match |
| GLM-5 | MIT | 44B (745B total) | 77.8% SWE-Bench, smaller hardware footprint |
| Kimi K2.6 | Modified MIT | 32B (1T total) | Ties GPT-5.5 on coding |
| Mistral Medium 3.5 | Modified MIT | 128B dense | Runs on 4 GPUs, near-frontier |

When to pick GPT-5.5

  • You live in ChatGPT, use custom GPTs, or rely on OpenAI plugins/function-calling.
  • Math-heavy workloads (95.2% AIME is the highest production score).
  • You need three tiers (Instant / Standard / Pro) for cost-sensitive routing.
  • You're on Azure and need first-class OpenAI integration.

When to pick something else

  • Coding-heavy work → Claude Sonnet 5 (92.4% vs 85.1% on SWE-Bench).
  • Whole-monorepo analysis → Gemini 3.1 Pro (1M context).
  • Privacy-required → DeepSeek V4 or other self-hosted open weight.
  • Predictable monthly costs → any local-first deployment.

Frequently asked questions

Can I run GPT-5.5 locally?
No. GPT-5.5 is a proprietary OpenAI model accessible only through the OpenAI API, ChatGPT (web/desktop/mobile), Microsoft Azure OpenAI Service, and integrations like Cursor or GitHub Copilot. It cannot be downloaded or self-hosted. For frontier-class models you can run on your own hardware, the closest open-weight alternatives are DeepSeek V4-Pro (MIT licensed, 1M context), GLM-5 (745B/44B active MoE, MIT), and Qwen3.5-Plus (397B/17B active).
How much does GPT-5.5 cost?
GPT-5.5 API pricing varies by tier: GPT-5.5 standard is $5 input / $30 output per 1M tokens; GPT-5.5 Pro (extended thinking) is $15 / $60; GPT-5.5 Instant (faster, lower quality) is $1.50 / $6. ChatGPT Plus is $20/month with usage caps; ChatGPT Pro is $200/month for unlimited GPT-5.5 + Pro reasoning + early features. API users typically pay $50-400/month. For comparison, Claude Sonnet 5 is $3/$15 (cheaper) and Gemini 3.1 Pro is $2/$12 (cheapest of the three frontier closed models).
GPT-5.5 vs GPT-5: what changed?
GPT-5.5 (April-May 2026) brings three improvements over GPT-5 (August 2025). 1) Larger context: 400K tokens vs GPT-5's 128K. 2) Better math/reasoning: AIME 2025 score jumps from 87.4% to 95.2%. 3) Three deployment tiers (Instant/Standard/Pro) so you can trade speed for quality per request. Coding improved modestly (74.9% → 85.1% SWE-Bench Verified). Pricing increased — GPT-5 was $4/$16 per Mtok, GPT-5.5 standard is $5/$30. For most existing GPT-5 users, the upgrade is automatic via ChatGPT; API users need to update their model identifier to gpt-5.5-2026-04-23.
GPT-5.5 vs Claude Sonnet 5 vs Gemini 3.1 Pro: which is best?
Different leaders for different jobs. GPT-5.5 leads on math (95.2% AIME) and ChatGPT ecosystem (custom GPTs, plugins, function-calling maturity). Claude Sonnet 5 leads on coding (92.4% SWE-Bench Verified vs GPT-5.5's 85.1%) and instruction following. Gemini 3.1 Pro leads on context window (1M vs 400K vs 200K) and ARC-AGI-2 reasoning (77.1% vs 71.3% vs 68.4%). Pricing: Gemini 3.1 cheapest, Sonnet 5 mid, GPT-5.5 most expensive. Most production teams pick by workflow: ChatGPT users → GPT-5.5, Cursor/Claude Code users → Sonnet 5, Google Cloud users → Gemini 3.1 Pro.
What is GPT-5.5 Pro and when should I use it?
GPT-5.5 Pro is the extended-thinking variant — it spends more compute thinking through hard problems before responding. Use it for: complex math (AIME-level competition problems), novel algorithm design, multi-step research questions, and ambiguous specifications where standard GPT-5.5 makes mistakes. It costs 3× standard GPT-5.5 ($15/$60 vs $5/$30) and takes 30 seconds to several minutes per response. For routine code and chat, GPT-5.5 standard is faster and 90% as good. Available only through ChatGPT Pro ($200/mo), API at higher rate limits, and via the o-series successor pipeline.
What is GPT-5.5 Instant?
GPT-5.5 Instant is the fastest, cheapest tier — optimized for sub-200ms latency at $1.50/$6 per Mtok. It activates fewer parameters (similar to o4-mini-instant) but inherits GPT-5.5's knowledge cutoff and tool-use behavior. Use it for: chat assistants, autocomplete, classification, simple Q&A, and any latency-sensitive UI. Quality is roughly 80-85% of standard GPT-5.5 on most benchmarks. Available via the API as gpt-5.5-instant and selectable in the ChatGPT model picker for free-tier users.
How big is GPT-5.5's context window?
400,000 tokens — a bit over 3× GPT-5's 128K and 2× Claude Sonnet 5's 200K. Output is up to 32,000 tokens per response. With prompt caching enabled (50% off cached input), large-context use cases like whole-codebase analysis become economically practical. Still smaller than Gemini 3.1 Pro's 1M-token context. For workloads that need more than 400K, the typical pattern is to split work across calls or use Gemini 3.1 Pro for the analysis stage and GPT-5.5 for code generation.
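The split-across-calls pattern can be sketched as a simple chunker that leaves headroom inside the 400K window for instructions and the 32K-token output. Token counts here are approximated as list items; a real pipeline would measure with a tokenizer:

```python
# Sketch of splitting an oversized corpus across GPT-5.5 calls.
# The 40K reserve is an assumed headroom for the system prompt plus
# the 32K-token maximum output; tune it for your own prompts.

CONTEXT_LIMIT = 400_000
RESERVED = 40_000  # instructions + max output headroom (assumption)

def chunk_tokens(tokens, limit=CONTEXT_LIMIT - RESERVED):
    chunks, current = [], []
    for tok in tokens:
        current.append(tok)
        if len(current) >= limit:   # chunk full: start a new one
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

corpus = ["tok"] * 900_000          # pretend 900K-token corpus
print(len(chunk_tokens(corpus)))    # 3 calls, each ≤360K tokens
```

Each chunk then gets its own API call, with the shared instructions benefiting from the cached-input discount across calls.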
When should I use an open-weight model instead of GPT-5.5?
Use an open-weight alternative when: 1) Your data cannot leave your network (regulated industries, defense, IP-sensitive). 2) You need predictable monthly costs — one-time hardware ($3-15K) replaces ongoing $50-400/mo API bills. 3) You're building a product and don't want OpenAI to be your single point of failure. 4) Sub-100ms latency matters more than absolute peak quality. The closest open-weight matches: DeepSeek V4-Pro for general work, Qwen3-Coder-Next for coding-heavy workflows, GLM-5 for reasoning. Most engineering teams that adopt local models keep GPT-5.5 for the hardest 5-10% of cases and use local for everything else — typical cost reduction is 60-85%.

