
OpenAI · Closed-API Model

GPT-5.5 Review: ChatGPT's New Default, Tested

OpenAI shipped GPT-5.5 between April 23 and May 5, 2026 across three tiers — Instant, Standard, and Pro — and made it the default model for all ChatGPT users. This review covers what changed from GPT-5, real benchmark performance (95.2% AIME math, 85.1% SWE-Bench), the new 400K context window, pricing across all three tiers, and which open-weight alternatives come closest if you need to self-host.

📅 Published: May 9, 2026 · 🔄 Last Updated: May 9, 2026 · ✓ Manually Reviewed

Note: GPT-5.5 is API-only — it cannot be downloaded or run locally. For self-hostable models with comparable performance, see DeepSeek V4, GLM-5, and Qwen3-Coder-Next.

Key takeaways

  • 3 tiers: Instant ($1.50/$6), Standard ($5/$30), Pro ($15/$60) per million tokens.
  • 400K context — more than 3× GPT-5's 128K and 2× Claude Sonnet 5's 200K.
  • 95.2% AIME 2025 — top math benchmark, beats Sonnet 5 (91.5%) and Gemini 3.1 (94.0%).
  • 85.1% SWE-Bench Verified — solid coding but well behind Claude Sonnet 5 (92.4%).
  • API-only — for self-hosting, DeepSeek V4-Pro is the closest frontier open-weight match.

Quick verdict

GPT-5.5 is the default ChatGPT model and the best general-purpose closed model for math-heavy and ChatGPT-ecosystem workloads. If you live in ChatGPT, use custom GPTs, or rely on OpenAI's function-calling and tools APIs, this is the model you should use.

Where it loses: coding (Claude Sonnet 5 is 7+ points ahead on SWE-Bench), context (Gemini 3.1 Pro's 1M wins for whole-monorepo analysis), and pricing ($5/$30 is the most expensive of the three frontier closed models). For local-first deployment, DeepSeek V4-Pro gets you most of GPT-5.5's capabilities at zero per-token cost.

Specs at a glance

Vendor: OpenAI
Release date: April 23, 2026 (Standard) · May 5, 2026 (Instant)
Tiers: Instant · Standard · Pro
Context window: 400,000 tokens
Max output: 32,000 tokens
Modalities: Text · Code · Image · Audio (via Realtime API)
License: Proprietary (OpenAI Terms of Service)
Local self-hostable? No
API model IDs: gpt-5.5 · gpt-5.5-instant · gpt-5.5-pro
Knowledge cutoff: March 2026

Three tiers explained

GPT-5.5 ships in three tiers. Picking the right one for each request controls cost and latency.

GPT-5.5 Instant — $1.50 / $6 per Mtok

Sub-200ms latency. Fewer activated parameters, same knowledge cutoff. Use for chat, autocomplete, classification, and simple Q&A. Quality is ~80-85% of Standard.

GPT-5.5 Standard — $5 / $30 per Mtok

The default ChatGPT model. Best balance of speed (~1-3 sec first token) and quality. Use for coding, content, analysis, function calling. Most production work belongs here.

GPT-5.5 Pro (Extended Thinking) — $15 / $60 per Mtok

Extended reasoning that runs from 30 seconds to several minutes per response. Use it for the hardest math, novel algorithms, ambiguous specs, and research-grade analysis. Available via ChatGPT Pro ($200/mo) and the API at higher rate limits.
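Because all three tiers sit behind one API, per-request routing can be as simple as a model-ID lookup. A minimal sketch, using the model IDs from the specs section; the task categories and routing heuristics are illustrative assumptions, not an OpenAI recommendation:

```python
# Illustrative tier router: pick a GPT-5.5 model ID per request.
# The model IDs come from the specs section; the task buckets are
# assumptions for this sketch.

def pick_tier(task: str, latency_sensitive: bool = False) -> str:
    hard_tasks = {"competition-math", "algorithm-design", "research"}
    if task in hard_tasks:
        return "gpt-5.5-pro"       # extended thinking, $15/$60 per Mtok
    if latency_sensitive or task in {"autocomplete", "classification"}:
        return "gpt-5.5-instant"   # sub-200ms, $1.50/$6 per Mtok
    return "gpt-5.5"               # Standard, $5/$30 per Mtok

print(pick_tier("autocomplete"))      # gpt-5.5-instant
print(pick_tier("competition-math"))  # gpt-5.5-pro
print(pick_tier("coding"))            # gpt-5.5
```

Routing by task class like this is what makes the three-tier pricing useful in practice: only the requests that need extended thinking pay Pro rates.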

Benchmarks vs the competition

| Benchmark | GPT-5.5 | GPT-5 | Claude Sonnet 5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| AIME 2025 (math) | 95.2% | 87.4% | 91.5% | 94.0% |
| MMLU-Pro (knowledge) | 90.1% | 86.2% | 87.9% | 89.4% |
| SWE-Bench Verified | 85.1% | 74.9% | 92.4% | 87.9% |
| ARC-AGI-2 (reasoning) | 71.3% | 58.4% | 68.4% | 77.1% |
| GPQA Diamond | 86.0% | 83.1% | 85.7% | 88.2% |
| HumanEval | 94.2% | 92.6% | 95.8% | 93.7% |

Sources: OpenAI GPT-5.5 release notes (April-May 2026), Anthropic Sonnet 5 announcement, Google Gemini 3.1 Pro model card, Artificial Analysis leaderboard.

Pricing & access

API pricing

  • Instant: $1.50 / $6.00 per 1M tokens
  • Standard: $5.00 / $30.00 per 1M tokens
  • Pro: $15.00 / $60.00 per 1M tokens
  • Cached input: 50% off
  • Batch API: 50% off (24h SLA)
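These rates make per-request cost easy to estimate. A sketch that applies the listed per-Mtok prices, treating the cached-input and Batch API discounts as flat 50% multipliers (actual billing may differ in detail):

```python
# Estimate a GPT-5.5 API request's cost from the published per-Mtok rates.
# Rates are from the pricing list above; the 50% cached-input and 50%
# batch discounts are applied as simple multipliers for this sketch.

PRICES = {  # tier: (input $/Mtok, output $/Mtok)
    "instant":  (1.50, 6.00),
    "standard": (5.00, 30.00),
    "pro":      (15.00, 60.00),
}

def request_cost(tier, in_tok, out_tok, cached_in_tok=0, batch=False):
    in_rate, out_rate = PRICES[tier]
    cost = ((in_tok - cached_in_tok) * in_rate    # uncached input
            + cached_in_tok * in_rate * 0.5       # cached input at 50% off
            + out_tok * out_rate) / 1_000_000
    return cost * 0.5 if batch else cost          # Batch API at 50% off

# 50K-token prompt (30K of it cached) with a 2K-token answer on Standard:
print(round(request_cost("standard", 50_000, 2_000, cached_in_tok=30_000), 4))  # 0.235
```

At these numbers a large cached prompt costs about $0.24 per call on Standard, which is why the cached-input discount matters so much for long-context workloads.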

Subscription tiers

  • ChatGPT Free: Limited GPT-5.5 Instant
  • ChatGPT Plus: $20/mo — Standard with daily caps
  • ChatGPT Pro: $200/mo — unlimited Standard + Pro
  • Team: $30/user/mo — collaboration
  • Enterprise: Custom — SSO, audit, custom limits

Heavy GPT-5.5 API users typically pay $50-400/month. For comparison, a self-hosted DeepSeek V4 on a $10-15K multi-GPU rig pays for itself in 2-3 years and gives you unlimited inference + full data privacy.
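The payback claim above is straightforward arithmetic: months until a one-time hardware spend equals the cumulative API bill. The dollar figures are the article's ranges, not measurements:

```python
# Break-even check for the self-hosting comparison: how many months of a
# recurring API bill equal a one-time hardware purchase. Figures are the
# article's illustrative ranges.

def breakeven_months(hardware_cost: float, monthly_bill: float) -> float:
    return hardware_cost / monthly_bill

# A $12K rig vs. a heavy $400/mo API bill:
print(breakeven_months(12_000, 400))  # 30.0 months, i.e. 2.5 years
```

At the low end ($50/mo), the same rig takes 20 years to pay off, so the self-hosting case really only holds for heavy API users.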

When to pick GPT-5.5 vs alternatives

| Workload | Best pick | Why |
|---|---|---|
| Math / scientific computation | GPT-5.5 Pro | 95.2% AIME 2025; extended thinking handles the hardest problems. |
| Production coding | Claude Sonnet 5 | 7+ points ahead on SWE-Bench Verified. |
| Whole-codebase analysis | Gemini 3.1 Pro | 1M context vs GPT-5.5's 400K. |
| ChatGPT ecosystem (custom GPTs, plugins) | GPT-5.5 Standard | Native integration; nothing else comes close. |
| High-volume routine tasks | GPT-5.5 Instant | $1.50/$6 per Mtok at 80-85% of Standard quality. |
| Privacy-required workloads | DeepSeek V4 | Self-hostable, MIT licensed, 1M context. |

Open-weight alternatives

GPT-5.5 is API-only. If you need a frontier-class model that you can self-host — for privacy, cost, or offline operation — these come closest:

| Model | License | Active params | Strength |
|---|---|---|---|
| DeepSeek V4-Pro | MIT | 49B (1.6T total) | Closest general-purpose match |
| GLM-5 | MIT | 44B (745B total) | 77.8% SWE-Bench, smaller hardware footprint |
| Kimi K2.6 | Modified MIT | 32B (1T total) | Ties GPT-5.5 on coding |
| Mistral Medium 3.5 | Modified MIT | 128B dense | Runs on 4 GPUs, near-frontier |

When to pick GPT-5.5

  • You live in ChatGPT, use custom GPTs, or rely on OpenAI plugins/function-calling.
  • Math-heavy workloads (95.2% AIME is the highest production score).
  • You need three tiers (Instant / Standard / Pro) for cost-sensitive routing.
  • You're on Azure and need first-class OpenAI integration.

When to pick something else

  • Coding-heavy work → Claude Sonnet 5 (92.4% vs 85.1% on SWE-Bench).
  • Whole-monorepo analysis → Gemini 3.1 Pro (1M context).
  • Privacy-required → DeepSeek V4 or other self-hosted open weight.
  • Predictable monthly costs → any local-first deployment.

Frequently asked questions

Can I run GPT-5.5 locally?
No. GPT-5.5 is a proprietary OpenAI model accessible only through the OpenAI API, ChatGPT (web/desktop/mobile), Microsoft Azure OpenAI Service, and integrations like Cursor or GitHub Copilot. It cannot be downloaded or self-hosted. For frontier-class models you can run on your own hardware, the closest open-weight alternatives are DeepSeek V4-Pro (MIT licensed, 1M context), GLM-5 (745B/44B active MoE, MIT), and Qwen3.5-Plus (397B/17B active).
How much does GPT-5.5 cost?
GPT-5.5 API pricing varies by tier: GPT-5.5 standard is $5 input / $30 output per 1M tokens; GPT-5.5 Pro (extended thinking) is $15 / $60; GPT-5.5 Instant (faster, lower quality) is $1.50 / $6. ChatGPT Plus is $20/month with usage caps; ChatGPT Pro is $200/month for unlimited GPT-5.5 + Pro reasoning + early features. API users typically pay $50-400/month. For comparison, Claude Sonnet 5 is $3/$15 (cheaper) and Gemini 3.1 Pro is $2/$12 (cheapest of the three frontier closed models).
GPT-5.5 vs GPT-5: what changed?
GPT-5.5 (April-May 2026) brings three improvements over GPT-5 (August 2025). 1) Larger context: 400K tokens vs GPT-5's 128K. 2) Better math/reasoning: AIME 2025 score jumps from 87.4% to 95.2%. 3) Three deployment tiers (Instant/Standard/Pro) so you can trade speed for quality per request. Coding improved modestly (74.9% → 85.1% SWE-Bench Verified). Pricing increased — GPT-5 was $4/$16 per Mtok, GPT-5.5 standard is $5/$30. For most existing GPT-5 users, the upgrade is automatic via ChatGPT; API users need to update their model identifier to gpt-5.5-2026-04-23.
GPT-5.5 vs Claude Sonnet 5 vs Gemini 3.1 Pro: which is best?
Different leaders for different jobs. GPT-5.5 leads on math (95.2% AIME) and ChatGPT ecosystem (custom GPTs, plugins, function-calling maturity). Claude Sonnet 5 leads on coding (92.4% SWE-Bench Verified vs GPT-5.5's 85.1%) and instruction following. Gemini 3.1 Pro leads on context window (1M vs 400K vs 200K) and ARC-AGI-2 reasoning (77.1% vs 71.3% vs 68.4%). Pricing: Gemini 3.1 cheapest, Sonnet 5 mid, GPT-5.5 most expensive. Most production teams pick by workflow: ChatGPT users → GPT-5.5, Cursor/Claude Code users → Sonnet 5, Google Cloud users → Gemini 3.1 Pro.
What is GPT-5.5 Pro and when should I use it?
GPT-5.5 Pro is the extended-thinking variant — it spends more compute thinking through hard problems before responding. Use it for: complex math (AIME-level competition problems), novel algorithm design, multi-step research questions, and ambiguous specifications where standard GPT-5.5 makes mistakes. It costs 3× standard GPT-5.5 ($15/$60 vs $5/$30) and takes 30 seconds to several minutes per response. For routine code and chat, GPT-5.5 standard is faster and 90% as good. Available only through ChatGPT Pro ($200/mo), API at higher rate limits, and via the o-series successor pipeline.
What is GPT-5.5 Instant?
GPT-5.5 Instant is the fastest, cheapest tier — optimized for sub-200ms latency at $1.50/$6 per Mtok. It activates fewer parameters (similar to o4-mini-instant) but inherits GPT-5.5's knowledge cutoff and tool-use behavior. Use it for: chat assistants, autocomplete, classification, simple Q&A, and any latency-sensitive UI. Quality is roughly 80-85% of standard GPT-5.5 on most benchmarks. Available via the API as gpt-5.5-instant and selectable in the ChatGPT model picker for free-tier users.
How big is GPT-5.5's context window?
400,000 tokens — a bit over 3× GPT-5's 128K and 2× Claude Sonnet 5's 200K. Output is up to 32,000 tokens per response. With prompt caching enabled (50% off cached input), large-context use cases like whole-codebase analysis become economically practical. Still smaller than Gemini 3.1 Pro's 1M-token context. For workloads that need more than 400K, the typical pattern is to split work across calls or use Gemini 3.1 Pro for the analysis stage and GPT-5.5 for code generation.
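The split-across-calls pattern can be sketched as a simple chunker that leaves headroom inside the 400K window for instructions and the 32K-token output. Token counts here are approximated as list items; a real pipeline would measure with a tokenizer:

```python
# Sketch of splitting an oversized corpus across GPT-5.5 calls.
# The 40K reserve is an assumed headroom for the system prompt plus
# the 32K-token maximum output; tune it for your own prompts.

CONTEXT_LIMIT = 400_000
RESERVED = 40_000  # instructions + max output headroom (assumption)

def chunk_tokens(tokens, limit=CONTEXT_LIMIT - RESERVED):
    chunks, current = [], []
    for tok in tokens:
        current.append(tok)
        if len(current) >= limit:   # chunk full: start a new one
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

corpus = ["tok"] * 900_000          # pretend 900K-token corpus
print(len(chunk_tokens(corpus)))    # 3 calls, each ≤360K tokens
```

Each chunk then gets its own API call, with the shared instructions benefiting from the cached-input discount across calls.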
When should I use an open-weight model instead of GPT-5.5?
Use an open-weight alternative when: 1) Your data cannot leave your network (regulated industries, defense, IP-sensitive). 2) You need predictable monthly costs — one-time hardware ($3-15K) replaces ongoing $50-400/mo API bills. 3) You're building a product and don't want OpenAI to be your single point of failure. 4) Sub-100ms latency matters more than absolute peak quality. The closest open-weight matches: DeepSeek V4-Pro for general work, Qwen3-Coder-Next for coding-heavy workflows, GLM-5 for reasoning. Most engineering teams that adopt local models keep GPT-5.5 for the hardest 5-10% of cases and use local for everything else — typical cost reduction is 60-85%.

