Local AI Image Upscaling (2026): ESRGAN, GFPGAN & 4x

June 20, 2026
11 min read
Local AI Master Research Team

The best local AI image upscaler in 2026 is Real-ESRGAN (the open-source xinntao project) for general photos and a 4x ESRGAN model like 4x-UltraSharp for AI art — both run free and fully offline, need only ~2-4 GB of VRAM, and do 4x or 2x upscaling that rivals paid cloud tools. Pair them with GFPGAN v1.4 or CodeFormer to fix faces, and SwinIR when you need denoising and JPEG-artifact removal alongside the upscale. The honest trade-off versus Topaz Gigapixel: local tools are free and private but ask you to pick the right model per image; Topaz is one-click but, as of its October 2025 switch, subscription-only (Gigapixel is about $29/month or $149/year).

If you already generate images with Stable Diffusion or Flux, upscaling is the missing finishing step — it turns a 512x512 or 1024x1024 generation into a clean 2K-4K print without re-rolling the prompt. This guide covers the models that matter, the ComfyUI and Forge nodes that drive them, how little VRAM you actually need, and when a cloud upscaler is still worth paying for.

What is the best local AI image upscaler in 2026?

There is no single winner — the right model depends on the image. Here is the short version, then the detail below:

  • Photos and mixed content: Real-ESRGAN (RealESRGAN_x4plus) — the most widely used, most robust general 4x upscaler.
  • AI art / illustrations / sharp detail: a 4x ESRGAN-architecture model such as 4x-UltraSharp.
  • Anime and line art: RealESRGAN_x4plus_anime_6B (a smaller, anime-tuned 6-block model).
  • Restoration (denoise + de-JPEG + upscale): SwinIR, a Swin-Transformer restoration model.
  • Faces: GFPGAN v1.4 or CodeFormer as a second pass, never as the primary upscaler.

The reason you keep several around is that upscalers are specialists. A model trained to sharpen photographic texture will invent crunchy "hair" detail on a smooth illustration; an anime model will smear photographic skin. Keeping three or four model files on disk (each is roughly 60-350 MB) lets you match the tool to the image, which is the single biggest quality lever in local upscaling.
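As a rough illustration of "match the tool to the image", the short list above can be written as a lookup. This is only a sketch: the content-type keys and the `pick_models` helper are hypothetical names, and the default face restorer is an arbitrary choice between the two mentioned above.

```python
# Hypothetical lookup from content type to the model names listed above.
MODEL_FOR_CONTENT = {
    "photo": "RealESRGAN_x4plus",
    "ai_art": "4x-UltraSharp",
    "anime": "RealESRGAN_x4plus_anime_6B",
    "restoration": "SwinIR",
}

# Face restorers are a second pass, never the primary upscaler.
FACE_RESTORERS = ("GFPGAN v1.4", "CodeFormer")


def pick_models(content_type: str, has_faces: bool = False) -> list[str]:
    """Return the passes to run in order: main upscaler first, face pass second."""
    if content_type not in MODEL_FOR_CONTENT:
        raise ValueError(f"unknown content type: {content_type!r}")
    passes = [MODEL_FOR_CONTENT[content_type]]
    if has_faces:
        passes.append(FACE_RESTORERS[0])
    return passes
```

For example, `pick_models("photo", has_faces=True)` returns the general photo model followed by a face pass.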

Local upscaling models compared (VRAM, scale, best use)

The table below lists the models worth keeping, with their architectures, typical scale factors, approximate file sizes, and what each is for. ESRGAN-family models (Real-ESRGAN, 4x-UltraSharp, Remacri, etc.) are interchangeable in the same "load upscale model" node: they share the same basic network design and differ mainly in training data and, for the smaller variants, in network size.

| Model | Architecture | Native scale | File size (approx.) | Best for |
| --- | --- | --- | --- | --- |
| Real-ESRGAN x4plus | ESRGAN (RRDBNet) | 4x (2x variant exists) | ~64 MB | General photos, mixed content |
| Real-ESRGAN x4plus_anime_6B | ESRGAN (6-block) | 4x | ~18 MB | Anime, line art, flat color |
| realesr-general-x4v3 | SRVGGNetCompact (compact Real-ESRGAN variant) | 4x | ~5 MB | Low-VRAM / fast general use |
| 4x-UltraSharp | ESRGAN | 4x | ~67 MB | AI art, crisp edges, illustration |
| SwinIR | Swin Transformer | 2x / 4x | ~60-140 MB (variant-dependent) | Restoration: denoise + de-JPEG + SR |
| GFPGAN v1.4 | StyleGAN2 face prior | face-only pass | ~349 MB | Repairing faces after upscale |
| CodeFormer | Codebook Transformer | face-only pass | ~360 MB | Faces with adjustable fidelity |

A few honest notes. Real-ESRGAN, 4x-UltraSharp and the other ESRGAN models all upscale at a fixed integer factor (usually 4x); if you want 2x, you upscale 4x and then downscale, or use the dedicated RealESRGAN_x2plus weights. GFPGAN and CodeFormer are not general upscalers at all — they only reconstruct faces, so you run them after a normal upscale (or with Real-ESRGAN's built-in --face_enhance flag, which calls GFPGAN under the hood).
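To make those flags concrete, here is a minimal sketch that builds a command line for Real-ESRGAN's `inference_realesrgan.py` script. It assumes you run it from inside a checkout of the Real-ESRGAN repository with the weights already downloaded; the `realesrgan_command` helper and the folder names are hypothetical. The `--outscale` option covers the "4x then downscale" route: the 4x model runs as usual and the result is resized to the factor you ask for.

```python
import sys


def realesrgan_command(input_path, output_dir, model="RealESRGAN_x4plus",
                       outscale=4, tile=0, face_enhance=False):
    """Build the argv list for Real-ESRGAN's inference_realesrgan.py script."""
    cmd = [
        sys.executable, "inference_realesrgan.py",
        "-n", model,                  # model name, e.g. RealESRGAN_x4plus
        "-i", str(input_path),        # a single image or a folder of images
        "-o", str(output_dir),        # output folder
        "--outscale", str(outscale),  # final scale; 2 means 4x then resize down
    ]
    if tile > 0:
        cmd += ["--tile", str(tile)]  # tile size in pixels; omitted means no tiling
    if face_enhance:
        cmd.append("--face_enhance")  # run GFPGAN on detected faces
    return cmd
```

Pass the returned list to `subprocess.run(cmd, check=True)` to execute it. Building the list first makes it easy to print and inspect the command before a long batch.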

How much VRAM does local upscaling need?

This is the good news: upscaling is far lighter than image generation. An ESRGAN-family model like Real-ESRGAN runs comfortably in roughly 2-4 GB of VRAM, and the tiny realesr-general-x4v3 runs in well under 2 GB. That means upscaling works on hardware that struggles to run Stable Diffusion XL, and it even runs acceptably on CPU (just slower).

The thing that blows up memory is not the model — it is the output resolution. Upscaling a 1024x1024 image by 4x produces a 4096x4096 result, and holding that whole tensor in VRAM is what causes out-of-memory errors. The fix is tiling: the image is split into small tiles (e.g. 512x512), each is upscaled, and the tiles are stitched back. Real-ESRGAN exposes this with a --tile option, and in ComfyUI the UltimateSDUpscale node automatically tiles and encodes in blocks when VRAM runs short (it warns that this is slower, which is the expected trade-off).
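A back-of-the-envelope calculation shows why tiling helps. The sketch below assumes a 64-channel feature map held in 32-bit floats at output resolution, which is a simplification rather than a measurement of any particular model. It also skips the padding and overlap that real tilers add to hide seams. The function names are made up for illustration.

```python
def feature_map_bytes(width, height, scale=4, channels=64, bytes_per_value=4):
    """Bytes for one feature map at output resolution (assumed 64 channels, fp32)."""
    return (width * scale) * (height * scale) * channels * bytes_per_value


def tile_boxes(width, height, tile=512):
    """Yield (left, top, right, bottom) boxes covering the image; edge tiles are clipped."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))


whole = feature_map_bytes(1024, 1024)    # one map for the full 4096x4096 output
per_tile = feature_map_bytes(512, 512)   # one map for a single 512x512 tile
boxes = list(tile_boxes(1024, 1024))

print(f"whole image: {whole / 2**30:.1f} GiB per feature map")
print(f"one tile:    {per_tile / 2**30:.1f} GiB per feature map")
print(f"tiles:       {len(boxes)}")
```

Under these assumptions a single full-resolution feature map costs 4 GiB, while a 512x512 tile costs 1 GiB. Peak memory follows the tile size, not the final output size, which is why tiling keeps very large outputs within a small VRAM budget.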

| Task | Practical VRAM | Notes |
| --- | --- | --- |
| ESRGAN 4x, image fits in memory | ~2-4 GB | Real-ESRGAN, 4x-UltraSharp, etc. |
| ESRGAN 4x to very large output | ~2-4 GB with tiling | Tile size 512 keeps memory flat |
| SwinIR restoration | ~4-6 GB | Transformer, heavier than ESRGAN |
| Face restore (GFPGAN/CodeFormer) | ~2-3 GB | Runs on a cropped face region |
| No GPU at all | CPU works | Minutes per image instead of seconds |

On my own machine (an RTX 3090, 24 GB) a single 4x Real-ESRGAN pass on a 1024x1024 image to 4096x4096 finishes in roughly 1-2 seconds, and batch-upscaling a folder of 100 images runs unattended in a couple of minutes. Treat those as approximate, single-machine numbers — the point is that upscaling is fast and cheap on local hardware, not that your exact times will match.

How does ComfyUI handle upscaling? (latent vs model upscale)

ComfyUI gives you two fundamentally different ways to make an image bigger, and mixing them up is the most common beginner mistake. For the full node-graph basics, start with our ComfyUI complete guide; the upscaling-specific distinction is this:

  1. Model upscale (pixel space) — the Load Upscale Model + Upscale Image (Using Model) nodes. You feed in a finished, decoded image and an ESRGAN model (Real-ESRGAN, 4x-UltraSharp). It intelligently reconstructs detail at 4x. This is the everyday upscaler and the one you want for an already-rendered image.
  2. Latent upscale — the Upscale Latent / Upscale Latent By nodes operate on the latent tensor before it is decoded to pixels, mid-generation. It is faster and keeps generation coherence, but it is not a detail-adding super-resolution model on its own — you typically follow it with a second sampler pass at low denoise so the model "repaints" the new resolution. Run a latent upscale on an already-decoded image and you have just enlarged pixels, not added detail.
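The model-upscale route is small enough to write out as a ComfyUI API-format workflow. Treat this as a sketch: the class names (`LoadImage`, `UpscaleModelLoader`, `ImageUpscaleWithModel`, `SaveImage`) and input names are how I understand the stock nodes to be registered, so check them against your install. The file names and the `model_upscale_workflow` helper are placeholders.

```python
import json


def model_upscale_workflow(image_name, model_name="RealESRGAN_x4plus.pth"):
    """Pixel-space upscale graph: load image -> load model -> upscale -> save."""
    return {
        "1": {"class_type": "LoadImage",
              "inputs": {"image": image_name}},
        "2": {"class_type": "UpscaleModelLoader",
              "inputs": {"model_name": model_name}},
        "3": {"class_type": "ImageUpscaleWithModel",
              # Links are [source_node_id, output_index].
              "inputs": {"upscale_model": ["2", 0], "image": ["1", 0]}},
        "4": {"class_type": "SaveImage",
              "inputs": {"images": ["3", 0], "filename_prefix": "upscaled"}},
    }


# Body you would POST to a running ComfyUI server's /prompt endpoint.
payload = json.dumps({"prompt": model_upscale_workflow("render.png")})
```

There is no sampler and no latent anywhere in this graph, and that is the whole distinction: a model upscale works on decoded pixels only, while a latent upscale would sit between two sampler nodes.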

For the best of both, the UltimateSDUpscale node (the ComfyUI port of Coyote-A's Ultimate SD Upscale) combines them: it upscales with an ESRGAN model, splits the result into tiles, and runs a low-denoise img2img pass on each tile to add genuine new detail. Because it tiles, it finishes large outputs on limited VRAM — the official guidance is that if VRAM is short it auto-tiles and encodes in blocks, just more slowly.

How do you upscale in Forge or Automatic1111?

If you prefer a classic WebUI, both Automatic1111 and the faster Forge fork expose two routes. For setup, see our Automatic1111 guide and the leaner SD Forge guide.

  • Hires. fix (txt2img): built into the generation tab. It renders at your base size, upscales (you choose an upscaler like R-ESRGAN 4x+ or a 4x ESRGAN model), then runs a second diffusion pass at a denoise you set (~0.3-0.5) to sharpen. It is the simplest "make my generation bigger and better" toggle.
  • Extras tab: a pure, non-diffusion upscale. Drop any image in, pick an upscaler (Real-ESRGAN, SwinIR), choose a scale, and it runs the model directly with optional GFPGAN/CodeFormer face restoration. Good for upscaling photos or images you did not generate.
  • Ultimate SD Upscale (extension): Coyote-A's tile-and-inpaint script for big, detailed results on any GPU. It breaks the image into 512x512 tiles and runs img2img at low denoise (0.3-0.5) per tile, producing fewer seams than the old built-in "SD upscale." One caveat to know going in: some users report it does not activate correctly under Forge and falls back to plain img2img, while it works reliably in A1111, so test it on your build before trusting it for a batch.
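If you drive the WebUI through its HTTP API (started with the `--api` flag), the first two routes map onto two request bodies. The sketch below only builds the payloads. The field names are the ones I know from the A1111 API (`/sdapi/v1/txt2img` and `/sdapi/v1/extra-single-image`), but confirm them against the `/docs` page of your own build. The helper names and default values are assumptions.

```python
def hires_fix_payload(prompt, width=1024, height=1024, hr_scale=2.0,
                      upscaler="R-ESRGAN 4x+", denoise=0.4):
    """Body for /sdapi/v1/txt2img with Hires. fix enabled."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "enable_hr": True,              # turn on the second pass
        "hr_scale": hr_scale,           # 1024 -> 2048 at 2.0
        "hr_upscaler": upscaler,        # name as shown in the upscaler dropdown
        "denoising_strength": denoise,  # ~0.3-0.5 per the guidance above
    }


def extras_payload(image_b64, resize=4, upscaler="R-ESRGAN 4x+",
                   codeformer_visibility=0.0):
    """Body for /sdapi/v1/extra-single-image: pure model upscale, no diffusion."""
    return {
        "image": image_b64,             # base64-encoded source image
        "upscaling_resize": resize,
        "upscaler_1": upscaler,
        "codeformer_visibility": codeformer_visibility,  # 0 disables the face pass
    }
```

Keeping the two payloads separate mirrors the distinction in the list: Hires. fix is a generation-time setting, while Extras takes any finished image.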

Whether you generate with Stable Diffusion or with Flux locally, the upscale step is identical: produce the base image, then run it through one of these passes.

How do you fix faces after upscaling?

General upscalers reconstruct texture, not identity, so small or low-quality faces often come out smudged. That is what GFPGAN and CodeFormer are for — they are face-specific restorers you run after the main upscale.

  • GFPGAN v1.4 uses a StyleGAN2 facial prior to rebuild realistic faces; v1.4 produces slightly more detail and better identity than v1.3. It is the simplest one-shot option and is what Real-ESRGAN's --face_enhance flag invokes automatically.
  • CodeFormer adds a controllable fidelity weight (w from 0 to 1): a lower w yields higher visual quality (the model invents more), a higher w yields higher fidelity to the original face (better identity, less invention). That knob makes CodeFormer the better choice when preserving someone's actual likeness matters — old-photo restoration, for instance.

A practical rule: start with CodeFormer at w ≈ 0.5-0.7 for real people, and reach for GFPGAN when you just want a quick, good-looking face on AI-generated portraits. Both run on a cropped face region, so they need only ~2-3 GB of VRAM and add a second or two per image.
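For CodeFormer run from its own repository, the fidelity weight is the `-w` option of `inference_codeformer.py`. Here is a minimal sketch that builds that command and rejects out-of-range weights. It assumes you are inside a CodeFormer checkout with weights downloaded; the `codeformer_command` helper is a hypothetical name.

```python
import sys


def codeformer_command(input_path, fidelity=0.7):
    """Build the argv list for CodeFormer's inference script on whole images."""
    if not 0.0 <= fidelity <= 1.0:
        raise ValueError("fidelity weight w must be between 0 and 1")
    return [
        sys.executable, "inference_codeformer.py",
        "-w", str(fidelity),             # lower = more invention, higher = more identity
        "--input_path", str(input_path), # an image file or a folder
    ]
```

For a scan of a real person you might call `codeformer_command("old_photos", fidelity=0.7)` and lower the weight only if the result still looks too soft.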

Photo restoration: a real local use case

Beyond making AI art bigger, the same toolchain restores damaged real photos — old scans, blurry phone shots, JPEG-mangled images. A reliable local pipeline is: SwinIR (or Real-ESRGAN) to denoise and upscale → CodeFormer to repair faces. SwinIR is purpose-built here: the official paper reports state-of-the-art results on real-world super-resolution, denoising, and JPEG-artifact reduction, beating prior methods by up to 0.14-0.45 dB while using up to 67% fewer parameters. Because everything runs locally, you can restore family photos without uploading them to a stranger's server — a privacy win cloud tools cannot match.
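To put the decibel figures in perspective, assume they are PSNR gains (the usual metric for these benchmarks). PSNR is logarithmic in mean squared error, so a fixed dB gain corresponds to a fixed percentage drop in error. The sketch below does that arithmetic; the function names are illustrative.

```python
import math


def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR improves by gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


for gain in (0.14, 0.45):
    print(f"+{gain} dB -> MSE x{mse_ratio_for_gain(gain):.3f}")
```

Under that assumption a 0.14-0.45 dB improvement means roughly 3% to 10% lower mean squared error. That is a real but incremental gain, and the smaller parameter count is arguably the bigger practical benefit.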

4x vs 2x: which scale should you pick?

Always upscale by the smallest factor that hits your target resolution. A 4x pass invents the most new detail, but on a clean source it can also over-sharpen and add texture that was not there. Guidance:

  • 2x when the source is already fairly large or high-quality (e.g. a 1024px AI render going to 2K). Less invention, more faithful.
  • 4x when the source is small or you need a big print (e.g. a 512px image to 2048px). More reconstruction, watch for artifacts on flat areas.
  • Chained / iterative (2x then 2x, or model-upscale then a low-denoise diffusion tile pass) when you want maximum size with controlled detail — this is exactly what Ultimate SD Upscale automates.
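The "smallest factor that hits the target" rule can be sketched as a small planner. It assumes only 2x and 4x passes are available and measures size along one edge. `plan_passes` is a hypothetical helper, not part of any tool mentioned here.

```python
def plan_passes(source_px, target_px, supported=(2, 4)):
    """Return the upscale factors to apply, smallest sufficient factor first."""
    passes = []
    size = source_px
    while size < target_px:
        remaining = target_px / size
        # Smallest supported factor that reaches the target, else the largest one.
        factor = next((f for f in sorted(supported) if f >= remaining),
                      max(supported))
        passes.append(factor)
        size *= factor
    return passes


print(plan_passes(1024, 2048))  # one 2x pass
print(plan_passes(512, 2048))   # one 4x pass
print(plan_passes(512, 4096))   # chained: 4x, then 2x
```

The planner only decides the factors. Whether a chained plan uses two model passes or a model pass followed by a low-denoise tile pass is still a per-image judgment.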

Local upscalers vs Topaz Gigapixel (cost and privacy)

The obvious commercial comparison is Topaz Gigapixel AI. It is genuinely excellent and one-click, but two facts changed the math in late 2025: Topaz retired perpetual licenses in October 2025 and moved to subscription-only pricing — a standalone Gigapixel subscription runs about $29/month or $149/year (a Pro tier is $499/year, and Gigapixel is also bundled in the broader Topaz Studio subscription). The old one-time ~$99 Gigapixel license is gone for new buyers.

| Factor | Local (Real-ESRGAN, SwinIR, etc.) | Topaz Gigapixel AI |
| --- | --- | --- |
| Cost | Free, open-source | ~$29/mo or $149/year (subscription since Oct 2025) |
| Privacy | 100% offline, images never leave your PC | Local app, but paid + account-bound |
| Ease | Pick the right model per image | One-click, auto model selection |
| VRAM | ~2-4 GB, runs on modest GPUs/CPU | Optimized desktop app |
| Batch | Free, unlimited, scriptable | Included |
| Updates | Community-driven | Subscription only |

The honest verdict: if you upscale occasionally and value zero cost plus full privacy, the local stack wins outright and runs on hardware you already own. If you upscale professionally at volume and want a polished one-click result without choosing models, Topaz's subscription can be worth it. Many people do both — local for everyday AI-art finishing, Topaz for a handful of client-grade restorations.

Key Takeaways

  1. Real-ESRGAN is the best general local upscaler in 2026, with 4x-UltraSharp for AI art and the anime 6B model for line art. They are free, offline, and need only ~2-4 GB of VRAM.
  2. Upscaling is much lighter than generation. The model is small; the output resolution is what eats memory — use tiling (Real-ESRGAN --tile or UltimateSDUpscale) to keep large outputs within VRAM.
  3. In ComfyUI, "model upscale" (pixel) and "latent upscale" are different tools. Use model upscale for finished images; latent upscale belongs mid-generation, followed by a sampler pass.
  4. GFPGAN v1.4 and CodeFormer fix faces after the main upscale. CodeFormer's fidelity weight (lower = more invention, higher = more identity) makes it the pick for restoring real people.
  5. Local beats Topaz on cost and privacy. Topaz dropped perpetual licenses in Oct 2025 and Gigapixel is now subscription-only (~$29/mo or $149/year); the local stack is free and never uploads your images.

Local AI Master Research Team

Creator of Local AI Master. I've built datasets with over 77,000 examples and trained AI models from scratch. Now I help people achieve AI independence through local AI mastery.

Published: June 20, 2026 · Last Updated: June 20, 2026 · Manually Reviewed
