
Automatic1111 Stable Diffusion WebUI Complete Guide (2026): Install, Extensions, ControlNet

May 1, 2026
26 min read
LocalAimaster Research Team

Automatic1111 is the original and still most-extended Stable Diffusion web UI. While ComfyUI has overtaken it for advanced workflows and Forge is faster on the same hardware, A1111 remains the default starting point for anyone new to local image generation: the largest extension ecosystem, the most familiar tabbed interface, and the deepest back-catalog of community tutorials all live here. For SDXL, SD 1.5, Pony, and Illustrious work, A1111 is still excellent.

This guide covers everything: installation across NVIDIA / AMD / Apple, the txt2img / img2img / Extras / Train / Settings tabs, ControlNet integration, LoRA stacks and embeddings, inpainting and outpainting, upscaling pipelines, must-have extensions, X/Y/Z parameter exploration, the API mode, and tuning for common GPUs.

Table of Contents

  1. What Automatic1111 Is
  2. A1111 vs Forge vs Fooocus vs ComfyUI
  3. Hardware Requirements
  4. Installation: Windows, Linux, Mac
  5. Folder Layout
  6. Your First Generation
  7. Models and Checkpoints
  8. LoRAs and Embeddings
  9. Sampler Reference
  10. ControlNet Setup and Use
  11. Inpainting and Outpainting
  12. Upscaling Pipeline
  13. X/Y/Z Plot for Parameter Exploration
  14. Hires Fix and Refiner
  15. Must-Have Extensions
  16. The API Mode
  17. AMD GPU Setup
  18. Apple Silicon Setup
  19. Tuning Recipes by GPU
  20. Troubleshooting


What Automatic1111 Is {#what-it-is}

A1111 is a Gradio-based web UI for Stable Diffusion. Released in August 2022, days after the SD 1.4 weights, it became the de facto local interface for Stable Diffusion through 2022-2024. The maintainer goes by AUTOMATIC1111 on GitHub. Project: github.com/AUTOMATIC1111/stable-diffusion-webui. License: AGPL-3.0.

Core features:

  • txt2img, img2img, inpainting, outpainting
  • 5,000+ community extensions
  • ControlNet via the Mikubill extension
  • LoRA stacks, textual inversion embeddings, hypernetworks
  • X/Y/Z parameter sweeps
  • Built-in upscaling (ESRGAN, RealESRGAN, SwinIR, etc.)
  • Hires Fix for cheap upscale-then-resample workflows
  • Refiner support (SDXL Base + Refiner)
  • Train tab for textual inversion / hypernetwork training
  • API mode for programmatic use

A1111 vs Forge vs Fooocus vs ComfyUI {#comparison}

| Property | A1111 | Forge | Fooocus | ComfyUI |
|---|---|---|---|---|
| UX | Tabbed UI | Same as A1111 (fork) | Simplified | Node graph |
| Performance | Baseline | 30-60% faster | Fooocus-tuned | Fastest |
| Extension count | 5,000+ | Mostly A1111-compatible | Limited | Different ecosystem |
| Flux Dev | Limited | Native | Limited | Native |
| SD 3.5 | 1.10+ | Native | Native | Native |
| Video | None | None | None | Native |
| Best for | Beginners + ecosystem depth | A1111 users wanting speed | Quick "good image" | Power users |

For Flux / SD 3.5 / video: ComfyUI or Forge. For SDXL + LoRA + ControlNet stacks with maximum extension support: A1111. For "I just want SDXL to work": Fooocus.


Hardware Requirements {#requirements}

| GPU VRAM | Capability |
|---|---|
| 4 GB | SD 1.5 only, with low-VRAM flags |
| 6-8 GB | SDXL with --medvram; SD 1.5 comfortable |
| 12 GB | SDXL with Refiner; SD 1.5 fast |
| 16 GB | Flux Schnell, SD 3.5 Medium |
| 24 GB | Flux Dev (FP8 / GGUF), all SDXL workflows |

System RAM: 16 GB minimum, 32 GB recommended. Disk: 50 GB+ NVMe (models add up).



Installation: Windows, Linux, Mac {#installation}

Windows

  1. Install Python 3.10.6 (must be exactly 3.10.x; A1111 doesn't officially support 3.11+).
  2. Install git.
  3. Clone:
```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
```

  4. Run webui-user.bat. The first run installs PyTorch and dependencies (5-15 minutes).
  5. The browser opens to http://localhost:7860.

Linux

```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh
```

Install Python 3.10 / 3.11 first if not present. The script creates a venv and installs PyTorch + dependencies.

macOS (Apple Silicon)

```bash
brew install cmake protobuf rust python@3.10 git wget
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh
```

Metal (MPS) is auto-detected. Performance on M4 Max: SDXL 1024² in ~18-25 seconds (vs 4 sec on RTX 4090). For better Mac performance, consider Draw Things (MLX-based, separate app).


Folder Layout {#folders}

```
stable-diffusion-webui/
├── models/
│   ├── Stable-diffusion/    # .safetensors checkpoints
│   ├── VAE/                 # VAE files
│   ├── Lora/                # LoRA files
│   ├── ControlNet/          # ControlNet models
│   └── ESRGAN/              # Upscaler models
├── embeddings/              # Textual inversion embeddings
├── extensions/              # Installed extensions
├── outputs/                 # Generated images
└── webui-user.bat / .sh     # Launch script with custom args
```

To share models with ComfyUI without duplicating, edit extra_model_paths.yaml in ComfyUI to point at A1111 paths. See ComfyUI Complete Guide.
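
ComfyUI ships an extra_model_paths.yaml.example for exactly this purpose. A minimal sketch, assuming A1111 lives at /home/me/stable-diffusion-webui (adjust base_path to your install; ComfyUI's bundled example file lists the full set of supported keys):

```yaml
# extra_model_paths.yaml (in the ComfyUI root): reuse A1111's model folders
a111:
    base_path: /home/me/stable-diffusion-webui
    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: models/Lora
    controlnet: models/ControlNet
    upscale_models: models/ESRGAN
    embeddings: embeddings
```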


Your First Generation {#first-generation}

  1. Download a checkpoint, e.g., sd_xl_base_1.0.safetensors. Place in models/Stable-diffusion/.
  2. Launch A1111 → refresh checkpoint dropdown (top left) → select.
  3. txt2img tab → Prompt: "cinematic photo of a samurai in a misty forest, ultra-detailed, 35mm film".
  4. Negative prompt: "blurry, deformed, extra fingers, low quality, bad anatomy".
  5. Width: 1024, Height: 1024 (for SDXL).
  6. Sampling steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7.
  7. Click Generate.

Expected time on RTX 4090: ~3-5 sec.


Models and Checkpoints {#models}

| Model family | Best for | Default sampler settings |
|---|---|---|
| SDXL Base 1.0 | General realistic | DPM++ 2M Karras, 25 steps, CFG 7 |
| SDXL Lightning | Fast generation | LCM, 8 steps, CFG 1.5 |
| SDXL Turbo | Real-time | LCM, 1-4 steps, CFG 1 |
| SD 1.5 | Older but fastest; huge LoRA library | Euler a, 20 steps, CFG 7 |
| SD 3.5 Medium | Permissive license, strong prompt adherence | Euler, 28 steps, CFG 4.5 |
| Pony Diffusion V6 XL | Anime / characters | DPM++ 2M Karras, 30 steps, CFG 7 |
| Illustrious XL | Cleaner anime | Euler a, 28 steps, CFG 5 |
| Flux Schnell (Forge / extension) | Fastest top-tier | Euler, 4 steps, CFG 1 |
| Flux Dev (Forge / extension) | Best quality | Euler, 20 steps, CFG 1 |

For LoRA-heavy work: SD 1.5 has the largest LoRA library; SDXL is catching up; Pony / Illustrious have anime ecosystems.


LoRAs and Embeddings {#loras}

LoRAs

Place .safetensors LoRA files in models/Lora/. Use in the prompt:

```
a samurai, <lora:my_style:0.8>, cinematic
```

The :0.8 is the strength (0.0-1.5 typical). Stack multiple:

```
a samurai, <lora:style_a:0.6>, <lora:character_b:0.7>, <lora:concept_c:0.4>
```

For a UI-managed LoRA picker, install sd-webui-additional-networks or use the built-in Lora tab in the Extra Networks panel under the Generate button (A1111 1.7+).

Embeddings (Textual Inversion)

Place .pt or .safetensors embeddings in embeddings/. Reference an embedding by typing its filename directly in the prompt; negative embeddings go in the negative prompt:

```
Prompt:          masterpiece, beautiful landscape
Negative prompt: negative_easy, bad-hands-5
```

Common negative embeddings: negative_easy, bad-hands-5, unaestheticXL, ng_deepnegative.


Sampler Reference {#samplers}

A1111 ships ~25 samplers. Most-used:

| Sampler | Use case |
|---|---|
| DPM++ 2M Karras | SDXL default, balanced |
| DPM++ 2M SDE Karras | Slightly higher quality, slower |
| Euler a | SD 1.5, fast |
| Euler | SD 3.5, Flux |
| LCM | LCM models, 1-8 steps |
| DDIM | Reproducible, used in research |
| UniPC | Good quality at fewer steps |
| DPM++ 3M SDE Karras | High quality, very slow |

For most users: DPM++ 2M Karras at 25 steps with CFG 7 is the SDXL default. For LCM / Lightning models: LCM at 8 steps with CFG 1.5. For Flux: Euler at 20 steps with CFG 1.


ControlNet Setup and Use {#controlnet}

Install

Extensions tab → Install from URL → https://github.com/Mikubill/sd-webui-controlnet → Install. Restart.

Download models

For SD 1.5: control_v11p_*.safetensors family from lllyasviel's repo. For SDXL: controlnet-canny-sdxl, controlnet-depth-sdxl, controlnet-openpose-sdxl, etc. from xinsir / kohya / Diffusers.

Place in models/ControlNet/.

Use

In txt2img: expand ControlNet section → enable → drop reference image → choose preprocessor (canny / openpose / depth / etc.) → choose model. Set Control Weight (0.0-2.0, default 1.0). Generate.

Multiple ControlNet units can stack — A1111 supports up to 3 simultaneous by default (configurable).
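
ControlNet can also be driven headlessly through the API mode covered later in this guide. A hedged sketch using the Mikubill extension's alwayson_scripts payload, assuming --api is enabled and an SD 1.5 openpose model is installed; exact field names vary slightly across extension versions, so check /docs on your install:

```python
import base64
import requests

# Encode the reference image the ControlNet unit should follow
with open("pose_reference.png", "rb") as f:
    ref_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a samurai in a misty forest",
    "steps": 25,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "input_image": ref_b64,
                "module": "openpose",                    # preprocessor
                "model": "control_v11p_sd15_openpose",   # file in models/ControlNet
                "weight": 1.0,                           # Control Weight
            }]
        }
    },
}
resp = requests.post("http://localhost:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
```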

For deeper ControlNet workflows: see ComfyUI Complete Guide.


Inpainting and Outpainting {#inpainting}

Inpainting

img2img tab → Inpaint sub-tab → upload image → paint mask. Set:

  • Mask blur: 4-8 px for soft edges
  • Mask mode: Inpaint masked
  • Masked content: original (preserve) or fill (replace)
  • Denoising strength: 0.5-0.9 (higher = more change)
  • Inpaint area: Whole picture vs Only masked

For best results use an inpainting-specific checkpoint (sd_xl_base_1.0_inpainting_0.1.safetensors).
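
Inpainting is also exposed through the img2img endpoint when the server runs with --api (see the API section below). A minimal sketch with field names from the /docs schema on a stock install; photo.png and mask.png are placeholder inputs:

```python
import base64
import requests

def b64(path):
    """Read a file and return it base64-encoded, as the API expects."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

resp = requests.post("http://localhost:7860/sdapi/v1/img2img", json={
    "prompt": "ornate katana",
    "init_images": [b64("photo.png")],
    "mask": b64("mask.png"),      # white = repaint, black = keep
    "denoising_strength": 0.7,
    "mask_blur": 6,
    "inpainting_fill": 1,         # 0 = fill, 1 = original, 2 = latent noise
})
resp.raise_for_status()
```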

Outpainting

Use the Outpainting mk2 or Poor man's outpainting scripts from the Script dropdown, or the canvas-zoom extension for canvas-based outpainting.


Upscaling Pipeline {#upscaling}

Two stages typically:

  1. Latent upscale (cheap) via Hires Fix in txt2img.
  2. Image upscaler (sharp) via Extras tab using ESRGAN / RealESRGAN / SwinIR / 4x-UltraSharp.

Hires Fix recipe:

  • Enable Hires Fix in txt2img
  • Upscaler: R-ESRGAN 4x+ or Latent (nearest-exact)
  • Hires steps: 15
  • Denoising strength: 0.5
  • Upscale by: 2.0

For ultra-large images (4K+), iterative upscale: 1024 → 1536 → 2048 → 3072 with low-denoise resampling each step.
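
Stage 2 can be scripted against the Extras endpoint once the server runs with --api (see the API section below). A minimal sketch; upscaler_1 must match a name from the Extras upscaler dropdown, and out_1024.png is a placeholder input:

```python
import base64
import requests

with open("out_1024.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

resp = requests.post("http://localhost:7860/sdapi/v1/extra-single-image", json={
    "image": img_b64,
    "upscaling_resize": 2,          # 2x per side: 1024 -> 2048
    "upscaler_1": "R-ESRGAN 4x+",   # any installed upscaler name
})
# The upscaled result comes back base64-encoded
with open("out_2048.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))
```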


X/Y/Z Plot for Parameter Exploration {#xyz-plot}

Bottom of txt2img tab → Script → "X/Y/Z plot."

Example: find best sampler+steps combo for a checkpoint:

  • X type: Sampler, X values: DPM++ 2M Karras, Euler a, DPM++ 3M SDE Karras
  • Y type: Sampling steps, Y values: 15, 25, 40

Generate. A1111 produces a 3x3 grid image showing every combination, alongside the individual images.

Other useful axes: Prompt S/R (search-replace), CFG scale, LoRA strength, Seed.
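
The same kind of sweep can be scripted against the txt2img endpoint in API mode (covered below), which helps when a sweep outgrows a single grid image. A minimal sketch with a fixed seed so only sampler and step count vary; note that on A1111 1.9+ the Karras variants may be split into a base sampler_name plus a separate scheduler field, so check /docs on your version:

```python
import base64
import itertools
import requests

samplers = ["DPM++ 2M Karras", "Euler a", "DPM++ 3M SDE Karras"]
steps_list = [15, 25, 40]

for sampler, steps in itertools.product(samplers, steps_list):
    resp = requests.post("http://localhost:7860/sdapi/v1/txt2img", json={
        "prompt": "cinematic photo of a samurai in a misty forest",
        "sampler_name": sampler,
        "steps": steps,
        "seed": 42,   # fixed seed isolates the sampler/steps effect
    })
    image_bytes = base64.b64decode(resp.json()["images"][0])
    fname = f"sweep_{sampler.replace(' ', '_')}_{steps}.png"
    with open(fname, "wb") as f:
        f.write(image_bytes)
```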


Hires Fix and Refiner {#hires}

Hires Fix: generate at low resolution then upscale-and-resample at higher resolution. Saves VRAM, often higher quality than direct hi-res generation.

Refiner (SDXL): SDXL ships with a separate Refiner model that polishes the last few steps. Enable in the Refiner section of txt2img. Use base for first 80% steps, refiner for last 20%. Improves detail, slightly slower.


Must-Have Extensions {#extensions}

| Extension | Purpose |
|---|---|
| sd-webui-controlnet | ControlNet support (essential) |
| adetailer | Auto-fix faces / hands |
| sd-dynamic-prompts | Wildcard / template prompts |
| sd-webui-additional-networks | Multi-LoRA UI |
| sd-webui-rembg | Background removal |
| sd-webui-segment-anything | SAM-based masking |
| canvas-zoom | Better inpainting UX |
| a1111-sd-webui-tagcomplete | Booru-style tag autocomplete |
| sd-webui-aspect-ratio-helper | Quick aspect-ratio buttons |
| stable-diffusion-webui-images-browser | Output gallery |
| sd-webui-roop / faceswap | Face swap (use ethically) |
| multidiffusion-upscaler-for-automatic1111 | Tiled diffusion + tiled VAE for very large outputs |

Install via Extensions tab → Available → Load from. Restart after each install.
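
Extensions can also be installed manually by cloning into extensions/, which is handy on headless servers. A sketch using sd-webui-controlnet as the example:

```bash
cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet
# Restart the WebUI so the new extension is loaded
```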


The API Mode {#api}

Launch with --api --listen:

```bash
./webui.sh --api --listen
```

Endpoints (browse /docs):

  • POST /sdapi/v1/txt2img
  • POST /sdapi/v1/img2img
  • POST /sdapi/v1/extra-single-image (upscale)
  • GET /sdapi/v1/sd-models
  • POST /sdapi/v1/options
  • GET /sdapi/v1/progress

Example (Python):

```python
import requests, base64

resp = requests.post("http://localhost:7860/sdapi/v1/txt2img", json={
    "prompt": "a samurai", "steps": 25, "width": 1024, "height": 1024
})
img_b64 = resp.json()["images"][0]

# Images come back base64-encoded; decode to get the PNG bytes
with open("samurai.png", "wb") as f:
    f.write(base64.b64decode(img_b64))
```

Compatible with the SD WebUI API client for ComfyUI, sd-webui-api-python, and most "self-hosted Midjourney" alternative apps.
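
Two more endpoints worth wiring up: switching the active checkpoint through /sdapi/v1/options and polling /sdapi/v1/progress during long jobs. A sketch with field names from the /docs schema on a stock install; verify against your version:

```python
import requests

BASE = "http://localhost:7860"

# List installed checkpoints, then make the first one active
models = requests.get(f"{BASE}/sdapi/v1/sd-models").json()
requests.post(f"{BASE}/sdapi/v1/options", json={
    "sd_model_checkpoint": models[0]["title"],
})

# Poll progress while a generation runs elsewhere (e.g., another thread)
p = requests.get(f"{BASE}/sdapi/v1/progress").json()
print(f"{p['progress'] * 100:.0f}% done, ETA {p['eta_relative']:.1f}s")
```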


AMD GPU Setup {#amd}

```bash
# Use the AMD-friendly fork
git clone https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu
cd stable-diffusion-webui-amdgpu

# Install ROCm 6.x first (see AMD ROCm guide)
# Then run
./webui.sh --listen
```

For older AMD cards, use the DirectML backend on Windows instead (add --use-directml to COMMANDLINE_ARGS in webui-user.bat) or a community Vulkan fork.

See AMD ROCm Setup for Local LLMs for ROCm fundamentals.


Apple Silicon Setup {#mac}

Native upstream A1111 works via MPS. Performance is 3-5x slower than NVIDIA on most tasks. For better Mac performance: Draw Things (MLX-native), Mochi Diffusion (Core ML), or Apple's ml-stable-diffusion Core ML pipeline.


Tuning Recipes by GPU {#tuning}

RTX 3060 12 GB

```bash
COMMANDLINE_ARGS=--medvram --xformers --listen
```

RTX 4090 24 GB

```bash
COMMANDLINE_ARGS=--xformers --listen --api
```

RX 7900 XTX (via amdgpu fork)

```bash
COMMANDLINE_ARGS=--listen --api --opt-sub-quad-attention
```

Mac M4 Max

```bash
COMMANDLINE_ARGS=--no-half-vae --listen
```

Tight VRAM (6-8 GB)

```bash
COMMANDLINE_ARGS=--lowvram --xformers
```

Set these in webui-user.bat on Windows (set COMMANDLINE_ARGS=...) or in webui-user.sh on Linux/macOS (export COMMANDLINE_ARGS="..."). --medvram and --lowvram trade speed for memory; --xformers is the fastest attention implementation on NVIDIA.


Troubleshooting {#troubleshooting}

| Symptom | Cause | Fix |
|---|---|---|
| Black images | NaN in VAE | Add --no-half-vae |
| OOM | Not enough VRAM | Use --medvram or a smaller resolution |
| xformers fails to install | CUDA mismatch | Use --opt-sdp-attention instead |
| Slow first run | Lazy imports / model load | Subsequent runs are faster |
| Extension breaks UI | Conflict | Disable in Extensions tab → restart |
| ControlNet has no effect | Wrong base model | SDXL ControlNet models only work with SDXL checkpoints |
| Mac: incompatible PyTorch | Wrong Python version | Use Python 3.10 only |



Sources: Automatic1111 GitHub | sd-webui-controlnet | r/StableDiffusion | internal benchmarks (RTX 3060, RTX 4090, RX 7900 XTX, M4 Max).
