Automatic1111 Stable Diffusion WebUI Complete Guide (2026): Install, Extensions, ControlNet
Automatic1111 is the original and still most-extended Stable Diffusion web UI. While ComfyUI has overtaken it for advanced workflows and Forge is faster on the same hardware, A1111 remains the default starting point for anyone new to local image generation — the largest extension ecosystem, the most familiar tabbed interface, and the most curated community tutorials live here. For SDXL, SD 1.5, Pony, and Illustrious work, A1111 is still excellent.
This guide covers everything: installation across NVIDIA / AMD / Apple, the txt2img / img2img / Extras / Train / Settings tabs, ControlNet integration, LoRA stacks and embeddings, inpainting and outpainting, upscaling pipelines, must-have extensions, X/Y/Z parameter exploration, the API mode, and tuning for common GPUs.
Table of Contents
- What Automatic1111 Is
- A1111 vs Forge vs Fooocus vs ComfyUI
- Hardware Requirements
- Installation: Windows, Linux, Mac
- Folder Layout
- Your First Generation
- Models and Checkpoints
- LoRAs and Embeddings
- Sampler Reference
- ControlNet Setup and Use
- Inpainting and Outpainting
- Upscaling Pipeline
- X/Y/Z Plot for Parameter Exploration
- Hires Fix and Refiner
- Must-Have Extensions
- The API Mode
- AMD GPU Setup
- Apple Silicon Setup
- Tuning Recipes by GPU
- Troubleshooting
What Automatic1111 Is {#what-it-is}
A1111 is a Gradio-based web UI for Stable Diffusion. Released August 2022 (days after SD 1.4), it became the de facto local interface for SD throughout 2022-2024. Maintainer goes by AUTOMATIC1111 on GitHub. Project: github.com/AUTOMATIC1111/stable-diffusion-webui. License: AGPL-3.0.
Core features:
- txt2img, img2img, inpainting, outpainting
- 5,000+ community extensions
- ControlNet via the Mikubill extension
- LoRA stacks, textual inversion embeddings, hypernetworks
- X/Y/Z parameter sweeps
- Built-in upscaling (ESRGAN, RealESRGAN, SwinIR, etc.)
- Hires Fix for cheap upscale-then-resample workflows
- Refiner support (SDXL Base + Refiner)
- Train tab for textual inversion / hypernetwork training
- API mode for programmatic use
A1111 vs Forge vs Fooocus vs ComfyUI {#comparison}
| Property | A1111 | Forge | Fooocus | ComfyUI |
|---|---|---|---|---|
| UX | Tabbed UI | Same as A1111 (fork) | Simplified | Node graph |
| Performance | Baseline | 30-60% faster | Optimized defaults | Fastest |
| Extension count | 5,000+ | A1111-compatible mostly | Limited | Different ecosystem |
| Flux Dev | Limited | Native | Limited | Native |
| SD 3.5 | 1.10+ | Native | Native | Native |
| Video | None | None | None | Native |
| Best for | Beginners + ecosystem depth | A1111 users wanting speed | Quick "good image" | Power users |
For Flux / SD 3.5 / video: ComfyUI or Forge. For SDXL + LoRA + ControlNet stacks with maximum extension support: A1111. For "I just want SDXL to work": Fooocus.
Hardware Requirements {#requirements}
| GPU VRAM | Capability |
|---|---|
| 4 GB | SD 1.5 only with low-VRAM flags |
| 6-8 GB | SDXL with --medvram, SD 1.5 comfortable |
| 12 GB | SDXL with refiner, SD 1.5 fast |
| 16 GB | Flux Schnell, SD 3.5 Medium |
| 24 GB | Flux Dev (FP8 / GGUF), all SDXL workflows |
System RAM: 16 GB minimum, 32 GB recommended. Disk: 50 GB+ NVMe (models add up).
Installation: Windows, Linux, Mac {#installation}
Windows
- Install Python 3.10.6 (must be exactly 3.10.x; A1111 doesn't officially support 3.11+).
- Install git.
- Clone:
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
- Run webui-user.bat. First run installs PyTorch + dependencies (5-15 minutes).
- Browser opens to http://localhost:7860.
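All launch flags live in webui-user.bat. A minimal sketch of the file (the --xformers flag is an example, not a requirement; the stock file ships with COMMANDLINE_ARGS empty):
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers
call webui.bat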
Linux
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh
Install Python 3.10 / 3.11 first if not present. The script creates a venv and installs PyTorch + dependencies.
macOS (Apple Silicon)
brew install cmake protobuf rust python@3.10 git wget
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh
Metal (MPS) is auto-detected. Performance on M4 Max: SDXL 1024² in ~18-25 seconds (vs 4 sec on RTX 4090). For better Mac performance, consider Draw Things (MLX-based, separate app).
Folder Layout {#folders}
stable-diffusion-webui/
├── models/
│ ├── Stable-diffusion/ # .safetensors checkpoints
│ ├── VAE/ # VAE files
│ ├── Lora/ # LoRA files
│ ├── ControlNet/ # ControlNet models
│ ├── ESRGAN/ # Upscaler models
│ └── embeddings/ # Textual inversion embeddings
├── extensions/ # Installed extensions
├── outputs/ # Generated images
└── webui-user.bat / .sh # Launch script with custom args
To share models with ComfyUI without duplicating, edit extra_model_paths.yaml in ComfyUI to point at A1111 paths. See ComfyUI Complete Guide.
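A sketch of the relevant block, trimmed to the folders above; the key names follow the a111 template in ComfyUI's shipped extra_model_paths.yaml.example, and base_path is a placeholder for your own install:
a111:
    base_path: /path/to/stable-diffusion-webui/
    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: models/Lora
    controlnet: models/ControlNet
    upscale_models: models/ESRGAN
    embeddings: embeddings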
Your First Generation {#first-generation}
- Download a checkpoint, e.g., sd_xl_base_1.0.safetensors. Place in models/Stable-diffusion/.
- Launch A1111 → refresh checkpoint dropdown (top left) → select.
- txt2img tab → Prompt: "cinematic photo of a samurai in a misty forest, ultra-detailed, 35mm film".
- Negative prompt: "blurry, deformed, extra fingers, low quality, bad anatomy".
- Width: 1024, Height: 1024 (for SDXL).
- Sampling steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7.
- Click Generate.
Expected time on RTX 4090: ~3-5 sec.
Models and Checkpoints {#models}
| Model family | Best for | Default sampler |
|---|---|---|
| SDXL Base 1.0 | General realistic | DPM++ 2M Karras, 25 steps, CFG 7 |
| SDXL Lightning | Fast generation | LCM, 8 steps, CFG 1.5 |
| SDXL Turbo | Real-time | LCM, 1-4 steps, CFG 1 |
| SD 1.5 | Older but fastest, huge LoRA library | Euler a, 20 steps, CFG 7 |
| SD 3.5 Medium | Permissive license, strong prompts | Euler, 28 steps, CFG 4.5 |
| Pony Diffusion v6 XL | Anime / character | DPM++ 2M Karras, 30 steps, CFG 7 |
| Illustrious XL | Cleaner anime | Euler a, 28 steps, CFG 5 |
| Flux Schnell (Forge / extension) | Fastest top-tier | Euler, 4 steps, CFG 1 |
| Flux Dev (Forge / extension) | Best quality | Euler, 20 steps, CFG 1 |
For LoRA-heavy work: SD 1.5 has the largest LoRA library; SDXL is catching up; Pony / Illustrious have anime ecosystems.
LoRAs and Embeddings {#loras}
LoRAs
Place .safetensors LoRA files in models/Lora/. Use in prompt:
a samurai, <lora:my_style:0.8>, cinematic
The :0.8 is the strength (0.0-1.5 typical). Stack multiple:
a samurai, <lora:style_a:0.6>, <lora:character_b:0.7>, <lora:concept_c:0.4>
For a UI-managed LoRA picker, install sd-webui-additional-networks or use the built-in Lora tab in the Extra Networks panel (A1111 1.7+).
Embeddings (Textual Inversion)
Place .pt or .safetensors embeddings in embeddings/. Reference by filename directly in the prompt text; with a file named negative_easy installed, typing negative_easy in the negative prompt activates it:
Prompt: masterpiece, beautiful landscape
Negative prompt: negative_easy, blurry
Common negative embeddings: negative_easy, bad-hands-5, unaestheticXL, ng_deepnegative.
Sampler Reference {#samplers}
A1111 ships ~25 samplers. Most-used:
| Sampler | Use Case |
|---|---|
| DPM++ 2M Karras | SDXL default, balanced |
| DPM++ 2M SDE Karras | Slightly higher quality, slower |
| Euler a | SD 1.5 fast |
| Euler | SD 3.5, Flux |
| LCM | LCM models, 1-8 steps |
| DDIM | Reproducible, used in research |
| UniPC | Fewer steps, good quality |
| DPM++ 3M SDE Karras | High quality, very slow |
For most users: DPM++ 2M Karras at 25 steps with CFG 7 is the SDXL default. For LCM / Lightning models: LCM at 8 steps with CFG 1.5. For Flux: Euler at 20 steps with CFG 1.
ControlNet Setup and Use {#controlnet}
Install
Extensions tab → Install from URL → https://github.com/Mikubill/sd-webui-controlnet → Install. Restart.
Download models
For SD 1.5: control_v11p_*.safetensors family from lllyasviel's repo.
For SDXL: controlnet-canny-sdxl, controlnet-depth-sdxl, controlnet-openpose-sdxl, etc. from xinsir / kohya / Diffusers.
Place in models/ControlNet/.
Use
In txt2img: expand ControlNet section → enable → drop reference image → choose preprocessor (canny / openpose / depth / etc.) → choose model. Set Control Weight (0.0-2.0, default 1.0). Generate.
Multiple ControlNet units can stack — A1111 supports up to 3 simultaneous by default (configurable).
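The extension is also scriptable: ControlNet units ride along on the txt2img API via alwayson_scripts. A minimal Python sketch, assuming a local pose.png reference and a model filename that matches one in models/ControlNet/ (unit keys follow the extension's documented API and can vary between versions):
import requests, base64

# Encode the reference image for the ControlNet unit (pose.png is a placeholder)
with open("pose.png", "rb") as f:
    ref_b64 = base64.b64encode(f.read()).decode()

requests.post("http://localhost:7860/sdapi/v1/txt2img", json={
    "prompt": "a samurai in a misty forest",
    "steps": 25, "width": 1024, "height": 1024,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "input_image": ref_b64,
                "module": "openpose",                 # preprocessor
                "model": "controlnet-openpose-sdxl",  # must match a file in models/ControlNet/
                "weight": 1.0,                        # Control Weight, 0.0-2.0
            }]
        }
    }
})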
For deeper ControlNet workflows: see ComfyUI Complete Guide.
Inpainting and Outpainting {#inpainting}
Inpainting
img2img tab → Inpaint sub-tab → upload image → paint mask. Set:
- Mask blur: 4-8 px for soft edges
- Mask mode: Inpaint masked
- Masked content: original (preserve) or fill (replace)
- Denoising strength: 0.5-0.9 (higher = more change)
- Inpaint area: Whole picture vs Only masked
For best results use an inpainting-specific checkpoint (sd_xl_base_1.0_inpainting_0.1.safetensors).
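These settings map one-to-one onto the img2img endpoint when running with --api. A sketch, assuming local base.png plus mask.png (white = repaint); field names come from A1111's /docs schema:
import requests, base64

def b64(path):
    # Read a local file and return it base64-encoded for the API
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

requests.post("http://localhost:7860/sdapi/v1/img2img", json={
    "init_images": [b64("base.png")],
    "mask": b64("mask.png"),
    "prompt": "ornate katana",
    "denoising_strength": 0.7,   # 0.5-0.9, higher = more change
    "mask_blur": 6,              # soft mask edges, px
    "inpainting_fill": 1,        # 0=fill, 1=original, 2=latent noise, 3=latent nothing
    "inpaint_full_res": False,   # False = Whole picture, True = Only masked
})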
Outpainting
Use the Outpainting mk2 or Poor man's outpainting scripts from the Script dropdown, or the canvas-zoom extension for canvas-based outpainting.
Upscaling Pipeline {#upscaling}
Two stages typically:
- Latent upscale (cheap) via Hires Fix in txt2img.
- Image upscaler (sharp) via Extras tab using ESRGAN / RealESRGAN / SwinIR / 4x-UltraSharp.
Hires Fix recipe:
- Enable Hires Fix in txt2img
- Upscaler: R-ESRGAN 4x+ or Latent (nearest-exact)
- Hires steps: 15
- Denoising strength: 0.5
- Upscale by: 2.0
For ultra-large images (4K+), iterative upscale: 1024 → 1536 → 2048 → 3072 with low-denoise resampling each step.
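Stage 2 also runs headless through the Extras endpoint (with --api enabled). A sketch, assuming a local image.png; the upscaler name must match an entry in the Extras tab dropdown:
import requests, base64

with open("image.png", "rb") as f:
    src = base64.b64encode(f.read()).decode()

resp = requests.post("http://localhost:7860/sdapi/v1/extra-single-image", json={
    "image": src,
    "upscaling_resize": 2,          # 2x the input size
    "upscaler_1": "R-ESRGAN 4x+",   # name exactly as shown in the Extras dropdown
})
with open("image_2x.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))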
X/Y/Z Plot for Parameter Exploration {#xyz-plot}
Bottom of txt2img tab → Script → "X/Y/Z plot."
Example: find best sampler+steps combo for a checkpoint:
- X type: Sampler, X values: DPM++ 2M Karras, Euler a, DPM++ 3M SDE Karras
- Y type: Sampling steps, Y values: 15, 25, 40
Generate. A1111 renders every combination and saves a 3x3 composite grid alongside the individual images.
Other useful axes: Prompt S/R (search-replace), CFG scale, LoRA strength, Seed.
Hires Fix and Refiner {#hires}
Hires Fix: generate at low resolution then upscale-and-resample at higher resolution. Saves VRAM, often higher quality than direct hi-res generation.
Refiner (SDXL): SDXL ships with a separate Refiner model that polishes the last few steps. Enable in the Refiner section of txt2img. Use base for first 80% steps, refiner for last 20%. Improves detail, slightly slower.
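In A1111 1.6+ the same base-then-refiner split is exposed as two txt2img API parameters. A sketch, assuming both checkpoints sit in models/Stable-diffusion/:
import requests

requests.post("http://localhost:7860/sdapi/v1/txt2img", json={
    "prompt": "cinematic portrait, 35mm film",
    "steps": 30, "width": 1024, "height": 1024,
    "refiner_checkpoint": "sd_xl_refiner_1.0.safetensors",  # as listed by GET /sdapi/v1/sd-models
    "refiner_switch_at": 0.8,  # base for the first 80% of steps, refiner for the rest
})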
Must-Have Extensions {#extensions}
| Extension | Purpose |
|---|---|
| sd-webui-controlnet | ControlNet — essential |
| adetailer | Auto-fix faces / hands |
| sd-dynamic-prompts | Wildcard / template prompts |
| sd-webui-additional-networks | Multi-LoRA UI |
| sd-webui-rembg | Background removal |
| sd-webui-segment-anything | SAM-based masking |
| canvas-zoom | Better inpaint UX |
| a1111-sd-webui-tagcomplete | Booru-style autocomplete |
| sd-webui-aspect-ratio-helper | Quick aspect ratio buttons |
| stable-diffusion-webui-images-browser | Output gallery |
| sd-webui-roop / faceswap | Face swap (use ethically) |
| multidiffusion-upscaler-for-automatic1111 | Tiled VAE for very large outputs |
Install via Extensions tab → Available → Load from. Restart after each install.
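Extensions are plain git repositories, so the command line works too (shown here with the ControlNet extension from earlier; any extension URL follows the same pattern):
cd stable-diffusion-webui/extensions
git clone https://github.com/Mikubill/sd-webui-controlnet
# restart the WebUI so it picks up the new extension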
The API Mode {#api}
Launch with --api --listen:
./webui.sh --api --listen
Endpoints (browse /docs):
- POST /sdapi/v1/txt2img
- POST /sdapi/v1/img2img
- POST /sdapi/v1/extra-single-image (upscale)
- GET /sdapi/v1/sd-models
- POST /sdapi/v1/options
- GET /sdapi/v1/progress
Example (Python):
import requests, base64

resp = requests.post("http://localhost:7860/sdapi/v1/txt2img", json={
    "prompt": "a samurai", "steps": 25, "width": 1024, "height": 1024
})
img_b64 = resp.json()["images"][0]  # base64-encoded PNG
with open("samurai.png", "wb") as f:
    f.write(base64.b64decode(img_b64))
Compatible with the SD WebUI API client for ComfyUI, sd-webui-api-python, and most "self-hosted Midjourney" alternative apps.
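For long jobs, a second process can poll the progress endpoint while generation runs elsewhere. A sketch; treat the exact response shape as version-dependent and check /docs on your install:
import requests, time

for _ in range(120):
    p = requests.get("http://localhost:7860/sdapi/v1/progress").json()
    print(f"{p['progress'] * 100:.0f}% done")
    if p["state"].get("job_count", 0) == 0:  # no active job left
        break
    time.sleep(1)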
AMD GPU Setup {#amd}
# Use the AMD-friendly fork
git clone https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu
cd stable-diffusion-webui-amdgpu
# Install ROCm 6.x first (see AMD ROCm guide)
# Then run
./webui.sh --listen
For older AMD cards, use the DirectML backend instead: on Windows, add --use-directml to the launch args of the amdgpu fork, or try the community Vulkan forks.
See AMD ROCm Setup for Local LLMs for ROCm fundamentals.
Apple Silicon Setup {#mac}
Native upstream A1111 works via MPS. Performance is 3-5x slower than NVIDIA on most tasks. For better Mac performance: Draw Things (MLX-native), Mochi Diffusion (Core ML), or the SwiftCoreML fork.
Tuning Recipes by GPU {#tuning}
RTX 3060 12 GB
# webui-user.bat (prefix with set) / webui-user.sh (prefix with export)
COMMANDLINE_ARGS=--medvram --xformers --listen
RTX 4090 24 GB
COMMANDLINE_ARGS=--xformers --listen --api
RX 7900 XTX (via amdgpu fork)
COMMANDLINE_ARGS=--listen --api --opt-sub-quad-attention
Mac M4 Max
COMMANDLINE_ARGS=--no-half-vae --listen
Tight VRAM (6-8 GB)
COMMANDLINE_ARGS=--lowvram --xformers
--medvram and --lowvram trade speed for memory; --xformers is fastest attention on NVIDIA.
Troubleshooting {#troubleshooting}
| Symptom | Cause | Fix |
|---|---|---|
| Black images | NaN in VAE | Add --no-half-vae |
| OOM | Not enough VRAM | Use --medvram or smaller resolution |
| xformers fails to install | CUDA mismatch | Use --opt-sdp-attention instead |
| Slow on first run | Lazy import / model load | Subsequent runs faster |
| Extension breaks UI | Conflict | Disable in Extensions tab → restart |
| ControlNet has no effect | Model/checkpoint mismatch | Match families: SD 1.5 ControlNets with SD 1.5, SDXL ControlNets with SDXL |
| Mac: incompatible PyTorch | Wrong Python | Use Python 3.10 only |
Sources: Automatic1111 GitHub | sd-webui-controlnet | r/StableDiffusion | Internal benchmarks RTX 3060, 4090, RX 7900 XTX, M4 Max.