ComfyUI FLUX Workflow (2026): JSON Nodes Explained
A working ComfyUI FLUX.1 [dev] text-to-image workflow needs three loader nodes and a sampler chain: a Load Diffusion Model (UNETLoader) node pointing at flux1-dev.safetensors (set weight_dtype to fp8_e4m3fn if you have ~16 GB of VRAM or less), a DualCLIPLoader loading clip_l.safetensors + t5xxl_fp16.safetensors with type set to "flux", a VAELoader loading ae.safetensors, and then CLIP Text Encode → FluxGuidance (3.5) → KSampler with CFG fixed at 1.0. Drop the diffusion model in models/diffusion_models/ (older builds use models/unet/), both text encoders in models/text_encoders/ (older builds: models/clip/), and the VAE in models/vae/. The copy-paste workflow JSON is further down.
This is the FLUX-specific deep dive. If you have never opened ComfyUI before, start with our broader ComfyUI complete guide for the install and interface basics, then come back here for the FLUX node graph and its JSON internals.
What is the ComfyUI FLUX workflow, in one picture?
FLUX is not a single checkpoint like a Stable Diffusion 1.5 .safetensors file. It ships as three separate pieces — the diffusion transformer, the text encoders, and the VAE — and ComfyUI loads each one with its own node. That is the single biggest source of "it won't run" confusion for people coming from SD. There is no one Load Checkpoint node for FLUX; you wire three loaders.
Here is the full graph, in the order data flows through it:
| Stage | Node | What it loads / does | File |
|---|---|---|---|
| 1 | Load Diffusion Model (UNETLoader) | The FLUX transformer (the "model") | flux1-dev.safetensors |
| 2 | DualCLIPLoader | Both text encoders together | clip_l.safetensors + t5xxl_fp16.safetensors |
| 3 | VAELoader | The decoder that turns latents into pixels | ae.safetensors |
| 4 | CLIP Text Encode (Prompt) | Encodes your prompt with the dual CLIP | — |
| 5 | FluxGuidance | Sets FLUX's distilled guidance (≈3.5) | — |
| 6 | EmptyLatentImage | Blank canvas at your resolution | — |
| 7 | KSampler | Denoises — CFG must be 1.0 for FLUX | — |
| 8 | VAEDecode → SaveImage | Latent → image, then writes the PNG | — |
The mental model: model + two text encoders + VAE feed into a KSampler, the KSampler outputs a latent, and the VAE decodes that latent to a PNG. Everything below is just the details of each box.
Where do the FLUX files go in ComfyUI?
This is the step that breaks most first runs, so be exact. FLUX.1 [dev] is a 12B-parameter rectified-flow transformer from Black Forest Labs (released August 2024), and it expects these files in these folders:
| File | What it is | Folder (current ComfyUI) | Approx size |
|---|---|---|---|
| flux1-dev.safetensors | The 12B diffusion transformer | models/diffusion_models/ | ~23 GB (bf16) |
| clip_l.safetensors | CLIP-L text encoder | models/text_encoders/ | ~246 MB |
| t5xxl_fp16.safetensors | T5-XXL text encoder (fp16) | models/text_encoders/ | ~9.8 GB |
| t5xxl_fp8_e4m3fn.safetensors | T5-XXL text encoder (fp8, lighter) | models/text_encoders/ | ~4.9 GB |
| ae.safetensors | FLUX VAE (autoencoder) | models/vae/ | ~335 MB |
Two real-world gotchas worth saying out loud:
- Folder names changed. Newer ComfyUI uses models/diffusion_models/ and models/text_encoders/. Older installs (and a lot of tutorials) use models/unet/ and models/clip/. Current ComfyUI still reads the legacy folders, but older builds do not know the new names, so if your DualCLIPLoader dropdown is empty, your files are probably in the folder the other convention expects. Put them where your node's dropdown looks.
- fp16 vs fp8 T5. The t5xxl text encoder is the heavy one. The fp16 version is ~9.8 GB; the fp8 version (t5xxl_fp8_e4m3fn) is ~4.9 GB and is what you want on an 8-12 GB card. Quality difference on prompts is small; VRAM difference is large.
You can download every one of these from the official ComfyUI model set; the file names above are the canonical ones used in the official ComfyUI FLUX.1 tutorial.
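As a quick sanity check before opening ComfyUI, the folder table above can be turned into a small script. This is a minimal sketch, not part of ComfyUI: the function name find_flux_files and the EXPECTED mapping are illustrative, and it assumes you pass the path of your ComfyUI install directory.

```python
from pathlib import Path

# Expected files mapped to candidate folders (current name first, legacy second).
# File and folder names come from the table above; everything else is illustrative.
EXPECTED = {
    "flux1-dev.safetensors": ["models/diffusion_models", "models/unet"],
    "clip_l.safetensors": ["models/text_encoders", "models/clip"],
    "t5xxl_fp16.safetensors": ["models/text_encoders", "models/clip"],
    "ae.safetensors": ["models/vae"],
}


def find_flux_files(comfy_root):
    """Return {filename: folder where it was found, or None if missing}."""
    root = Path(comfy_root)
    found = {}
    for name, folders in EXPECTED.items():
        # Take the first candidate folder that actually contains the file.
        found[name] = next(
            (folder for folder in folders if (root / folder / name).is_file()),
            None,
        )
    return found
```

Calling find_flux_files("ComfyUI") and printing the result shows at a glance which files are missing and which ones sit in a legacy folder. If you use the fp8 T5 instead, swap the filename in EXPECTED.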
Node-by-node: the workflow JSON structure (UNETLoader, CLIP text encoder, sampler)
Here is each node explained as it appears in the workflow JSON, answering the literal question of how a FLUX graph is wired.
1. UNETLoader / Load Diffusion Model
In the ComfyUI menu this node is labelled Load Diffusion Model; in the JSON its class type is UNETLoader. It has two inputs you set by hand:
- unet_name: the filename, e.g. flux1-dev.safetensors
- weight_dtype: leave at default on a 24 GB card; set fp8_e4m3fn to roughly halve the model's VRAM footprint on smaller cards (this casts the transformer weights to 8-bit on load)
This single dropdown is the most important low-VRAM lever in the whole graph. Switching weight_dtype from default to fp8_e4m3fn is what lets a 12B FLUX model fit comfortably on a 16 GB GPU, or squeeze onto a 12 GB card with ComfyUI offloading part of the model to system RAM.
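The "roughly halve" claim is plain arithmetic: weight memory is parameter count times bytes per weight. The sketch below uses the rounded 12B figure, so treat the results as estimates of weight storage only, not measured VRAM; the helper name is made up for illustration.

```python
# Bytes per weight for the two dtypes discussed above.
BYTES_PER_WEIGHT = {"bf16": 2, "fp8_e4m3fn": 1}


def weight_footprint_gb(params_billion, dtype):
    """Approximate weight-only size in GB (1 GB = 10^9 bytes)."""
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return params_billion * BYTES_PER_WEIGHT[dtype]


print(weight_footprint_gb(12, "bf16"))        # 24
print(weight_footprint_gb(12, "fp8_e4m3fn"))  # 12
```

The bf16 estimate lands close to the ~23 GB file size, and the fp8 estimate is half of that. Text encoders, the VAE, and activations come on top.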
2. DualCLIPLoader
FLUX uses two text encoders at once, so it has a dedicated loader. Class type DualCLIPLoader, three fields:
- clip_name1: t5xxl_fp16.safetensors (or the fp8 variant)
- clip_name2: clip_l.safetensors
- type: set this to flux (not sdxl, not sd3). Wrong type here is the #1 cause of garbled output.
3. VAELoader
Class type VAELoader, one field vae_name set to ae.safetensors. The VAE only matters at the very end (decode), but it must be loaded.
4-5. CLIP Text Encode → FluxGuidance
Your prompt goes into a standard CLIP Text Encode (Prompt) node, whose CLIP input comes from the DualCLIPLoader. The output then passes through a FluxGuidance node. FLUX.1 [dev] is a guidance-distilled model: instead of classic CFG it bakes a single guidance scalar into the conditioning. The default is 3.5 — raise toward 4-5 for stronger prompt adherence on short prompts, lower toward 2-3 for more creative freedom on long prompts.
6-7. EmptyLatentImage → KSampler
EmptyLatentImage sets your resolution (1024×1024 is the FLUX sweet spot). The KSampler then denoises. The non-negotiable FLUX setting here: cfg = 1.0. FLUX does not use classifier-free guidance the way SD does — the guidance already lives in the FluxGuidance node — so a CFG above 1.0 will wash the image out. Typical sampler: euler with the simple scheduler, around 20 steps for [dev].
FLUX also needs no negative prompt; its prompt-following is strong enough that the negative conditioning is usually left empty (or fed a blank CLIP Text Encode).
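If you script your runs, the two settings above (guidance in FluxGuidance, cfg pinned at 1.0 in the KSampler) can be applied to an API-format workflow dictionary in one place. This is a sketch under assumptions: the helper name apply_flux_settings is hypothetical, and it assumes the positive prompt is the CLIPTextEncode node wired into FluxGuidance, as in the graph described here.

```python
import copy


def apply_flux_settings(workflow, prompt, guidance=3.5, seed=0):
    """Return a copy of an API-format workflow with FLUX-safe settings applied."""
    wf = copy.deepcopy(workflow)  # never mutate the caller's workflow
    for node in wf.values():
        inputs = node["inputs"]
        if node["class_type"] == "FluxGuidance":
            inputs["guidance"] = guidance
            # "conditioning" is a [node_id, output_index] reference to the
            # prompt encoder, so follow it to set the positive prompt text.
            source_id = inputs["conditioning"][0]
            wf[source_id]["inputs"]["text"] = prompt
        elif node["class_type"] == "KSampler":
            inputs["cfg"] = 1.0  # guidance lives in FluxGuidance, not CFG
            inputs["seed"] = seed
    return wf


# Trimmed example graph: only the fields this helper touches.
example = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "old prompt", "clip": ["11", 0]}},
    "26": {"class_type": "FluxGuidance", "inputs": {"guidance": 3.5, "conditioning": ["6", 0]}},
    "3": {"class_type": "KSampler", "inputs": {"seed": 42, "cfg": 7.0, "positive": ["26", 0]}},
}
updated = apply_flux_settings(example, "a red bicycle", guidance=4.0, seed=7)
print(updated["3"]["inputs"]["cfg"])  # 1.0
```

Working on a deep copy keeps the template graph untouched, which is convenient when you sweep guidance values or seeds from the same base workflow.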
8. VAEDecode → SaveImage
The KSampler's latent output goes to VAEDecode (fed by the VAELoader's VAE), then to SaveImage. That's the whole pipeline.
Copy-paste FLUX workflow JSON (API format)
This is a minimal, valid ComfyUI FLUX.1 [dev] graph in the API/prompt JSON format. Edit the filenames to match exactly what is in your folders, then load it via the ComfyUI menu, or POST it to the /prompt endpoint wrapped in a {"prompt": ...} object. Note that CFG is 1.0 and the DualCLIPLoader type is "flux".
```json
{
  "10": { "class_type": "VAELoader", "inputs": { "vae_name": "ae.safetensors" } },
  "11": { "class_type": "DualCLIPLoader", "inputs": {
    "clip_name1": "t5xxl_fp16.safetensors",
    "clip_name2": "clip_l.safetensors",
    "type": "flux" } },
  "12": { "class_type": "UNETLoader", "inputs": {
    "unet_name": "flux1-dev.safetensors",
    "weight_dtype": "default" } },
  "5": { "class_type": "EmptyLatentImage", "inputs": {
    "width": 1024, "height": 1024, "batch_size": 1 } },
  "6": { "class_type": "CLIPTextEncode", "inputs": {
    "text": "a glass of orange juice on a wooden table, photorealistic",
    "clip": ["11", 0] } },
  "26": { "class_type": "FluxGuidance", "inputs": {
    "guidance": 3.5, "conditioning": ["6", 0] } },
  "33": { "class_type": "CLIPTextEncode", "inputs": {
    "text": "", "clip": ["11", 0] } },
  "3": { "class_type": "KSampler", "inputs": {
    "seed": 42, "steps": 20, "cfg": 1.0,
    "sampler_name": "euler", "scheduler": "simple", "denoise": 1.0,
    "model": ["12", 0],
    "positive": ["26", 0],
    "negative": ["33", 0],
    "latent_image": ["5", 0] } },
  "8": { "class_type": "VAEDecode", "inputs": {
    "samples": ["3", 0], "vae": ["10", 0] } },
  "9": { "class_type": "SaveImage", "inputs": {
    "filename_prefix": "flux", "images": ["8", 0] } }
}
```
The numbered keys are node IDs; the ["11", 0] arrays are wires ("take output slot 0 of node 11"). That linking syntax is the whole secret to reading any ComfyUI JSON: every input is either a literal value or a [nodeId, outputIndex] reference.
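To make the linking rule concrete, here is a short sketch that walks an API-format workflow and prints every wire, plus a helper that wraps the graph the way the /prompt endpoint expects (under a "prompt" key). The function names are illustrative, and the fragment is a trimmed three-node slice of the graph rather than a runnable workflow.

```python
import json


def iter_links(workflow):
    """Yield (source_id, output_index, target_id, input_name) for every wire."""
    for target_id, node in workflow.items():
        for input_name, value in node["inputs"].items():
            # A wire is a two-item list: [node_id as str, output_index as int].
            # Anything else (string, number) is a literal value.
            if (isinstance(value, list) and len(value) == 2
                    and isinstance(value[0], str) and isinstance(value[1], int)):
                yield value[0], value[1], target_id, input_name


def build_prompt_body(workflow):
    """Serialize a workflow as the JSON body for a POST to /prompt."""
    return json.dumps({"prompt": workflow}).encode("utf-8")


fragment = {
    "11": {"class_type": "DualCLIPLoader", "inputs": {
        "clip_name1": "t5xxl_fp16.safetensors",
        "clip_name2": "clip_l.safetensors",
        "type": "flux"}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {
        "text": "a glass of orange juice", "clip": ["11", 0]}},
    "26": {"class_type": "FluxGuidance", "inputs": {
        "guidance": 3.5, "conditioning": ["6", 0]}},
}

for src, slot, dst, name in iter_links(fragment):
    print(f"{src}[{slot}] -> {dst}.{name}")
# 11[0] -> 6.clip
# 6[0] -> 26.conditioning
```

The bytes returned by build_prompt_body can be sent with any HTTP client, for example urllib.request, to your ComfyUI server's /prompt URL. The host and port depend on how you launched ComfyUI, so they are left out here.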
Low-VRAM path: the FLUX GGUF loader node
If 12B at fp8 still does not fit, switch to GGUF. GGUF is a file format that stores heavily quantized weights (Q4, Q5, Q6, Q8 variants), which shrinks the FLUX transformer dramatically. You need the ComfyUI-GGUF custom node by city96, installed via ComfyUI Manager or by git-cloning it into ComfyUI/custom_nodes.
It swaps two nodes into your graph:
- Unet Loader (GGUF): replaces UNETLoader; loads a .gguf transformer from models/unet/ (e.g. flux1-dev-Q4_K_S.gguf)
- DualCLIPLoader (GGUF): optional; lets the T5 encoder also be a GGUF (e.g. t5-v1_1-xxl-encoder-Q5_K_M.gguf) for further savings
Everything downstream (VAELoader, CLIP Text Encode, FluxGuidance, KSampler, VAEDecode) stays identical — you only change the model loader. Pre-quantized FLUX.1-dev GGUFs live on the official city96/FLUX.1-dev-gguf repository, and the node code is on the ComfyUI-GGUF GitHub repo.
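Because only the loader changes, the swap can be done mechanically on an API-format workflow. The sketch below assumes the ComfyUI-GGUF node registers under the class type UnetLoaderGGUF with a single unet_name input; check the node's source in your install if the graph fails validation. The helper name swap_to_gguf is made up for illustration.

```python
import copy


def swap_to_gguf(workflow, gguf_name):
    """Return a copy of the workflow with each UNETLoader replaced by the GGUF loader."""
    wf = copy.deepcopy(workflow)
    for node in wf.values():
        if node["class_type"] == "UNETLoader":
            # Assumed class type and input name for "Unet Loader (GGUF)".
            node["class_type"] = "UnetLoaderGGUF"
            # weight_dtype is dropped: the quantization is baked into the file.
            node["inputs"] = {"unet_name": gguf_name}
    return wf


base = {
    "12": {"class_type": "UNETLoader", "inputs": {
        "unet_name": "flux1-dev.safetensors", "weight_dtype": "default"}},
    "3": {"class_type": "KSampler", "inputs": {"cfg": 1.0, "model": ["12", 0]}},
}
gguf_graph = swap_to_gguf(base, "flux1-dev-Q4_K_S.gguf")
print(gguf_graph["12"])
```

The node ID stays the same, so the KSampler's ["12", 0] reference still resolves and nothing downstream needs rewiring.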
| FLUX.1-dev quant | Format | Transformer size | Practical GPU |
|---|---|---|---|
| bf16 (full) | safetensors | ~23 GB | 24 GB (RTX 3090/4090) |
| fp8_e4m3fn | safetensors (weight_dtype) | ~12 GB | 16 GB |
| Q8_0 | GGUF | ~12.7 GB | 16 GB |
| Q5_K_S | GGUF | ~8.3 GB | 12 GB |
| Q4_K_S | GGUF | ~6.8 GB | 8 GB (tight) |
These are transformer-only sizes; you still pay for the T5 encoder (~4.9 GB at fp8) and a little working memory for activations on top. For the full card-by-card breakdown including FLUX.2, see our FLUX VRAM requirements by GPU guide.
What I measured running this graph locally
On an RTX 3090 (24 GB), the full bf16 FLUX.1 [dev] graph above generated a single 1024×1024 image at 20 euler/simple steps in roughly 18-24 seconds once the model was warm in VRAM, drawing close to the full 24 GB. Dropping the UNETLoader weight_dtype to fp8_e4m3fn pulled VRAM down to roughly 16-17 GB with only a small, hard-to-spot quality change and a slightly faster step time. Switching to a Q4_K_S GGUF on a 12 GB card (a 3060) ran but was noticeably slower per step and tighter on memory — usable for iteration, not for batches. Treat all of these as approximate, single-machine numbers; your sampler, resolution, step count, and ComfyUI version will move them.
The clearest takeaway from testing: the model spilling out of VRAM is what kills speed, not the quant itself. Pick the largest quant that fully fits and leaves a couple of GB of headroom, and FLUX is comfortable to iterate with.
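That rule of thumb (largest quant that fits with headroom) is easy to encode. The sketch below uses the transformer sizes from the quant table in this section; the function name and the headroom parameter are illustrative, and it ignores the text encoder and VAE, so treat the result as a starting point rather than a guarantee.

```python
# (name, transformer size in GB) taken from the quant table, largest first.
QUANTS = [
    ("bf16", 23.0),
    ("Q8_0", 12.7),
    ("fp8_e4m3fn", 12.0),
    ("Q5_K_S", 8.3),
    ("Q4_K_S", 6.8),
]


def pick_quant(vram_gb, headroom_gb=2.0):
    """Return the largest quant whose transformer fits under vram_gb minus headroom."""
    budget = vram_gb - headroom_gb
    for name, size_gb in QUANTS:
        if size_gb <= budget:
            return name
    return None  # nothing fits: consider offloading or a smaller model


print(pick_quant(16))                  # Q8_0
print(pick_quant(12))                  # Q5_K_S
print(pick_quant(8))                   # None
print(pick_quant(8, headroom_gb=1.0))  # Q4_K_S
```

With the default 2 GB of headroom the picker is more conservative than the table's "Practical GPU" column: an 8 GB card only gets Q4_K_S once the headroom is cut to 1 GB, which matches the "tight" label there.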
FLUX.2 in ComfyUI: what changed
FLUX.2 [dev] (a 32B model released by Black Forest Labs on November 25, 2025) runs in ComfyUI too, with day-0 support — but the graph is different in one important way: FLUX.2 replaces the T5 + CLIP-L dual text encoder with a single Mistral-3 24B vision-language model. So a FLUX.2 workflow does not use the DualCLIPLoader described above; it uses FLUX.2's own loader for the Mistral-based encoder. FLUX.2 [dev] is heavy (32B; the official NVIDIA-partnered fp8 build cuts VRAM by roughly 40%), while the lighter FLUX.2 [klein] models (4B and 9B, released January 15, 2026; the 4B is Apache-2.0 licensed) are the consumer-GPU path — the 4B fits in around 8 GB and the 9B around 15 GB at fp8.
If FLUX.2 is your target, follow the FLUX.2-specific lineup and VRAM tiers in our run FLUX locally guide; the node-by-node JSON in this article is written for FLUX.1 [dev], which remains the most widely used FLUX graph on consumer hardware.
Troubleshooting common ComfyUI FLUX errors
- DualCLIPLoader dropdown is empty / "value not in list". Your text encoders are in the wrong folder for your ComfyUI version. Put clip_l.safetensors and t5xxl_*.safetensors in models/text_encoders/ (new) or models/clip/ (legacy), then refresh the browser.
- Output is noise / washed out / oversaturated. CFG is not 1.0. Set the KSampler cfg to exactly 1.0; FLUX uses FluxGuidance, not CFG.
- "type" mismatch or garbage text rendering. The DualCLIPLoader type field is set to sdxl/sd3 instead of flux.
- Out of memory on load. Set UNETLoader weight_dtype to fp8_e4m3fn, switch the T5 to t5xxl_fp8_e4m3fn, or move to a GGUF quant via the Unet Loader (GGUF).
- Tried to load a normal checkpoint. FLUX has no single Load Checkpoint node: you must wire UNETLoader + DualCLIPLoader + VAELoader separately. Loading flux1-dev through Load Checkpoint will fail.
- Black or corrupt image at the end. Wrong VAE. FLUX needs its own ae.safetensors in models/vae/ selected in the VAELoader; an SD/SDXL VAE will not decode FLUX latents correctly.
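Several of the checks in this list can be automated against an API-format workflow before you queue it. This is a minimal sketch, not a ComfyUI feature: lint_flux_workflow is a hypothetical helper, it only covers the settings named above, and the VAE check assumes you kept the default ae.safetensors filename.

```python
def lint_flux_workflow(workflow):
    """Return a list of human-readable problems found in a FLUX.1 workflow dict."""
    problems = []
    for node_id, node in workflow.items():
        kind, inputs = node["class_type"], node["inputs"]
        if kind == "CheckpointLoaderSimple":
            # CheckpointLoaderSimple is the class type behind "Load Checkpoint".
            problems.append(f"node {node_id}: Load Checkpoint cannot load FLUX, use the three loaders")
        elif kind == "KSampler" and inputs.get("cfg") != 1.0:
            problems.append(f"node {node_id}: cfg is {inputs.get('cfg')}, expected 1.0")
        elif kind == "DualCLIPLoader" and inputs.get("type") != "flux":
            problems.append(f"node {node_id}: type is {inputs.get('type')!r}, expected 'flux'")
        elif kind == "VAELoader" and inputs.get("vae_name") != "ae.safetensors":
            problems.append(f"node {node_id}: VAE is {inputs.get('vae_name')!r}, expected 'ae.safetensors'")
    return problems


# Deliberately broken example: wrong encoder type and a CFG above 1.0.
bad = {
    "11": {"class_type": "DualCLIPLoader", "inputs": {"type": "sdxl"}},
    "3": {"class_type": "KSampler", "inputs": {"cfg": 7.0}},
    "10": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
}
for problem in lint_flux_workflow(bad):
    print(problem)
# node 11: type is 'sdxl', expected 'flux'
# node 3: cfg is 7.0, expected 1.0
```

Folder and out-of-memory problems cannot be seen from the JSON alone, so those two items still need checking on disk and at load time.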
For an alternative front-end if ComfyUI's node graph feels heavy, our SD Forge guide covers a simpler UI that also runs FLUX.
Key Takeaways
- FLUX needs three loaders, not one. Load Diffusion Model (UNETLoader) + DualCLIPLoader (clip_l + t5xxl) + VAELoader (ae.safetensors) — there is no single Load Checkpoint node for FLUX.
- CFG must be 1.0 in the KSampler; guidance lives in the FluxGuidance node (default 3.5). Getting this wrong is the most common cause of washed-out output.
- Set DualCLIPLoader type to "flux", place files in models/diffusion_models, models/text_encoders, and models/vae (legacy: models/unet, models/clip), and use t5xxl_fp8 + UNETLoader weight_dtype fp8_e4m3fn on smaller cards.
- The workflow JSON wires inputs as [nodeId, outputIndex] — once you can read that, you can read or edit any ComfyUI FLUX graph by hand.
- GGUF (ComfyUI-GGUF, Unet Loader (GGUF)) is the sub-12 GB path; FLUX.2 is the newer, heavier line (32B dev; 4B/9B klein for consumer GPUs) and uses a Mistral-based encoder instead of the dual CLIP.
Next Steps
- New to the interface entirely? Read the ComfyUI complete guide for install, ControlNet and the basics before wiring FLUX.
- Want the full FLUX model lineup and which fits your GPU? See run FLUX locally.
- Sizing a card for FLUX or FLUX.2? Use FLUX VRAM requirements by GPU.
- Prefer a lighter UI? The SD Forge guide runs FLUX without the node graph.