Open WebUI Setup Guide: Local ChatGPT with Ollama
Open WebUI is a free, self-hosted ChatGPT alternative with 126,000+ GitHub stars that gives you a polished chat interface for local AI models. Install it in one command: docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main. It supports RAG, voice, multi-user accounts, and works with any Ollama model.
Open WebUI Quick Start (5 Minutes)
# 1. Install Ollama (if not already)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
# 2. Run Open WebUI with Docker
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
ghcr.io/open-webui/open-webui:main
# 3. Open http://localhost:3000 → Create account → Chat!
What Is Open WebUI?
Open WebUI is the most popular self-hosted AI chat interface, providing a polished ChatGPT-like experience for local models running through Ollama. It is the default recommendation in virtually every "run AI locally" tutorial, and for good reason: it works out of the box, looks professional, and packs features that rival commercial platforms.
Open WebUI at a Glance
| Metric | Detail |
|---|---|
| GitHub Stars | 126,000+ (top 200 on all of GitHub) |
| Contributors | 400+ |
| License | MIT (fully permissive) |
| First Release | October 2023 (as Ollama WebUI) |
| Current Version | 0.5.x (March 2026) |
| Backend | Python (FastAPI) |
| Frontend | SvelteKit |
| Docker Image Size | ~1.5 GB |
| Supported Backends | Ollama, OpenAI API, any OpenAI-compatible endpoint |
Why Open WebUI Over CLI?
Running ollama run llama3.2 in a terminal works, but Open WebUI gives you:
- Conversation history — All chats saved and searchable
- Multi-model switching — Jump between models mid-conversation
- Document upload (RAG) — Ask questions about your PDFs and files
- Voice input/output — Talk to your AI hands-free
- Image generation — Connect AUTOMATIC1111 or ComfyUI
- Web search — Pull live information into responses
- Multi-user accounts — Share one server with your team or family
- Model management — Pull, delete, and create models from the UI
- Custom system prompts — Save personas and reuse them
- API access — OpenAI-compatible API for integrations
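As a sketch of that last point: Open WebUI exposes OpenAI-compatible endpoints, so existing OpenAI client code can point at your local instance. This assumes you have created an API key in Open WebUI under Settings → Account; the key below is a placeholder:

```shell
# Placeholder key — generate your own in Open WebUI: Settings → Account → API Keys
OPENWEBUI_API_KEY="sk-your-key-here"

# OpenAI-compatible chat completion against a local Open WebUI instance
curl http://localhost:3000/api/chat/completions \
  -H "Authorization: Bearer $OPENWEBUI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Any tool that accepts a custom OpenAI base URL can use `http://localhost:3000/api` the same way.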
Prerequisites
Before installing Open WebUI, you need two things:
1. Docker Desktop
Open WebUI runs as a Docker container. Install Docker Desktop for your platform:
- macOS: Download from docker.com or brew install --cask docker
- Windows: Download Docker Desktop from docker.com (requires WSL2)
- Linux: Install Docker Engine:
# Ubuntu/Debian
sudo apt update && sudo apt install docker.io docker-compose-v2
sudo usermod -aG docker $USER
# Log out and back in for group changes
Verify Docker is running:
docker --version
# Docker version 27.x.x
2. Ollama
Open WebUI connects to Ollama for running local models. If you haven't installed it:
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows — download installer from ollama.com
Start Ollama and pull a model:
ollama serve # Start the Ollama server (runs on port 11434)
ollama pull llama3.2 # 2GB download, great starter model
For a detailed Ollama installation walkthrough, see our Ollama Windows installation guide or Mac local AI setup guide.
Installation: Docker (Recommended)
Standard Setup — Ollama on Same Machine
This is the most common setup. Open WebUI runs in Docker and connects to Ollama running on your host machine:
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
What each flag does:
| Flag | Purpose |
|---|---|
-d | Run in background (detached) |
-p 3000:8080 | Map host port 3000 to container port 8080 |
--add-host=host.docker.internal:host-gateway | Let container reach host's Ollama |
-v open-webui:/app/backend/data | Persist data (chats, settings, uploads) |
--name open-webui | Name the container for easy management |
--restart always | Auto-start on boot |
Bundled Setup — Ollama Inside Docker
If you want everything in one container (no separate Ollama install needed):
docker run -d -p 3000:8080 \
--gpus all \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
Note: The :ollama tag bundles Ollama inside the container. Use --gpus all for NVIDIA GPU passthrough (requires NVIDIA Container Toolkit on Linux).
Remote Ollama Setup
Connect to Ollama running on another machine on your network:
docker run -d -p 3000:8080 \
-e OLLAMA_BASE_URL=http://192.168.1.100:11434 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
On the remote machine, ensure Ollama allows external connections:
# On the Ollama host machine
OLLAMA_HOST=0.0.0.0 ollama serve
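Before starting the Open WebUI container, it's worth confirming the remote Ollama instance is actually reachable from the machine that will run it (substitute your own host's IP):

```shell
# Should return a JSON version object if Ollama is listening externally
curl http://192.168.1.100:11434/api/version
```

If this times out, check the remote machine's firewall and that Ollama was started with OLLAMA_HOST=0.0.0.0.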
Verify Installation
# Check container is running
docker ps
# Should show:
# CONTAINER ID IMAGE STATUS PORTS
# abc123 ghcr.io/open-webui/open-webui:main Up 2 minutes 0.0.0.0:3000->8080/tcp
# View logs if needed
docker logs open-webui
Open your browser to http://localhost:3000. You should see the Open WebUI login screen.
Installation: pip (Alternative)
If you prefer not to use Docker:
# Requires Python 3.11+
pip install open-webui
# Start the server
open-webui serve
# Server starts on http://localhost:8080
The pip install is simpler but has caveats:
- You manage Python dependencies yourself
- Updates require pip install --upgrade open-webui
- Less isolation than Docker
- Good for development and testing
First-Time Setup
Step 1: Create Your Admin Account
When you first open http://localhost:3000, you'll see a sign-up form. The first account you create automatically becomes the administrator. Choose a strong password — this account controls all settings.
Step 2: Verify Ollama Connection
After logging in:
- Click the model dropdown at the top of the chat
- You should see your Ollama models listed (e.g., llama3.2)
- If no models appear, check Admin → Settings → Connections
If Ollama is not detected:
- Ensure Ollama is running (ollama serve)
- Verify the connection URL in Settings (default: http://host.docker.internal:11434 for Docker, http://localhost:11434 for pip)
Step 3: Pull Additional Models
You can pull models directly from the Open WebUI interface:
- Go to Admin → Settings → Models
- Enter a model name (e.g., qwen2.5:7b) and click Pull
- Or continue using the terminal: ollama pull qwen2.5:7b
Recommended starter models by hardware:
| Your RAM/VRAM | Recommended Model | Pull Command |
|---|---|---|
| 8GB | llama3.2 (3B) | ollama pull llama3.2 |
| 16GB | llama3.1:8b | ollama pull llama3.1:8b |
| 24GB+ | qwen2.5:32b | ollama pull qwen2.5:32b |
| 32GB+ (Apple Silicon) | deepseek-r1:32b | ollama pull deepseek-r1:32b |
For a detailed model comparison, see our guide to the best local AI models for 8GB RAM.
Step 4: Start Chatting
Select a model from the dropdown and start typing. Open WebUI supports:
- Markdown rendering in responses
- Code syntax highlighting with copy buttons
- LaTeX math rendering
- Conversation branching — edit and regenerate from any message
- System prompts — set custom instructions per chat
Key Features Deep Dive
RAG: Chat with Your Documents
Open WebUI's built-in RAG (Retrieval-Augmented Generation) lets you upload documents and ask questions about them.
How to use RAG:
- In any chat, click the + button or drag-and-drop files
- Supported formats: PDF, TXT, DOCX, CSV, MD, HTML, EPUB
- Open WebUI chunks the document, generates embeddings, and stores them locally
- Ask questions — the AI retrieves relevant sections automatically
Configure embeddings for better accuracy:
- Pull an embedding model: ollama pull nomic-embed-text
- Go to Admin → Settings → Documents
- Set the embedding model to nomic-embed-text
- Adjust chunk size (the default of 1500 tokens works well for most documents)
Create Knowledge Bases:
For recurring reference documents (company docs, codebases, research papers):
- Go to Workspace → Knowledge
- Create a new collection
- Upload multiple files — they're indexed together
- Reference the collection in any chat with #collection-name
Web Search Integration
Enable web search to give your local AI access to current information:
- Go to Admin → Settings → Web Search
- Choose a search provider:
- SearXNG (self-hosted, private — recommended)
- Google PSE (requires API key)
- Brave Search (requires API key)
- DuckDuckGo (no key needed, rate-limited)
- Toggle on web search in your chat with the globe icon
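If you choose SearXNG, a minimal self-hosted instance can be sketched with Docker. This assumes the official searxng/searxng image; note that Open WebUI's integration queries SearXNG's JSON output, which you may need to enable in the instance's settings.yml:

```shell
# Run SearXNG on host port 8081 (8080 is taken by Open WebUI's container port mapping)
docker run -d -p 8081:8080 \
  -v searxng:/etc/searxng \
  --name searxng \
  --restart always \
  searxng/searxng:latest
```

Then in Admin → Settings → Web Search, point the SearXNG query URL at your instance (from the Open WebUI container, typically http://host.docker.internal:8081).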
Voice Input and Output
Open WebUI supports speech-to-text and text-to-speech:
- Speech-to-Text: Click the microphone icon to dictate. Uses your browser's Web Speech API by default, or configure Whisper (local) for better accuracy
- Text-to-Speech: Enable in Settings → Audio. Responses are read aloud using browser TTS or a configured TTS engine
Image Generation
Connect a local image generation backend:
- Go to Admin → Settings → Images
- Set the engine to AUTOMATIC1111 or ComfyUI
- Enter the API URL (e.g., http://localhost:7860 for AUTOMATIC1111)
- Generate images directly in chat by asking the AI to create them
Custom Modelfiles
Create specialized personas with custom system prompts and parameters:
- Go to Workspace → Modelfiles
- Click Create New
- Set a name, system prompt, and base model
- Example: Create a "Code Reviewer" persona using qwen2.5-coder:7b with instructions to focus on bugs, security, and performance
Pipelines and Functions
Open WebUI supports custom Python functions that run alongside your AI:
- Filters — Modify messages before/after they reach the model
- Actions — Add custom buttons to the chat interface
- Pipes — Connect external APIs or custom logic
Install community pipelines from the admin panel or write your own in Python.
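Pipelines run in a separate companion server rather than inside the main container. A minimal sketch, assuming the official ghcr.io/open-webui/pipelines image and its default port 9099:

```shell
# Run the Pipelines server alongside Open WebUI
docker run -d -p 9099:9099 \
  --add-host=host.docker.internal:host-gateway \
  -v pipelines:/app/pipelines \
  --name pipelines \
  --restart always \
  ghcr.io/open-webui/pipelines:main
```

Then register it in Open WebUI under Admin → Settings → Connections as an OpenAI API connection pointing at http://host.docker.internal:9099.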
Docker Compose Setup (Production)
For a more manageable setup, use Docker Compose:
# docker-compose.yml
version: '3.8'
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    # Remove the deploy block below if you don't have an NVIDIA GPU
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: always
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui_data:/app/backend/data
    depends_on:
      - ollama
    restart: always
volumes:
  ollama_data:
  open-webui_data:
Start everything:
docker compose up -d
This setup runs Ollama and Open WebUI as separate containers with proper networking, GPU passthrough, and persistent storage.
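A quick sanity check after bringing the stack up:

```shell
docker compose ps                        # both services should show "running"
docker compose logs -f open-webui        # watch startup logs for errors
curl http://localhost:11434/api/version  # Ollama responds on the mapped host port
```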
Administration and Multi-User Setup
User Management
As admin, you control who can access your Open WebUI instance:
- Admin → Settings → General:
  - Enable/disable new user registration
  - Set default user role (user or pending approval)
  - Configure session timeout
- Admin → Users:
  - View all registered users
  - Change user roles (admin, user)
  - Delete accounts
Security Best Practices
If exposing Open WebUI beyond localhost:
- Use a reverse proxy (Nginx, Caddy, Traefik) with HTTPS
- Disable registration after creating accounts
- Set a strong admin password
- Use a firewall to restrict access by IP if on a local network
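On a Linux host, the firewall rule can be sketched with ufw (adjust the subnet to match your network). One caveat: Docker publishes ports through its own iptables rules, which can bypass ufw, so the more reliable restriction is to bind the port to localhost (e.g. -p 127.0.0.1:3000:8080) and expose it only through your reverse proxy:

```shell
sudo ufw allow ssh                                            # keep remote access first
sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp # LAN only
sudo ufw enable
sudo ufw status
```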
Example Nginx reverse proxy config:
server {
    listen 443 ssl;
    server_name ai.yourdomain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Hardware Requirements
Open WebUI itself is very lightweight. The heavy lifting is done by Ollama and your AI models.
Minimum Requirements
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 4 cores | 8+ cores |
| RAM | 8 GB | 16-32 GB |
| Storage | 10 GB free | 50+ GB (for multiple models) |
| GPU | None (CPU inference) | 8+ GB VRAM |
| Network | Local only | LAN for multi-user |
| Docker | Docker Desktop 4.x+ | Latest stable |
Performance by Hardware Tier
| Hardware | Best Model | Speed (tok/s) | Experience |
|---|---|---|---|
| 8GB RAM, no GPU | llama3.2 (3B) | 5-10 | Basic chat, slow |
| 16GB RAM, RTX 3060 | llama3.1:8b | 30-50 | Good for daily use |
| 32GB RAM, RTX 4090 | qwen2.5:32b | 40-60 | Excellent, near-GPT-4 quality |
| 64GB, Mac M4 Max | deepseek-r1:32b | 30-45 | Great reasoning, smooth |
| 48GB, RTX 5090 | llama3.3:70b (Q4) | 20-30 | Top-tier local experience |
For detailed GPU comparisons, see our best GPUs for AI guide and VRAM requirements guide.
Updating Open WebUI
Docker Update
# Pull the latest image
docker pull ghcr.io/open-webui/open-webui:main
# Stop and remove the old container
docker stop open-webui
docker rm open-webui
# Re-run with the same command you used originally
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Your chats, settings, and uploads are preserved in the open-webui Docker volume.
Docker Compose Update
docker compose pull
docker compose up -d
pip Update
pip install --upgrade open-webui
open-webui serve
Troubleshooting
"Cannot connect to Ollama"
This is the most common issue. Fix it step by step:
# 1. Verify Ollama is running
ollama serve
# or: curl http://localhost:11434/api/version
# 2. Check if Docker can reach the host
docker exec open-webui curl http://host.docker.internal:11434/api/version
# 3. If using Linux, the host.docker.internal alias may not work.
# Use your machine's IP instead:
docker run -d -p 3000:8080 \
-e OLLAMA_BASE_URL=http://172.17.0.1:11434 \
-v open-webui:/app/backend/data \
--name open-webui \
ghcr.io/open-webui/open-webui:main
Models Not Showing Up
# Verify models are installed
ollama list
# Pull a model if none exist
ollama pull llama3.2
# Refresh models in Open WebUI
# Click the refresh icon next to the model dropdown
Slow Response Times
- Check GPU usage: ollama ps — verify the model is running on GPU, not CPU
- Use a smaller model: Switch from 32B to 8B for faster responses
- Increase Docker memory: Docker Desktop → Settings → Resources → increase RAM
- Close other GPU apps: Games, video editors, and other AI tools compete for VRAM
Container Won't Start
# Check logs for errors
docker logs open-webui
# Common fix: port conflict
# Change 3000 to another port if already in use
docker run -d -p 8080:8080 ...
# Reset everything (WARNING: deletes all data)
# docker volume rm open-webui
RAG Not Working
- Verify an embedding model is set: Admin → Settings → Documents
- Pull the embedding model: ollama pull nomic-embed-text
- Re-upload documents after changing the embedding model
- Check file format is supported (PDF, TXT, DOCX, CSV, MD)
Open WebUI vs Alternatives
| Feature | Open WebUI | Jan | LM Studio | AnythingLLM | text-generation-webui |
|---|---|---|---|---|---|
| GitHub Stars | 126K+ | 27K+ | N/A (closed) | 30K+ | 41K+ |
| Interface | Web (browser) | Desktop app | Desktop app | Web/Desktop | Web |
| Multi-user | Yes | No | No | Yes | No |
| RAG Built-in | Yes | No | No | Yes | No |
| Voice I/O | Yes | No | No | Yes | Yes |
| Image Gen | Yes (via backends) | No | No | No | Yes |
| Web Search | Yes | No | No | Yes | Yes |
| Docker | Yes (primary) | No | No | Yes | Yes |
| Mobile-friendly | Yes (responsive) | No | No | Yes | No |
| Ollama Support | Native | Native | Own backend | Via API | Via API |
When to Choose Open WebUI
- You want a ChatGPT-like web interface for local AI
- You need multi-user access for a team or family
- You want RAG (document chat) without extra setup
- You prefer Docker-based deployment
- You want the largest community and fastest development
When to Choose Alternatives
- Jan: Simpler desktop app, no Docker needed — see our Jan vs LM Studio vs Ollama comparison
- LM Studio: Integrated model downloading + inference (no Ollama needed)
- AnythingLLM: Advanced RAG with workspace management — see our AnythingLLM setup guide
Environment Variables Reference
Customize Open WebUI behavior with environment variables (pass with -e in Docker):
| Variable | Default | Description |
|---|---|---|
OLLAMA_BASE_URL | http://localhost:11434 | Ollama API endpoint |
WEBUI_SECRET_KEY | Auto-generated | JWT secret for auth |
ENABLE_SIGNUP | true | Allow new user registration |
DEFAULT_USER_ROLE | pending | Role for new users |
WEBUI_AUTH | true | Enable authentication |
DATA_DIR | /app/backend/data | Data storage path |
ENABLE_RAG_WEB_SEARCH | false | Enable web search for RAG |
RAG_EMBEDDING_MODEL | sentence-transformers/all-MiniLM-L6-v2 | Default embedding model |
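For example, to lock down a shared instance at container start using the variables listed above:

```shell
# Disable open signup, hold new users for approval, and pin the JWT secret
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e ENABLE_SIGNUP=false \
  -e DEFAULT_USER_ROLE=pending \
  -e WEBUI_SECRET_KEY="$(openssl rand -hex 32)" \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

Pinning WEBUI_SECRET_KEY keeps user sessions valid across container re-creation instead of logging everyone out.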
Key Takeaways
- Open WebUI is the #1 local AI interface — 126K+ GitHub stars, MIT license, active development
- One Docker command gets you a full ChatGPT-like experience with local models
- RAG, voice, web search, image generation — all built in, no plugins needed
- Multi-user support makes it ideal for teams, families, and self-hosted deployments
- Works with any Ollama model plus OpenAI-compatible APIs
- Lightweight frontend — the hardware requirements are driven by your AI models, not Open WebUI itself
- Active community — weekly releases, 400+ contributors, extensive documentation
Next Steps
- Choose the best local AI models for your hardware
- Set up RAG with local models for document-powered AI
- Compare Jan vs LM Studio vs Ollama for alternatives
- Install Continue.dev for AI coding alongside Open WebUI
- Check VRAM requirements to pick the right model size
Open WebUI transforms local AI from a command-line experiment into a polished, daily-driver experience. Whether you're running a single model on a laptop or serving a team from a GPU workstation, Open WebUI provides the interface that makes local AI practical and enjoyable.