Dify Self-Hosted: Deploy Your Own AI Platform
Published on April 10, 2026 • 22 min read
Quick Start: Dify Running in 5 Minutes
Get Dify running locally with three commands:
- Clone the repo: git clone https://github.com/langgenius/dify.git && cd dify/docker
- Copy config: cp .env.example .env
- Launch everything: docker compose up -d
Open http://localhost and create your admin account. That's your private AI platform.
What you'll get from this guide:
- A fully private Dify instance on your own hardware
- Ollama connected as a zero-cost local model provider
- A working RAG pipeline that indexes your documents
- API endpoints you can call from any application
- Production hardening for team deployments
Dify has crossed 53,000 GitHub stars for a reason. It gives you a visual builder for AI workflows, built-in RAG, prompt management, and API generation -- all behind a clean web interface. The self-hosted version is free under the open-source license, with no feature restrictions compared to their cloud offering.
Most people discover Dify after hitting the walls of simpler tools. If you've outgrown basic chatbot UIs and need structured AI applications -- customer support bots, document analysis pipelines, internal knowledge bases -- Dify handles that without writing a framework from scratch.
If you're already using visual AI builders, you'll want to understand how Dify compares to Flowise and n8n for local AI workflows. For RAG specifically, our RAG local setup guide covers the fundamentals that Dify builds on top of.
Table of Contents
- What Is Dify and Why Self-Host
- Architecture Overview
- System Requirements
- Docker Compose Deployment
- Connecting Ollama as Model Provider
- Building Your First AI App
- RAG Pipeline Setup
- API Access and Integration
- Environment Configuration
- Scaling and Production Hardening
- Dify vs Flowise vs n8n
What Is Dify and Why Self-Host {#what-is-dify}
Dify is an open-source LLM application development platform. Think of it as a middle layer between raw model APIs and your end users. You design AI workflows visually, connect knowledge bases, and Dify generates API endpoints you can plug into any frontend.
Why self-host instead of using Dify Cloud:
| Factor | Dify Cloud | Self-Hosted |
|---|---|---|
| Data privacy | Data on Dify servers | Everything stays on your machine |
| Cost at scale | $59-599/month | Free (you pay for hardware) |
| Model freedom | Limited providers | Any model, including local Ollama |
| Customization | Standard features | Fork and modify anything |
| Latency | Internet round-trip | Sub-100ms on local network |
| Team size limits | Plan-dependent | Unlimited |
The self-hosted version includes every feature: workflow builder, RAG engine, agent capabilities, prompt IDE, annotation/logging, and multi-tenant workspaces. The only thing you lose is managed hosting.
Architecture Overview {#architecture-overview}
Dify's Docker Compose stack runs seven services:
┌─────────────────────────────────────────────┐
│ Nginx (port 80) │
│ Reverse proxy + static │
├──────────────────┬──────────────────────────┤
│ Web Frontend │ API Server │
│ (Next.js) │ (Flask/Python) │
├──────────────────┴──────────────────────────┤
│ Worker (Celery) │
│ Background jobs + indexing │
├─────────────┬───────────┬───────────────────┤
│ PostgreSQL │ Redis │ Weaviate │
│ (metadata) │ (cache) │ (vector store) │
└─────────────┴───────────┴───────────────────┘
│
▼
Ollama (external, port 11434)
- Nginx handles routing and serves the frontend
- API server manages all business logic, model routing, and app execution
- Worker processes async tasks: document indexing, dataset embedding, scheduled runs
- PostgreSQL stores users, apps, conversations, datasets metadata
- Redis handles caching, rate limiting, and Celery task queues
- Weaviate is the default vector database for RAG embeddings
The entire stack consumes about 2.5GB RAM at idle. During active RAG indexing, expect 4-6GB.
System Requirements {#system-requirements}
Minimum Hardware
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 4 cores | 8+ cores |
| RAM | 8GB (Dify only) | 16GB+ (Dify + Ollama) |
| Storage | 20GB SSD | 100GB+ SSD |
| OS | Linux, macOS, or WSL2 | Ubuntu 22.04+ |
| Docker | v20.10+ | Latest stable |
| Docker Compose | v2.0+ | v2.20+ |
If you're running Ollama on the same machine, add the model's memory requirements on top. A 7B model needs roughly 4.5GB, so 16GB total RAM is the practical minimum for Dify + a usable model.
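The sizing arithmetic above can be turned into a quick rule-of-thumb calculator. This is a rough sketch: the 0.5 bytes-per-weight figure assumes 4-bit quantization (Ollama's default), and the 20% overhead for KV cache and runtime buffers is an estimate, not a measured value.

```python
def estimate_model_ram_gb(params_billions, bytes_per_weight=0.5, overhead=1.2):
    """Rule-of-thumb RAM estimate for a local model.

    bytes_per_weight: ~0.5 for 4-bit quantization (Ollama's default),
    2.0 for fp16. overhead approximates KV cache and runtime buffers.
    These factors are estimates, not measured values.
    """
    return params_billions * bytes_per_weight * overhead

print(round(estimate_model_ram_gb(7), 1))   # 4.2 -- in line with the ~4.5GB above
print(round(estimate_model_ram_gb(7, bytes_per_weight=2.0), 1))  # 16.8 at fp16
```

Add that to Dify's own footprint to decide whether 16GB is enough for your model choice.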
Software Prerequisites
# Check Docker version
docker --version # Need 20.10+
# Check Docker Compose
docker compose version # Need v2.0+
# Check available RAM
free -h # Linux
vm_stat # macOS
# Check disk space
df -h / # Need 20GB+ free
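If you want to script these preflight checks, the version comparison is the only fiddly part. The helper below is a sketch: it only extracts the first major.minor pair from a tool's `--version` output.

```python
import re

def version_ok(version_output, minimum):
    """Check a string like 'Docker version 24.0.7, build afdd53b'
    against a (major, minor) minimum. Only the first major.minor
    pair found in the string is considered."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum

print(version_ok("Docker version 24.0.7, build afdd53b", (20, 10)))   # True
print(version_ok("Docker version 19.03.8, build afacb8b", (20, 10)))  # False
```

In practice you would feed it the captured output of `docker --version` and `docker compose version`.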
Docker Compose Deployment {#docker-compose-deployment}
Step 1: Clone the Repository
git clone https://github.com/langgenius/dify.git
cd dify/docker
Step 2: Configure Environment Variables
cp .env.example .env
Open .env and set these critical values:
# Security - change these immediately
# Note: .env files are read literally, so $(...) will NOT expand.
# Generate values first (e.g. openssl rand -hex 32) and paste them in.
SECRET_KEY=paste-64-char-hex-string-here
INIT_PASSWORD=your-secure-admin-password
# Database
DB_USERNAME=postgres
DB_PASSWORD=paste-random-db-password-here
DB_HOST=db
DB_PORT=5432
DB_DATABASE=dify
# Redis
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=paste-random-redis-password-here
# Storage (local filesystem for self-hosted)
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/api/storage
# Vector store
VECTOR_STORE=weaviate
WEAVIATE_ENDPOINT=http://weaviate:8080
Step 3: Launch the Stack
docker compose up -d
First launch pulls about 4GB of images. Subsequent starts take under 30 seconds.
Step 4: Verify All Services
# Check all containers are running
docker compose ps
# Expected output:
# dify-api running 0.0.0.0:5001->5001/tcp
# dify-web running 0.0.0.0:3000->3000/tcp
# dify-worker running
# dify-db running 0.0.0.0:5432->5432/tcp
# dify-redis running 0.0.0.0:6379->6379/tcp
# dify-weaviate running 0.0.0.0:8080->8080/tcp
# dify-nginx running 0.0.0.0:80->80/tcp
# Check logs if something fails
docker compose logs api --tail 50
docker compose logs worker --tail 50
Step 5: Create Admin Account
Open http://localhost in your browser. You'll see the setup wizard. Create your admin account with the password you set in INIT_PASSWORD.
Connecting Ollama as Model Provider {#connecting-ollama}
This is where Dify gets interesting for local AI. Instead of paying for OpenAI or Anthropic API calls, you point Dify at your Ollama instance.
Step 1: Make Sure Ollama Is Accessible
If Ollama runs on the same machine as Dify, Docker containers need to reach it via the host network:
# Start Ollama with external access
OLLAMA_HOST=0.0.0.0 ollama serve
# Verify from the host
curl http://localhost:11434/api/tags
# Verify from inside a Dify container (macOS/Windows Docker Desktop)
docker compose exec api curl http://host.docker.internal:11434/api/tags
# On Linux, use your machine's IP instead:
curl http://192.168.1.100:11434/api/tags
Step 2: Pull Models You Need
# Chat model
ollama pull llama3.2:8b
# Embedding model (critical for RAG)
ollama pull nomic-embed-text
# Code model (optional)
ollama pull qwen2.5-coder:7b
Step 3: Add Ollama in Dify Settings
- Go to Settings > Model Providers
- Find Ollama in the list and click Setup
- Enter the base URL:
  - macOS/Windows: http://host.docker.internal:11434
  - Linux: http://YOUR_MACHINE_IP:11434
- Click Save
Dify auto-discovers all models available in your Ollama instance. You'll see them listed immediately.
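Auto-discovery works because Ollama exposes its installed models at the `/api/tags` endpoint. The sketch below shows the shape of that response and how you might list model names yourself; the sample payload is trimmed to just the `name` field (a real response also includes size, digest, and modification time per model).

```python
import json

def list_ollama_models(tags_body):
    """Extract model names from an Ollama GET /api/tags response body."""
    payload = json.loads(tags_body)
    return [m["name"] for m in payload.get("models", [])]

# Sample response body, trimmed to the one field we read
sample = '{"models": [{"name": "llama3.2:8b"}, {"name": "nomic-embed-text:latest"}]}'
print(list_ollama_models(sample))  # ['llama3.2:8b', 'nomic-embed-text:latest']
```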
Step 4: Set Default Models
Go to Settings > Model Providers > System Model Settings and configure:
- Default Chat Model: llama3.2:8b
- Default Embedding Model: nomic-embed-text
- Default Reranking Model: (leave empty unless you have one)
Now every new app you create uses your local models by default. Zero API costs.
Docker Network Configuration (Linux)
On Linux, host.docker.internal may not resolve. Add this to your docker-compose.yml:
services:
api:
extra_hosts:
- "host.docker.internal:host-gateway"
worker:
extra_hosts:
- "host.docker.internal:host-gateway"
Then restart: docker compose up -d
Building Your First AI App {#first-ai-app}
Dify supports four app types. Here's when to use each:
| App Type | Use Case | Example |
|---|---|---|
| Chat | Conversational interface | Customer support bot |
| Completion | Single input/output | Email writer, summarizer |
| Workflow | Multi-step logic | Document processor pipeline |
| Agent | Tool-calling autonomous | Research assistant with web search |
Create a Knowledge Base Chat App
This is the most common use case -- a chatbot that answers questions from your documents.
- Click Create App > Chat App
- Name it "Internal Knowledge Base"
- In the prompt editor, set the system prompt:
You are a helpful assistant that answers questions based on the provided context.
If the context doesn't contain the answer, say so honestly.
Always cite which document your answer comes from.
Do not make up information.
- Under Context, click Add Dataset (we'll create this next)
- Set Model to your Ollama llama3.2:8b
- Set temperature to 0.3 for factual responses
- Click Publish
You now have a working chatbot with an API endpoint. The generated URL looks like:
POST http://localhost/v1/chat-messages
Authorization: Bearer app-xxxxxxxxxxxxxxxx
RAG Pipeline Setup {#rag-pipeline}
RAG (Retrieval-Augmented Generation) is where Dify's self-hosted value really shines. Your documents never leave your server.
Step 1: Create a Dataset
- Go to Knowledge > Create Dataset
- Name it and choose Import from file
- Supported formats: PDF, DOCX, TXT, Markdown, CSV, HTML, XLSX
- Upload your documents
Step 2: Configure Chunking
Dify offers two chunking modes:
Automatic -- Good for most documents:
- Splits by paragraphs and sections
- Chunk size: 500-1000 tokens (default)
- Overlap: 50 tokens
Custom -- Better for structured data:
Chunk size: 800 tokens
Overlap: 100 tokens
Separator: \n\n (double newline)
For technical documentation, I recommend 800-token chunks with 100-token overlap. Smaller chunks give more precise retrieval but lose context. Larger chunks retain context but dilute relevance scoring.
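The trade-off is easy to see in a sliding-window chunker. This sketch counts words as a stand-in for tokens (Dify's splitter is token-based), but the overlap mechanics are the same: each chunk shares its tail with the next chunk's head.

```python
def chunk_words(words, chunk_size=800, overlap=100):
    """Sliding-window chunking: each chunk shares `overlap` words with
    the previous one so context isn't cut mid-thought. Word counts
    stand in for tokens here; Dify counts real tokens."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break
    return chunks

# Tiny numbers to make the overlap visible
print(chunk_words(list(range(10)), chunk_size=4, overlap=1))
# [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Shrinking `chunk_size` yields more, tighter chunks (more precise retrieval, less context each); growing it does the reverse.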
Step 3: Select Embedding Model
Choose nomic-embed-text from your Ollama models. This 137M parameter model produces 768-dimensional embeddings and handles up to 8192 tokens per chunk. It outperforms OpenAI's text-embedding-ada-002 on most retrieval benchmarks while running entirely on your hardware.
Step 4: Index and Test
Click Save and Process. Dify's worker container handles indexing in the background. For a 100-page PDF, expect 2-5 minutes on a 4-core machine.
Test retrieval quality:
- Go to your dataset
- Click Hit Testing
- Enter a question
- Verify the returned chunks contain relevant information
Step 5: Connect Dataset to Your App
Back in your chat app:
- Click Context > Add Dataset
- Select your dataset
- Configure retrieval settings:
- Top K: 3 (number of chunks retrieved)
- Score threshold: 0.5 (minimum relevance)
- Reranking: Enable if you have a reranking model
Test with a question about your documents. The model now answers using your private data.
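Conceptually, Top K and the score threshold combine like this. The function below is an illustrative sketch of the settings above, not Dify's actual implementation.

```python
def retrieve(scored_chunks, top_k=3, score_threshold=0.5):
    """Drop chunks below the relevance threshold, then keep the
    top_k highest-scoring ones.

    scored_chunks: list of (score, text) pairs."""
    kept = [c for c in scored_chunks if c[0] >= score_threshold]
    kept.sort(key=lambda c: c[0], reverse=True)
    return kept[:top_k]

hits = [(0.82, "refund policy"), (0.41, "office hours"),
        (0.66, "returns window"), (0.58, "warranty terms"),
        (0.91, "refunds intro")]
print(retrieve(hits))
# [(0.91, 'refunds intro'), (0.82, 'refund policy'), (0.66, 'returns window')]
```

Raising the threshold trims weakly related chunks at the cost of sometimes returning fewer than Top K results.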
For deeper RAG fundamentals and optimization techniques, check our complete RAG local setup guide.
API Access and Integration {#api-access}
Every Dify app automatically gets a REST API. This is one of Dify's strongest features -- you build the AI logic visually, then consume it as a standard API.
Get Your API Key
- Open your app
- Click Access API in the left sidebar
- Copy the API key
Chat Completion API
curl -X POST 'http://localhost/v1/chat-messages' \
-H 'Authorization: Bearer app-your-api-key' \
-H 'Content-Type: application/json' \
-d '{
"inputs": {},
"query": "What is our refund policy?",
"response_mode": "streaming",
"conversation_id": "",
"user": "user-123"
}'
Streaming Response
import requests
response = requests.post(
"http://localhost/v1/chat-messages",
headers={
"Authorization": "Bearer app-your-api-key",
"Content-Type": "application/json"
},
json={
"inputs": {},
"query": "Summarize Q4 results",
"response_mode": "streaming",
"user": "user-123"
},
stream=True
)
for line in response.iter_lines():
if line:
print(line.decode())
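Each streamed line is a server-sent event of the form `data: {...}`, where `message` events carry incremental `answer` fragments. The parser below sketches how to reassemble them into the full reply; the field names follow Dify's documented streaming format, so verify them against your version.

```python
import json

def extract_answer(sse_lines):
    """Reassemble answer text from Dify-style streaming lines.

    'message' events carry incremental 'answer' fragments; other
    events (e.g. 'message_end') are skipped. Field names assume
    Dify's streaming format -- check them against your version."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("event") == "message":
            parts.append(event.get("answer", ""))
    return "".join(parts)

sample = [
    'data: {"event": "message", "answer": "Q4 revenue "}',
    'data: {"event": "message", "answer": "grew 12%."}',
    'data: {"event": "message_end", "metadata": {}}',
]
print(extract_answer(sample))  # Q4 revenue grew 12%.
```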
Embedding in a Web App
// Next.js / React example
const response = await fetch('http://your-dify-host/v1/chat-messages', {
method: 'POST',
headers: {
'Authorization': 'Bearer app-your-api-key',
'Content-Type': 'application/json',
},
body: JSON.stringify({
inputs: {},
query: userMessage,
response_mode: 'streaming',
conversation_id: conversationId,
user: userId,
}),
})
Dify also provides a built-in embeddable chat widget. Under Access API > Embedded, you get an iframe snippet you can drop into any webpage.
Environment Configuration {#environment-config}
Critical Environment Variables
# .env file reference
# ---- Security ----
SECRET_KEY=your-64-char-hex-string
INIT_PASSWORD=admin-password-change-this
# ---- Performance ----
# Worker concurrency (default 1, increase for heavy indexing)
CELERY_WORKER_CONCURRENCY=4
# API request timeout (seconds)
API_TIMEOUT=300
# Max upload size (MB)
UPLOAD_FILE_SIZE_LIMIT=50
# ---- Model Defaults ----
DEFAULT_LLM_PROVIDER=ollama
DEFAULT_LLM_MODEL=llama3.2:8b
# ---- Logging ----
LOG_LEVEL=INFO
LOG_FILE=/app/api/logs/dify.log
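Before first launch it's worth checking that no required secret was left blank. A minimal .env sanity check might look like this; the required-key list here is an assumption you should adapt to your deployment.

```python
REQUIRED_KEYS = ("SECRET_KEY", "INIT_PASSWORD", "DB_PASSWORD", "REDIS_PASSWORD")

def missing_env_keys(env_text, required=REQUIRED_KEYS):
    """Return required keys that are absent or left empty in a
    .env file body. Comments and blank lines are ignored."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return sorted(k for k in required if not values.get(k))

sample = "SECRET_KEY=abc123\nINIT_PASSWORD=\n# DB_PASSWORD=commented-out\n"
print(missing_env_keys(sample))
# ['DB_PASSWORD', 'INIT_PASSWORD', 'REDIS_PASSWORD']
```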
Persistent Storage
Docker volumes ensure your data survives container restarts:
volumes:
db_data: # PostgreSQL data
redis_data: # Redis cache
weaviate_data: # Vector embeddings
storage_data: # Uploaded files
Backup Strategy
# Backup PostgreSQL
docker compose exec db pg_dump -U postgres dify > backup_$(date +%Y%m%d).sql
# Backup volumes
docker run --rm -v dify_db_data:/data -v $(pwd):/backup \
alpine tar czf /backup/db_data_backup.tar.gz /data
# Restore PostgreSQL
cat backup_20260410.sql | docker compose exec -T db psql -U postgres dify
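If you schedule that pg_dump via cron, you'll also want to prune old dumps. A retention sketch, assuming the `backup_YYYYMMDD.sql` naming used above:

```python
from datetime import datetime

def backups_to_delete(filenames, keep=7):
    """Given files named backup_YYYYMMDD.sql, return the ones to
    delete so only the `keep` most recent remain."""
    dated = sorted(
        ((datetime.strptime(n[len("backup_"):-len(".sql")], "%Y%m%d"), n)
         for n in filenames),
        reverse=True,  # newest first
    )
    return [name for _, name in dated[keep:]]

files = ["backup_20260401.sql", "backup_20260408.sql", "backup_20260325.sql"]
print(backups_to_delete(files, keep=2))  # ['backup_20260325.sql']
```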
Scaling and Production Hardening {#scaling}
Running Behind a Reverse Proxy
For production, put Dify behind Nginx or Caddy with HTTPS:
# /etc/nginx/sites-available/dify
server {
listen 443 ssl http2;
server_name dify.yourdomain.com;
ssl_certificate /etc/letsencrypt/live/dify.yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/dify.yourdomain.com/privkey.pem;
client_max_body_size 50M;
location / {
proxy_pass http://127.0.0.1:80;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket support for streaming
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
}
Resource Limits
Add resource constraints in docker-compose.override.yml:
services:
api:
deploy:
resources:
limits:
memory: 2G
cpus: '2.0'
worker:
deploy:
resources:
limits:
memory: 4G
cpus: '4.0'
db:
deploy:
resources:
limits:
memory: 1G
Monitoring
# Watch container resource usage
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
# Check API health
curl -s http://localhost/health | jq .
# Monitor worker queue depth
docker compose exec redis redis-cli llen celery
Dify vs Flowise vs n8n {#comparison}
All three are excellent self-hosted tools, but they solve different problems:
| Feature | Dify | Flowise | n8n |
|---|---|---|---|
| Primary focus | LLM app platform | LLM flow builder | General automation |
| RAG built-in | Full pipeline | Basic | Via plugins |
| Visual builder | Workflow + prompt IDE | Flow-based | Node-based |
| API generation | Automatic per app | Per chatflow | Per workflow |
| Model providers | 30+ including Ollama | 20+ including Ollama | Via LangChain nodes |
| Multi-user | Built-in workspaces | Basic auth | Role-based |
| Dataset management | Full UI with chunking | Manual setup | Not built-in |
| GitHub stars | 53K+ | 34K+ | 51K+ |
| Best for | Production AI apps | Prototyping chatbots | Business automation |
Pick Dify when you need a production-ready AI application platform with proper dataset management, user workspaces, and auto-generated APIs. It's the most complete package for teams building customer-facing AI products.
Pick Flowise when you want fast prototyping of LangChain-based chatflows without writing code. It's simpler and faster to get started with.
Pick n8n when AI is one part of a larger automation workflow that includes email, databases, webhooks, and third-party services. Read our n8n + Ollama automation guide for that setup.
For the Ollama setup fundamentals that all three tools rely on, see our complete Ollama guide.
Updating Dify
cd dify/docker
# Pull latest images
docker compose pull
# Restart with new images
docker compose down
docker compose up -d
# Check version
docker compose exec api python -c "import app; print(app.__version__)"
Dify follows semantic versioning. Minor updates are safe. Major version bumps may require database migrations -- always check the GitHub release notes before upgrading.
Troubleshooting Common Issues
Weaviate Won't Start
# Check logs
docker compose logs weaviate
# Common fix: increase virtual memory
sudo sysctl -w vm.max_map_count=262144
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
Ollama Connection Refused
# Verify Ollama is listening on all interfaces
curl http://localhost:11434/api/tags
# On Linux, check firewall
sudo ufw status
sudo ufw allow 11434
# Test from inside Docker
docker compose exec api curl http://host.docker.internal:11434/api/tags
Slow Document Indexing
# Increase worker concurrency in .env
CELERY_WORKER_CONCURRENCY=4
# Check worker logs
docker compose logs worker --tail 100 -f
# Monitor embedding throughput
docker compose exec redis redis-cli info | grep instantaneous_ops_per_sec
Conclusion
Self-hosting Dify gives you a production-grade AI platform without recurring API costs or data privacy concerns. The combination of Dify's visual builder with Ollama's local model serving eliminates the two biggest barriers to deploying AI applications: cost and data sovereignty.
Start with a simple chat app connected to one dataset. Once that works, build a workflow app that chains multiple steps -- retrieval, summarization, and structured output. That's where Dify's architecture pays off versus simpler tools.
The official Dify documentation covers advanced topics like custom tool integration, SSO configuration, and multi-tenant isolation.
Running local AI models for the first time? Start with our complete Ollama guide to get your model server running, then come back here to build applications on top of it.