
AI on Synology NAS: Docker + Ollama Self-Hosted Setup (2026)

April 23, 2026
19 min read
LocalAimaster Research Team


Quick Start: Ollama in Container Manager in 5 Minutes

Synology DSM 7.2 ships with Container Manager (the rebranded Docker package). On a Plus-series NAS with 8GB+ RAM, you can run a 3B-parameter model on CPU at usable speeds and serve it over your LAN. Steps:

  1. Install Container Manager from Package Center
  2. Create a folder /volume1/docker/ollama via File Station
  3. SSH in and run a single docker run command (below)
  4. Pull a model: docker exec ollama ollama pull phi3:mini
  5. Test from any LAN device: curl http://nas.local:11434/api/tags (the generate endpoint needs a POST body; tags answers a bare GET)

That gets you a private, always-on AI endpoint that any device on your network can use without sending a byte to the cloud.
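If you want the whole quick start as one paste-able block, the steps above condense to the following. This is a minimal sketch assuming DSM 7.2, SSH enabled, and a root shell via sudo -i; the full docker run flags (memory caps, keep-alive) are covered later in the guide:

```shell
# Quick start: run from an SSH session on the NAS, as root.
# Creating the folder over SSH is equivalent to the File Station step.
mkdir -p /volume1/docker/ollama

# Minimal Ollama container with a persistent model volume.
docker run -d --name ollama --restart always \
  -p 11434:11434 \
  -v /volume1/docker/ollama:/root/.ollama \
  ollama/ollama:latest

# Pull a small model, then list what is installed.
docker exec ollama ollama pull phi3:mini
curl http://localhost:11434/api/tags
```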


What this guide covers:

  • Compatible Synology models and CPU requirements (x86_64 only)
  • DSM 7.2 Container Manager configuration with persistent volumes
  • Memory sizing - which models fit on 4GB, 8GB, 16GB, and 32GB upgrades
  • Open WebUI deployment on the same NAS
  • DSM reverse proxy with HTTPS and Let's Encrypt
  • Real benchmarks across DS220+, DS923+, DS1522+, and DS1821+

A Synology Plus-series NAS is one of the most underrated pieces of hardware for hosting a small private AI. It is already running 24/7, already has gigabit networking, already has redundant storage, and the better models (DS923+, DS1522+, DS1821+) ship with x86_64 CPUs and DDR4 ECC slots that scale to 32GB. You spend zero additional electricity beyond the few watts the inference job adds, and every machine on your LAN gets a private GPT-class endpoint at http://nas.local:11434.

For the broader self-hosting decision tree, pair this with the air-gapped AI deployment guide and Open WebUI setup guide. For hardware sizing, the AI hardware requirements guide compares NAS, mini-PC, and full desktop builds.

Table of Contents

  1. Why Run AI on Your NAS
  2. Compatibility - Which Synology Models Work
  3. RAM Sizing for AI Workloads
  4. Container Manager Setup
  5. Deploying Ollama
  6. Adding Open WebUI
  7. DSM Reverse Proxy and HTTPS
  8. Benchmarks Across Models
  9. Storage and Snapshot Strategy
  10. Pitfalls and Fixes

Why Run AI on Your NAS {#why-nas}

Three reasons people pick a NAS over a mini-PC or repurposed desktop:

Already running. A Synology Plus-series NAS spends most of its life at 8-15W idle. Adding an LLM container brings it to 20-35W under inference load, which adds maybe $2-4/month on a typical electricity rate. Versus standing up a separate mini-PC, you save the $400 hardware cost and the 8-15W idle baseline.

Storage adjacency. If you are running Photos, Drive, or Note Station on the same NAS, your AI reads that data off the local filesystem instead of over a network share. RAG pipelines that index your photos for "find that screenshot from last March" or your notes for semantic search run dramatically faster when the model and the data live on the same machine.

True private network endpoint. The Synology NAS sits behind your router, not exposed to the internet. Combined with DSM's Let's Encrypt integration and reverse proxy, you get HTTPS internally without anything ever touching a public cloud.

The catch: NAS CPUs are slow by AI-rig standards. The fastest Synology consumer CPU (Ryzen R1600 in DS923+ / DS1522+) does roughly half the inference throughput of a desktop Ryzen 5600. You will run quantized 3B-7B models, not 70B behemoths. For most home and small-team use cases, that is enough.

Compatibility - Which Synology Models Work {#compatibility}

DSM Container Manager runs standard Docker images, but Synology ships the Docker/Container Manager package only for its x86_64 models. ARM-based units (Realtek RTD1296, RTD1619B) do not get Container Manager at all, so they cannot run Ollama regardless of the arm64 images Ollama publishes. Stick to Plus-series and Value-series models with Intel or AMD CPUs.

| Model | CPU | Default RAM | Max RAM | AI-Capable? |
|---|---|---|---|---|
| DS220+ | Intel J4025 | 2GB | 6GB | Yes (small models only) |
| DS224+ | Intel J4125 | 2GB | 6GB | Yes (small models only) |
| DS423+ | Intel J4125 | 2GB | 6GB | Yes |
| DS920+ | Intel J4125 | 4GB | 8GB | Yes |
| DS923+ | AMD R1600 | 4GB | 32GB | Yes (best mainstream) |
| DS1522+ | AMD R1600 | 8GB | 32GB | Yes (excellent) |
| DS1621+ | AMD V1500B | 4GB | 32GB | Yes |
| DS1821+ | AMD V1500B | 4GB | 32GB | Yes (top tier) |
| DS3622xs+ | Intel D-1531 | 16GB | 48GB | Yes (enterprise) |
| Any J-series ARM | Realtek/Marvell | varies | varies | No |

The DS923+ at $600 with a 16GB RAM upgrade ($60-80 for a Crucial CT16G4SFRA32A or compatible) is the sweet spot. You get a Ryzen with AVX2/FMA support, 16GB total memory, and four drive bays for around $700 all-in.

RAM Sizing for AI Workloads {#ram-sizing}

DSM itself uses around 1.5GB for the OS, indexing services, and built-in apps. Plan your model around what is left.

| Total NAS RAM | Reserved for DSM | Available for AI | Realistic Models |
|---|---|---|---|
| 4 GB | 1.5 GB | 2.5 GB | Phi-3 Mini Q4 (tight), Gemma 2B Q4 |
| 8 GB | 1.5 GB | 6.5 GB | Phi-3 Mini, Llama 3.2 3B, Mistral 7B Q4 (tight) |
| 16 GB | 2.0 GB | 14 GB | Mistral 7B, Llama 3.1 8B, Gemma 2 9B |
| 32 GB | 2.5 GB | 29 GB | Llama 2 13B, Qwen 14B, Mixtral 8x7B Q3 |
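As a rough sanity check before pulling a model, you can estimate Q4 memory needs from the parameter count. The 0.7 GB-per-billion-parameters figure below is a rule of thumb for Q4-class quantization plus KV-cache overhead, not an official number:

```shell
# Rule-of-thumb RAM estimate for a Q4-quantized model (assumption:
# ~0.7 GB per billion parameters, plus ~1 GB cache/runtime overhead).
est_ram_gb() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 0.7 + 1.0 }'
}

est_ram_gb 3   # 3B model -> ~3.1 GB (fits the 4 GB tier, barely)
est_ram_gb 7   # 7B model -> ~5.9 GB (needs the 8 GB tier)
est_ram_gb 8   # 8B model -> ~6.6 GB (comfortable at 16 GB)
```

Compare the result against the "Available for AI" column above, not your total RAM.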

Synology officially supports specific RAM SKUs only and may complain if you install third-party RAM. The hardware accepts standard DDR4 SODIMMs (DS923+, DS1522+) or DDR4 ECC UDIMMs (DS1821+, DS3622xs+). DSM displays a warning at boot for unsupported RAM but boots and runs normally. I have run Crucial 16GB and 32GB modules in DS923+ and DS1821+ for over a year with zero issues.

Disclosure: unofficial RAM voids Synology hardware warranty support specifically for memory issues. If you need warranty coverage on RAM, buy the Synology-branded modules.

Container Manager Setup {#container-manager}

DSM 7.2 ships Container Manager. If yours says "Docker" instead, upgrade DSM first.

  1. Open Package Center, search Container Manager, install.
  2. Open Container Manager > Settings > General. Enable "Auto-start container after reboot."
  3. Open File Station and create /volume1/docker/ollama and /volume1/docker/open-webui.
  4. Note your NAS IP from Control Panel > Network. We will use 192.0.2.10 as a placeholder below - replace with yours.

Enable SSH (Control Panel > Terminal & SNMP > Enable SSH). Connect from your workstation:

ssh admin@192.0.2.10 -p 22

You will land in your home directory. Sudo to root for docker commands:

sudo -i

Deploying Ollama {#deploy-ollama}

Run Ollama as a long-lived container with a persistent volume so models survive restarts:

docker run -d \
  --name ollama \
  --restart always \
  -p 11434:11434 \
  -v /volume1/docker/ollama:/root/.ollama \
  -e OLLAMA_HOST=0.0.0.0 \
  -e OLLAMA_KEEP_ALIVE=30m \
  -e OLLAMA_NUM_PARALLEL=2 \
  --memory="12g" \
  --memory-swap="12g" \
  ollama/ollama:latest

Notes on the flags:

  • OLLAMA_HOST=0.0.0.0 makes the API listen on all interfaces so other LAN devices can reach it
  • OLLAMA_KEEP_ALIVE=30m keeps loaded models in RAM for half an hour after the last request, eliminating cold-start latency for back-to-back chats
  • --memory cap prevents Ollama from spilling into DSM territory and triggering memory pressure on your NAS services
  • The -v mount means models are stored on the NAS volume, not inside the container - reinstalling the container does not re-download
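Before pulling models, it is worth confirming the container came up and the API answers locally. Run these on the NAS; they assume the container name and port from the docker run command above:

```shell
# The container should show as "Up" with port 11434 published.
docker ps --filter name=ollama

# The tags endpoint answers even before any model is installed
# (it returns an empty "models" list on a fresh install).
curl -s http://localhost:11434/api/tags
```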

Pull a model:

docker exec -it ollama ollama pull phi3:mini
docker exec -it ollama ollama pull llama3.2:3b

Test it from your workstation, not the NAS:

curl http://192.0.2.10:11434/api/generate -d '{
  "model": "phi3:mini",
  "prompt": "Write a haiku about a NAS",
  "stream": false
}'

You should see a JSON response with the generated text and timing fields. Divide eval_count by eval_duration (reported in nanoseconds) to get tokens per second.
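To turn those timing fields into a tokens-per-second number without installing jq, a small awk sketch works. The response fragment below is illustrative, not real benchmark output:

```shell
# Extract eval_count / eval_duration from an /api/generate response
# and print tokens per second (eval_duration is in nanoseconds).
toks_per_sec() {
  printf '%s' "$1" | awk '{
    match($0, /"eval_count":[0-9]+/)
    c = substr($0, RSTART + 13, RLENGTH - 13)
    match($0, /"eval_duration":[0-9]+/)
    d = substr($0, RSTART + 16, RLENGTH - 16)
    printf "%.1f\n", c / (d / 1e9)
  }'
}

# Hypothetical response fragment for illustration:
resp='{"model":"phi3:mini","eval_count":142,"eval_duration":10142857142}'
toks_per_sec "$resp"   # -> 14.0
```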

Adding Open WebUI {#open-webui}

Ollama's API is fine for tools like Continue.dev or Obsidian plugins. For a chat interface accessible from any device on your LAN, deploy Open WebUI alongside it:

docker run -d \
  --name open-webui \
  --restart always \
  -p 3000:8080 \
  -v /volume1/docker/open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://192.0.2.10:11434 \
  -e WEBUI_AUTH=true \
  --add-host=host.docker.internal:host-gateway \
  ghcr.io/open-webui/open-webui:main

Visit http://192.0.2.10:3000 from any browser on your LAN. Create an admin account on first visit. Open WebUI auto-discovers the Ollama endpoint and lists your installed models.

For a deeper Open WebUI configuration walkthrough including RAG over your documents, image analysis, and multi-user permissions, see the dedicated guide.

DSM Reverse Proxy and HTTPS {#reverse-proxy}

Open WebUI on plain HTTP is fine for a private LAN, but if you want to use Synology's Let's Encrypt cert and a clean URL like https://ai.yourdomain.com, use the built-in reverse proxy.

  1. DSM Control Panel > Login Portal > Advanced > Reverse Proxy
  2. Click Create
  3. Source:
    • Protocol: HTTPS
    • Hostname: ai.yourdomain.com (or ai.local for LAN-only)
    • Port: 443
  4. Destination:
    • Protocol: HTTP
    • Hostname: localhost
    • Port: 3000
  5. Custom Header tab > Create > WebSocket. This is required for Open WebUI's streaming responses.
  6. Click OK.

Enable HSTS in DSM Network > DSM Settings if you control DNS. Generate a Let's Encrypt cert via Control Panel > Security > Certificate > Add > Get a certificate from Let's Encrypt.

For LAN-only deployments where you do not have a domain, edit your router's DNS to point ai.local at the NAS IP, then use a self-signed certificate.
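You can verify the proxy rule before touching DNS by pinning the hostname to the NAS IP on the curl command line. The -k flag skips certificate validation, which you need with a self-signed cert; swap in your own hostname and IP:

```shell
# Request the Open WebUI front page through the DSM reverse proxy,
# resolving ai.local to the NAS without editing DNS or /etc/hosts.
curl -k -I --resolve ai.local:443:192.0.2.10 https://ai.local/
```

An HTTP 200 here means the proxy rule and destination port are correct; streaming still needs the WebSocket header from step 5.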

Benchmarks Across Models {#benchmarks}

All benchmarks measured on stock CPU inference (no GPU offload available), 256-token output, 16K context window, default sampling parameters, fresh model load.

| NAS | RAM | Model | Tok/s | First Token | Verdict |
|---|---|---|---|---|---|
| DS220+ | 6 GB | Phi-3 Mini Q4 | 6.2 | 3.1s | Tight, usable for short prompts |
| DS220+ | 6 GB | Llama 3.2 3B Q4 | 4.8 | 3.8s | Borderline, prefer Phi-3 |
| DS923+ | 16 GB | Phi-3 Mini Q4 | 14.1 | 1.6s | Smooth |
| DS923+ | 16 GB | Llama 3.2 3B Q4 | 11.7 | 1.9s | Daily driver |
| DS923+ | 16 GB | Mistral 7B Q4 | 6.8 | 3.4s | Reasoning quality, acceptable speed |
| DS923+ | 32 GB | Llama 3.1 8B Q4 | 5.9 | 4.1s | Best quality, patience required |
| DS1522+ | 32 GB | Mistral 7B Q4 | 7.2 | 3.2s | Same CPU as DS923+, slightly more headroom |
| DS1821+ | 32 GB | Llama 2 13B Q4 | 4.3 | 5.8s | Possible but slow |
| DS1821+ | 32 GB | Mixtral 8x7B Q3 | 3.1 | 7.2s | At the edge of usable |

A few takeaways:

  • The DS923+ with 16GB RAM and Phi-3 Mini is the value sweet spot. 14 tok/s is faster than human reading speed and the first-token latency under 2 seconds feels interactive.
  • Going from 16GB to 32GB on the DS923+ does not speed up smaller models - they already fit. It opens the door to 8B and 13B models at the cost of ~3x slower generation.
  • The DS1821+ has more cores but the same per-core speed. Throughput on a single inference is identical to DS923+; you only benefit if you serve multiple concurrent requests.

For comparison, the same Mistral 7B Q4 hits 18 tok/s on a Ryzen 5600 desktop and 35 tok/s on an M2 Air. NAS inference is roughly 2.5x slower than a budget desktop, but the NAS is already running 24/7.

Storage and Snapshot Strategy {#storage}

Models are large but compressible-once-on-disk. A few practical rules:

Put models on an SSD volume if possible. Even on Plus-series NAS units with rotational drives, you can add a single NVMe drive and, on models where DSM allows NVMe storage pools, create a separate Btrfs volume for Docker and AI models. Load time for a 7B model drops from 18 seconds (HDD) to 3 seconds (SSD). After load, inference speed is identical because everything lives in RAM.

Disable Btrfs snapshots on the model directory. Models do not change after download. Snapshotting them wastes space and adds I/O for no benefit. In DSM > Snapshot Replication, exclude /volume1/docker/ollama/models.

Backup the Open WebUI volume, not the Ollama volume. Models are re-downloadable; chat history and RAG indexes are not. Use Hyper Backup to back up only /volume1/docker/open-webui.

Plan disk space for model collection. A typical home AI setup ends up with Phi-3 (2GB), Llama 3.2 3B (2GB), Mistral 7B (4GB), an embedding model (300MB), and maybe Stable Diffusion (4GB). Budget 15-20 GB minimum.
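To see where you stand against that budget, a small helper reports a directory's on-disk size in megabytes; point it at the Ollama model store on the NAS:

```shell
# Print on-disk usage of a directory in megabytes.
model_usage_mb() {
  du -sm "$1" | awk '{ print $1 }'
}

# On the NAS:
#   model_usage_mb /volume1/docker/ollama/models
```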

Pitfalls and Fixes {#pitfalls}

Container Manager says "image cannot be pulled." DNS issue. DSM > Control Panel > Network > General > set DNS to 1.1.1.1 and 8.8.8.8.

docker: command not found. You are not running as root. Run sudo -i first; Synology DSM does not put docker on the default user's PATH.

Ollama crashes after pulling a 7B model. Out of memory. Check free -h. If "available" drops below 1GB during model load, lower the Ollama --memory cap or pick a smaller model.

LAN devices cannot reach port 11434. Synology firewall. Control Panel > Security > Firewall > Edit Rules > Allow > Custom Ports > TCP > 11434. Apply to your LAN profile only, never to the WAN profile.

Token generation is 3x slower than expected. Check whether DSM Photos or Surveillance Station is actively re-indexing in the background. Both consume a lot of CPU and steal cycles from inference. Pause indexing during heavy AI use.

Open WebUI shows "no models found". OLLAMA_BASE_URL pointing to localhost from inside the Open WebUI container resolves to the container itself, not the Ollama container. Use the NAS IP or http://host.docker.internal:11434 with --add-host=host.docker.internal:host-gateway.

HTTPS reverse proxy strips streaming responses. You forgot the WebSocket custom header. Edit the reverse proxy rule, add Custom Header > WebSocket.

RAM upgrade not detected. Synology firmware whitelists. Do a full power cycle (not just reboot - shutdown, unplug, wait 30 seconds, plug back in). DSM redetects memory at cold boot.


Frequently Asked Questions

Q: Will running AI shorten the lifespan of my NAS drives?

A: No. AI inference reads model files into RAM once and runs entirely in memory afterward. Your drives sit idle during the actual inference. The only sustained drive activity is during model downloads, which is no different from a Plex library scan.

Q: Can I use a Synology DiskStation as a GPU passthrough box?

A: No. None of the consumer Synology models support GPU passthrough or PCIe expansion. For GPU-accelerated inference you need a custom build or mini-PC. See the AI hardware requirements guide for alternatives.

Q: How does this compare to running Ollama on a Raspberry Pi 5?

A: A Pi 5 with 8GB RAM hits roughly 4 tok/s on Phi-3 Mini. A DS923+ with 16GB hits 14 tok/s. The NAS wins on both speed and RAM ceiling. The Pi wins on power consumption (5W vs 25W) and price ($80 vs $700).

Q: Can multiple users hit the same Ollama endpoint at once?

A: Yes, with caveats. Ollama queues requests by default. Set OLLAMA_NUM_PARALLEL=2 or higher to allow concurrent inference, but expect throughput per user to drop proportionally - the NAS CPU is the bottleneck.

Q: Will this work over Synology Drive or SMB?

A: The Ollama API speaks HTTP, not SMB. You access it from clients via HTTP (curl, Open WebUI, Continue.dev, etc.). Drive and SMB are for file sharing, which is unrelated.

Q: Is the data really private if I use Synology QuickConnect?

A: QuickConnect routes your traffic through Synology relay servers when direct connections fail. For genuinely private AI, disable QuickConnect on the AI subdomain or use the LAN-only reverse proxy setup. The Ollama models themselves never call home.

Q: Can I run Stable Diffusion on a Synology NAS?

A: Yes for SD 1.5 at very slow speeds (90+ seconds per image at 512x512 on a DS923+). Practically not usable. SDXL needs a real GPU. The NAS is best for text models only.

Q: What about Plex Hardware Transcoding sharing the same Intel iGPU?

A: Plex transcoding uses Intel Quick Sync, which is a fixed-function video block, not the general-purpose GPU. It does not interfere with CPU-based AI inference, even when both run simultaneously.


Conclusion

A Synology Plus-series NAS is a $600-1500 box that already lives in your home, already runs 24/7, and already has a stable network presence. Adding Ollama via Container Manager turns it into a private LAN AI endpoint for the cost of a $60 RAM upgrade and ten minutes of setup. You get GPT-3.5-class quality from Mistral 7B at 7 tokens per second, fully offline, with HTTPS, with redundant storage, and with whatever else you already host on the same hardware.

The right pick for most home users is a DS923+ with 16GB or 32GB of third-party RAM running Phi-3 Mini for fast responses and Mistral 7B for harder questions. Layer Open WebUI on top for a clean interface, point a DSM reverse proxy at it for HTTPS, and you have a self-hosted alternative to OpenAI or Anthropic for routine work that never leaves your network.

For the next step, check out which Ollama models fit your memory budget and the Open WebUI configuration deep dive.


Written by Pattanaik Ramswarup

Creator of Local AI Master
