
Ollama Not Working? Complete Troubleshooting Guide

April 11, 2026
18 min read
Local AI Master Research Team

I maintain Ollama across 14 machines — three Linux servers, a handful of Macs, and a few Windows workstations. Every error message in this guide is one I have personally hit, diagnosed, and fixed. If Ollama is broken on your machine right now, start with the diagnostic flowchart below and follow the branch that matches your situation.


Diagnostic Flowchart: Find Your Problem Fast

Work through these checkpoints in order. Each "No" answer points you to the right section.

Checkpoint 1 — Is Ollama installed?

ollama --version
  • "Command not found"? Jump to Installation Failures.
  • Version prints? Move to Checkpoint 2.

Checkpoint 2 — Is the Ollama server running?

curl http://localhost:11434/api/tags
  • Connection refused? Jump to Server Won't Start.
  • Returns JSON with your models? Move to Checkpoint 3.

Checkpoint 3 — Can you pull a model?

ollama pull llama3.2:3b
  • Hangs, times out, or errors? Jump to Download and Network Issues — or Storage Problems if the disk is full.
  • Pull completes? Move to Checkpoint 4.

Checkpoint 4 — Can you run the model?

ollama run llama3.2:3b "Say hello"
  • Memory error? Jump to Memory Errors. Load failure? Jump to Model Load Failures.
  • Runs, but slowly or badly? See Performance Problems or Bad Output Quality.
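The four checkpoints above can be chained into a single diagnostic script. This is a sketch: the `checkpoint` helper is our own convenience function, not part of Ollama.

```shell
#!/bin/sh
# One-shot diagnostic (sketch): run the four checkpoints in order and
# stop at the first failure. "checkpoint" is our helper, not an Ollama command.
checkpoint() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
  else
    echo "FAIL: $label" && return 1
  fi
}

checkpoint "1 installed" ollama --version &&
checkpoint "2 server up" curl -fsS http://localhost:11434/api/tags &&
checkpoint "3 can pull"  ollama pull llama3.2:3b &&
checkpoint "4 can run"   ollama run llama3.2:3b "Say hello" ||
echo "Stopped at the first failing checkpoint -- see the matching section below"
```

The first `FAIL:` line tells you which section of this guide to open.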

Installation Failures {#installation-failures}

Windows: "ollama" is not recognized

The installer finished, but your terminal cannot find the binary. Three possible causes.

Cause 1: PATH not updated. The Ollama installer adds itself to the system PATH, but open terminal sessions do not pick up PATH changes automatically. Close every terminal window and open a fresh one. If you are using Windows Terminal, close the entire app — not just the tab.

Cause 2: Installer failed silently. Re-download from ollama.com/download/windows and run the installer as Administrator. Right-click the .exe, select "Run as administrator."

Cause 3: Antivirus quarantined the binary. Windows Defender sometimes flags Ollama. Check your quarantine list:

# Check Windows Defender quarantine
Get-MpThreatDetection | Select-Object -Last 5

# Add exclusion for Ollama
Add-MpPreference -ExclusionPath "C:\Users\$env:USERNAME\AppData\Local\Ollama"

For a complete walkthrough of Windows installation, see our Ollama Windows installation guide.

Mac: "command not found: ollama"

Homebrew install: Verify Homebrew itself works (brew --version). If yes:

brew reinstall ollama
# For Apple Silicon, ensure Homebrew path is set
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
source ~/.zprofile

Direct download install: The .app bundle puts the binary in a non-standard location. Add it manually:

sudo ln -sf /Applications/Ollama.app/Contents/Resources/ollama /usr/local/bin/ollama

Gatekeeper blocking: If macOS refuses to open Ollama because it is from an unidentified developer:

xattr -cr /Applications/Ollama.app

Full Mac setup walkthrough: Mac local AI setup guide.

Linux: "ollama: command not found"

Snap/apt install: Check that /usr/local/bin is in your PATH:

echo $PATH | tr ':' '\n' | grep -q '/usr/local/bin' && echo "OK" || echo "MISSING"

# If missing, add it
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

curl install script failed: The official one-liner sometimes breaks on minimal server images missing curl or systemd:

# Install prerequisites first
sudo apt update && sudo apt install -y curl systemd

# Then run the installer
curl -fsSL https://ollama.com/install.sh | sh

# Verify
ollama --version

Server Won't Start {#server-wont-start}

Error: "bind: address already in use"

Port 11434 is taken. Another Ollama instance is already running, or another service grabbed the port.

# Find what is using port 11434
# Linux/Mac:
lsof -i :11434

# Windows (PowerShell):
Get-NetTCPConnection -LocalPort 11434 | Select-Object OwningProcess
Get-Process -Id (Get-NetTCPConnection -LocalPort 11434).OwningProcess

Fix 1: Kill the existing process.

# Linux/Mac
pkill -f ollama
# Wait 2 seconds, then start again
ollama serve

# Windows
taskkill /F /IM ollama.exe
ollama serve

Fix 2: Run on a different port.

# Linux/Mac
OLLAMA_HOST=127.0.0.1:11435 ollama serve

# Then tell the client about the new port
export OLLAMA_HOST=127.0.0.1:11435
ollama list

Server starts then immediately exits

Check the logs for the actual error:

# Linux (systemd)
journalctl -u ollama -n 50 --no-pager

# Mac (Homebrew service)
cat ~/.ollama/logs/server.log

# Windows
type "%LOCALAPPDATA%\Ollama\server.log"

Common causes: corrupted model files (delete ~/.ollama/models and re-pull), or NVIDIA driver mismatch on Linux (covered in GPU section below).
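A small filter can speed up the log hunt. The pattern list here is our own starting set, not exhaustive — extend it as you meet new errors:

```shell
# Quick log triage (sketch): scan recent server log lines for the usual
# suspects. The pattern list is ours; add to it as needed.
triage() { grep -iE 'error|panic|cuda|rocm|denied|no space'; }

# Linux (systemd):
#   journalctl -u ollama -n 200 --no-pager | triage
# Demonstration with a sample log line:
printf 'loading model\nCUDA driver version mismatch\n' | triage
```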


Download and Network Issues {#download-network-issues}

Pull hangs or times out

# Test basic connectivity
curl -v https://registry.ollama.ai/v2/

# If behind a proxy, set the environment variables
export HTTP_PROXY=http://your-proxy:8080
export HTTPS_PROXY=http://your-proxy:8080
export NO_PROXY=localhost,127.0.0.1

# Then retry
ollama pull llama3.2

Download speed is extremely slow

# Check your actual download speed
curl -o /dev/null -w "Speed: %{speed_download} bytes/sec\n" https://registry.ollama.ai/v2/

# Try a different DNS
# Linux:
sudo bash -c 'echo "nameserver 1.1.1.1" > /etc/resolv.conf'

# Mac:
sudo networksetup -setdnsservers Wi-Fi 1.1.1.1 8.8.8.8

Corporate firewall blocks the registry

Your IT department may block registry.ollama.ai. Two workarounds:

Option 1: Download the model on a personal network, then copy it:

# On unrestricted machine
ollama pull llama3.2
# Copy the model blob
tar -czf llama3.2-model.tar.gz ~/.ollama/models/

# Transfer to restricted machine and extract
tar -xzf llama3.2-model.tar.gz -C ~/
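Large archives sometimes get truncated in transit. A quick integrity check, assuming `sha256sum` is available (on macOS use `shasum -a 256`):

```shell
# Verify the archive survived the transfer (sketch): the hash noted on the
# source machine must match the one computed on the restricted machine.
checksums_match() {
  [ "$(sha256sum "$1" | cut -d' ' -f1)" = "$2" ]
}

# Source machine:     sha256sum llama3.2-model.tar.gz   -> note the hash
# Restricted machine: checksums_match llama3.2-model.tar.gz <noted-hash> \
#                       && echo "intact" || echo "re-transfer"
```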

Option 2: Import a GGUF file directly (download from HuggingFace, which may not be blocked):

# Create a Modelfile
echo 'FROM ./llama-3.2-3b.Q4_K_M.gguf' > Modelfile
ollama create my-llama -f Modelfile

For ongoing issues, check Ollama's GitHub issues — connectivity problems sometimes stem from registry outages.


Storage Problems {#storage-problems}

"not enough disk space" or download fails partway

Models are larger than you think. Here is what you need:

Model            Download Size   Disk After Install
llama3.2:3b      2.0 GB          2.0 GB
llama3.1:8b      4.7 GB          4.7 GB
llama3.3:70b     40 GB           40 GB
mixtral:8x7b     26 GB           26 GB
deepseek-r1:32b  19 GB           19 GB
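Given the sizes in the table above, a preflight check before a big pull can save a half-finished download. A sketch using `df`; the 10% safety margin is our own rule of thumb:

```shell
# Preflight (sketch): refuse to pull when free space in the model
# directory's filesystem is below the model size plus a 10% margin.
have_space() {
  need_gb=$1; dir=${2:-$HOME}
  # df -Pk prints 1K blocks; convert the "available" column to GB
  free_gb=$(df -Pk "$dir" | awk 'NR==2 { printf "%d", $4 / 1048576 }')
  [ "$free_gb" -ge $(( need_gb + need_gb / 10 )) ]
}

have_space 40 && echo "safe to pull llama3.3:70b" || echo "free up disk first"
```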

Check your available space:

# Linux/Mac
df -h ~/.ollama

# Windows (PowerShell)
Get-PSDrive C | Select-Object Used, Free

Move the model directory to a bigger drive:

# Linux/Mac — move to external or secondary drive
mv ~/.ollama /mnt/big-drive/ollama
ln -s /mnt/big-drive/ollama ~/.ollama

# Windows — set environment variable
# System Settings → Environment Variables → New
# Variable: OLLAMA_MODELS
# Value: D:\OllamaModels

Clean up old models

# List all models with sizes
ollama list

# Remove models you no longer need
ollama rm mixtral:8x7b
ollama rm codellama:34b

# Check recovered space
du -sh ~/.ollama/models/

Memory Errors {#memory-errors}

"model requires more system memory"

This is the single most common error. The model needs more RAM than your system has available. Not total RAM — available RAM.

Check available memory:

# Linux
free -h | grep Mem

# Mac (vm_stat reports pages; page size is 4096 on Intel, 16384 on Apple Silicon)
vm_stat | perl -ne '/page size of (\d+)/ && ($p = $1); /Pages free:\s+(\d+)/ && printf "Free: %.1f GB\n", $1 * $p / 1073741824'

# Windows (PowerShell)
(Get-CimInstance Win32_OperatingSystem).FreePhysicalMemory / 1MB

Fix 1: Close applications. Browsers are the biggest offenders. Chrome with 20 tabs can consume 4-8 GB. Close it. Close Slack, Teams, Docker Desktop — anything you are not actively using.
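To see exactly where your RAM is going, list the largest resident processes (Linux/Mac):

```shell
# Five biggest RAM consumers, resident set size converted to MB.
ps axo rss=,comm= | sort -rn | head -5 |
  awk '{ printf "%6.0f MB  %s\n", $1 / 1024, $2 }'
```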

Fix 2: Use a smaller model. This is not a workaround — it is the correct solution if your hardware cannot support the model. Consult our system requirements guide for exact RAM-to-model mappings.

Available RAM   Maximum Model Size
4 GB            1B-3B (phi3:mini, llama3.2:1b)
8 GB            3B-7B (llama3.2:3b, gemma2:2b)
16 GB           7B-13B (llama3.1:8b, mistral)
32 GB           13B-30B (deepseek-r1:14b, gemma2:27b)
64 GB           30B-70B (llama3.3:70b, mixtral:8x7b)
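The table above can be encoded as a quick lookup helper. A sketch; the labels mirror the table, with the example tags omitted for brevity:

```shell
# Map available RAM (GB) to the largest model class from the table above.
suggest_model() {
  if   [ "$1" -ge 64 ]; then echo "up to 70B"
  elif [ "$1" -ge 32 ]; then echo "up to 30B"
  elif [ "$1" -ge 16 ]; then echo "up to 13B"
  elif [ "$1" -ge 8  ]; then echo "up to 7B"
  else                       echo "1B-3B only"
  fi
}

suggest_model 16   # -> up to 13B
```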

Fix 3: Use a more aggressively quantized version. Q4_0 uses roughly half the memory of Q8_0:

# Instead of the default (usually Q4_K_M)
ollama pull llama3.1:8b-instruct-q4_0

# Or even smaller — Q2 is lossy but runs on tight systems
ollama pull llama3.1:8b-instruct-q2_K

Fix 4: Reduce context length. Larger context windows eat RAM. The default 2048 tokens is usually fine; set a smaller value from inside the interactive session:

ollama run llama3.2
>>> /set parameter num_ctx 1024

Model Load Failures {#model-load-failures}

"failed to load model"

This error has at least five different root causes. Work through them in order.

Cause 1: Corrupted download. Delete and re-pull.

ollama rm llama3.2
ollama pull llama3.2

Cause 2: Incompatible GGUF file. If you created the model from a manually downloaded GGUF, the file may use a quantization format your Ollama version does not support. Update Ollama:

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Mac
brew upgrade ollama

# Windows — download latest installer from ollama.com

Cause 3: Permission denied. The model directory has wrong ownership (common after running Ollama as root, then as a regular user):

# Linux/Mac
sudo chown -R $(whoami) ~/.ollama
chmod -R 755 ~/.ollama/models

Cause 4: Disk full mid-download. The model file is incomplete. Remove it and re-download:

# Nuclear option — clear all models (blobs and manifests) and re-download what you need
rm -rf ~/.ollama/models/blobs ~/.ollama/models/manifests
ollama pull llama3.2

Cause 5: CUDA/ROCm library mismatch (GPU systems only). See the GPU section below.


GPU Detection Issues {#gpu-detection-issues}

"GPU not found" / "no compatible GPUs were discovered"

NVIDIA on Linux — the driver dance:

# Check if NVIDIA driver is loaded
nvidia-smi

# If "command not found" — install the driver
sudo apt install -y nvidia-driver-550
sudo reboot

# If nvidia-smi works but Ollama ignores GPU — check CUDA
ls /usr/local/cuda/lib64/libcudart*

# Ollama bundles its own CUDA runtime since v0.3.0
# But driver version must be >= 525.60.13
nvidia-smi | grep "Driver Version"

AMD on Linux:

# ROCm must be installed
rocm-smi

# If missing, install ROCm 6.x
# See: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/

# Verify Ollama sees the GPU
OLLAMA_DEBUG=1 ollama serve 2>&1 | grep -i "gpu\|rocm\|amd"

Windows — CUDA version mismatch:

The most common Windows GPU issue is having an old NVIDIA driver. Ollama requires driver 525+ for CUDA 12.x support.

# Check driver version
nvidia-smi

# If driver is < 525, update from nvidia.com/drivers
# After updating, restart your machine (not just Ollama)

Mac — no discrete GPU needed. Apple Silicon uses Metal automatically. If you see GPU errors on Mac, you are likely on an Intel Mac without a supported GPU. Ollama will fall back to CPU, which is fine:

# Force CPU mode if Metal is causing issues
OLLAMA_METAL=0 ollama serve

"CUDA out of memory"

Your GPU does not have enough VRAM for the model. Unlike system RAM, you cannot use swap space for VRAM.

# Check GPU VRAM
nvidia-smi --query-gpu=memory.total,memory.used,memory.free --format=csv

# Offload some layers to CPU (slower but works): set num_gpu in the session
ollama run llama3.2
>>> /set parameter num_gpu 20
# Reduce the number until it fits. 0 = full CPU.

Context Length Errors {#context-length-errors}

"context length exceeded" / "requested context length is too large"

You asked for more context than the model supports or your hardware can handle.

# Check the context length configured in a model's Modelfile (if any)
ollama show llama3.2 --modelfile | grep num_ctx

# Set an explicit context length from inside the session
ollama run llama3.2
>>> /set parameter num_ctx 4096

# For longer context, you need more RAM:
# 4096 tokens ≈ +0.5 GB RAM
# 8192 tokens ≈ +1 GB RAM
# 32768 tokens ≈ +4 GB RAM
# 131072 tokens ≈ +16 GB RAM

If you routinely need long context, pick a model designed for it. Llama 3.2 supports up to 128K context, but you need the RAM to back it up.
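The figures above work out to roughly 1 GB per 8K tokens. A helper to estimate the overhead for any context length; this is an approximation, since actual KV-cache size varies by model and quantization:

```shell
# Rough extra RAM (GB) for a context length, from the ~1 GB per 8K tokens
# figures above. An approximation only: real KV-cache cost varies by model.
ctx_ram_gb() { awk -v t="$1" 'BEGIN { printf "%.1f\n", t / 8192 }'; }

ctx_ram_gb 4096     # -> 0.5
ctx_ram_gb 131072   # -> 16.0
```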


Bad Output Quality {#bad-output-quality}

Model returns gibberish or incoherent text

Cause 1: Wrong quantization. Ultra-low quantization (Q2_K, IQ1) degrades output quality significantly. Step up to Q4_K_M or higher:

ollama rm llama3.1:8b-instruct-q2_K
ollama pull llama3.1:8b-instruct-q4_K_M

Cause 2: Model too small for the task. A 1B parameter model will not write coherent long-form text. It is not broken — it is limited. Move to 7B or larger.

Cause 3: Corrupted model file. If output was previously fine and suddenly degraded, the model file may be partially corrupted:

ollama rm llama3.2
ollama pull llama3.2

Cause 4: Bad system prompt in Modelfile. If you created a custom model, a poorly written system prompt can derail output. Test with the base model first to isolate the issue:

# Test base model (no custom Modelfile)
ollama run llama3.2 "Summarize the benefits of exercise"

# If base model works fine, the problem is your Modelfile

Model ignores instructions or gives wrong language

This usually means you are using a model that was not trained for your task. Some models are English-only, some are code-only. Check the model card on the Ollama library.


Performance Problems {#performance-problems}

Tokens per second is painfully slow

If inference runs but at 2-3 tokens/second when you expected 20+, the problem is almost always one of these:

Problem 1: Model is running on CPU instead of GPU. Verify GPU is being used:

# Check during inference
# Linux NVIDIA:
watch -n 1 nvidia-smi

# Mac:
sudo powermetrics --samplers gpu_power -i 1000 -n 3

# If GPU utilization is 0% during inference, GPU is not being used
# Restart Ollama with debug logging:
OLLAMA_DEBUG=1 ollama serve

Problem 2: Model is too large and spilling to CPU. When a model does not fully fit in VRAM, Ollama splits layers between GPU and CPU. The CPU layers are dramatically slower. Either use a smaller model or reduce layers on GPU until throughput stabilizes:

# See how many layers loaded to GPU vs CPU
OLLAMA_DEBUG=1 ollama run llama3.2 "test" 2>&1 | grep -i "layer\|gpu\|cpu"

Problem 3: Other processes hogging GPU. Common culprit: a web browser using hardware acceleration, or a stuck CUDA process:

# Linux/Windows — find GPU-hogging processes
nvidia-smi

# Kill stuck processes
sudo kill -9 <PID>

Problem 4: Thermal throttling. Laptops throttle GPU and CPU when they get hot. Check temperatures:

# Linux
sensors | grep -i temp

# Mac
sudo powermetrics --samplers thermal -i 1000 -n 1

For a deep dive on every performance optimization, see our dedicated slow local LLM fix guide.
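To put a number on "slow", `ollama run --verbose` prints timing statistics, including an eval rate line. A small filter pulls out the tokens/sec figure; the sample line here is illustrative:

```shell
# Extract tokens/sec from `ollama run --verbose` output (sketch).
rate_of() { awk -F': *' '/^eval rate/ { print $2 }'; }

# Sample verbose line, piped in for demonstration:
echo 'eval rate:            21.43 tokens/s' | rate_of   # -> 21.43 tokens/s

# Real use: ollama run --verbose llama3.2 "test" 2>&1 | rate_of
```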


Platform-Specific Issues {#platform-specific-issues}

Windows-Specific Problems

WSL2 vs Native: If you installed Ollama inside WSL2, it may not see your GPU. You need the NVIDIA CUDA driver for WSL:

# Check if WSL sees GPU
wsl nvidia-smi

# If not, install CUDA WSL driver from:
# https://developer.nvidia.com/cuda/wsl

Windows Defender real-time scanning slows model loading dramatically. Exclude the Ollama directory:

Add-MpPreference -ExclusionPath "C:\Users\$env:USERNAME\.ollama"
Add-MpPreference -ExclusionProcess "ollama.exe"

Long path names: Windows has a 260-character path limit by default. If model paths are deep, enable long paths:

# Run as Administrator
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force

Mac-Specific Problems

Rosetta 2 overhead on M1: If you installed the x86 version of Ollama through an x86 Homebrew installation, it runs through Rosetta and loses 30-40% performance:

# Check architecture
file $(which ollama)
# Should say "arm64" not "x86_64"

# If x86, reinstall with native Homebrew
arch -arm64 brew reinstall ollama

macOS memory pressure: When macOS reports memory pressure as "yellow" or "red," Ollama gets throttled by the OS before it runs out of actual RAM:

# Check memory pressure
memory_pressure -S

# If "WARNING" — close apps and reduce model size
# Activity Monitor → Memory tab → sort by Memory to find hogs

Linux-Specific Problems

SELinux blocking Ollama:

# Check for SELinux denials
sudo ausearch -m avc -ts recent | grep ollama

# Quick fix (permissive for Ollama only)
sudo semanage permissive -a ollama_t

# Or disable SELinux temporarily for testing
sudo setenforce 0

systemd service fails:

# Check service status
systemctl status ollama

# Common fix: the ollama user does not have GPU access
sudo usermod -aG video ollama
sudo usermod -aG render ollama
sudo systemctl restart ollama

NVIDIA driver mismatch after kernel update:

# After a kernel update, NVIDIA modules need rebuilding
sudo apt install --reinstall nvidia-driver-550
# Or if using DKMS:
sudo dkms autoinstall
sudo reboot

Environment Variable Reference {#env-var-reference}

When nothing else works, these environment variables control Ollama's behavior at a granular level. Set them before running ollama serve:

Variable                  Purpose                       Example
OLLAMA_HOST               Bind address and port         127.0.0.1:11435
OLLAMA_MODELS             Custom model storage path     /mnt/ssd/models
OLLAMA_NUM_PARALLEL       Concurrent request limit      2
OLLAMA_MAX_LOADED_MODELS  Models kept in memory         1
OLLAMA_DEBUG              Verbose logging               1
OLLAMA_METAL              Enable/disable Metal (Mac)    0 or 1
CUDA_VISIBLE_DEVICES      Select specific GPU           0 or 0,1
OLLAMA_KEEP_ALIVE         Model unload timeout          5m or 0
OLLAMA_FLASH_ATTENTION    Enable flash attention        1

# Example: start Ollama with debug logging, custom port, single model loaded
OLLAMA_DEBUG=1 OLLAMA_HOST=0.0.0.0:11435 OLLAMA_MAX_LOADED_MODELS=1 ollama serve

Nuclear Options: Full Reset {#nuclear-options}

If nothing above works and you want to start completely clean:

# 1. Stop Ollama
pkill -f ollama          # Linux/Mac
taskkill /F /IM ollama.exe  # Windows

# 2. Remove all data
rm -rf ~/.ollama         # Linux/Mac

# Windows:
# Delete: %LOCALAPPDATA%\Ollama
# Delete: %USERPROFILE%\.ollama

# 3. Uninstall
# Linux:
sudo rm /usr/local/bin/ollama
# Mac:
brew uninstall ollama
# Windows:
# Use Add/Remove Programs

# 4. Reinstall fresh
# Linux:
curl -fsSL https://ollama.com/install.sh | sh
# Mac:
brew install ollama
# Windows:
# Download from ollama.com/download/windows

# 5. Test
ollama --version
ollama pull llama3.2:3b
ollama run llama3.2:3b "Hello, is this working?"

Getting More Help

If you have worked through every section above and your issue persists:

  1. Collect debug logs: Run OLLAMA_DEBUG=1 ollama serve and capture the full output.
  2. Check GitHub issues: Search github.com/ollama/ollama/issues for your specific error message.
  3. Post a bug report: Include your OS, Ollama version (ollama --version), hardware specs, and the exact error message with debug logs.

Most issues boil down to one of three things: not enough RAM, wrong GPU drivers, or network problems. Fix those and Ollama runs reliably for months at a time.


Running into performance problems specifically? Our local LLM slow fix guide covers every optimization technique from quantization selection to GPU layer splitting. For hardware planning, check the Ollama system requirements page.


Written by Pattanaik Ramswarup
