Ollama Not Working? Complete Troubleshooting Guide
Published on April 11, 2026 • 18 min read
I maintain Ollama across 14 machines — three Linux servers, a handful of Macs, and a few Windows workstations. Every error message in this guide is one I have personally hit, diagnosed, and fixed. If Ollama is broken on your machine right now, start with the diagnostic flowchart below and follow the branch that matches your situation.
Diagnostic Flowchart: Find Your Problem Fast
Work through these checkpoints in order. Each "No" answer points you to the right section.
Checkpoint 1 — Is Ollama installed?
ollama --version
- Command not found? Jump to Installation Failures.
- Version prints correctly? Move to Checkpoint 2.
Checkpoint 2 — Is the Ollama server running?
curl http://localhost:11434/api/tags
- Connection refused? Jump to Server Won't Start.
- Returns JSON with your models? Move to Checkpoint 3.
Checkpoint 3 — Can you pull a model?
ollama pull llama3.2:3b
- Network error or stall? Jump to Download and Network Issues.
- Disk space error? Jump to Storage Problems.
- Model downloads fine? Move to Checkpoint 4.
Checkpoint 4 — Can you run the model?
ollama run llama3.2:3b "Say hello"
- "model requires more system memory"? Jump to Memory Errors.
- "failed to load model"? Jump to Model Load Failures.
- "GPU not found" or CUDA errors? Jump to GPU Detection Issues.
- "context length exceeded"? Jump to Context Length Errors.
- Model runs but output is garbage? Jump to Bad Output Quality.
- Model runs but is painfully slow? Jump to Performance Problems.
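The checkpoints above can be sketched as a single script. This is a hypothetical helper, not part of Ollama; it assumes curl is available and covers the first two checkpoints, reporting the first stage that fails:

```shell
#!/usr/bin/env sh
# Hypothetical health check for checkpoints 1 and 2.
# Pass an alternate binary name or host to test other setups.
check_ollama() {
  bin="${1:-ollama}"
  host="${2:-http://localhost:11434}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "checkpoint 1 FAILED: $bin not installed"
    return 1
  fi
  if ! curl -sf "$host/api/tags" >/dev/null 2>&1; then
    echo "checkpoint 2 FAILED: no server answering at $host"
    return 2
  fi
  echo "checkpoints 1-2 OK: pull and run a model next"
}

# || true so a failed checkpoint doesn't abort a calling script
check_ollama || true
```

If both checkpoints pass, continue with the pull and run checks by hand, since model downloads are too slow to automate usefully here.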
Installation Failures {#installation-failures}
Windows: "ollama" is not recognized
The installer finished, but your terminal cannot find the binary. Three possible causes.
Cause 1: PATH not updated. The Ollama installer adds itself to the system PATH, but open terminal sessions do not pick up PATH changes automatically. Close every terminal window and open a fresh one. If you are using Windows Terminal, close the entire app — not just the tab.
Cause 2: Installer failed silently. Re-download from ollama.com/download/windows and run the installer as Administrator. Right-click the .exe, select "Run as administrator."
Cause 3: Antivirus quarantined the binary. Windows Defender sometimes flags Ollama. Check your quarantine list:
# Check Windows Defender quarantine
Get-MpThreatDetection | Select-Object -Last 5
# Add exclusion for Ollama
Add-MpPreference -ExclusionPath "C:\Users\$env:USERNAME\AppData\Local\Ollama"
For a complete walkthrough of Windows installation, see our Ollama Windows installation guide.
Mac: "command not found: ollama"
Homebrew install: Verify Homebrew itself works (brew --version). If yes:
brew reinstall ollama
# For Apple Silicon, ensure Homebrew path is set
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
source ~/.zprofile
Direct download install: The .app bundle puts the binary in a non-standard location. Add it manually:
sudo ln -sf /Applications/Ollama.app/Contents/Resources/ollama /usr/local/bin/ollama
Gatekeeper blocking: If macOS refuses to open Ollama because it is from an unidentified developer:
xattr -cr /Applications/Ollama.app
Full Mac setup walkthrough: Mac local AI setup guide.
Linux: "ollama: command not found"
Snap/apt install: Check that /usr/local/bin is in your PATH:
echo $PATH | tr ':' '\n' | grep -q '/usr/local/bin' && echo "OK" || echo "MISSING"
# If missing, add it
echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
curl install script failed: The official one-liner sometimes breaks on minimal server images missing curl or systemd:
# Install prerequisites first
sudo apt update && sudo apt install -y curl systemd
# Then run the installer
curl -fsSL https://ollama.com/install.sh | sh
# Verify
ollama --version
Server Won't Start {#server-wont-start}
Error: "bind: address already in use"
Port 11434 is taken. Another Ollama instance is already running, or another service grabbed the port.
# Find what is using port 11434
# Linux/Mac:
lsof -i :11434
# Windows (PowerShell):
Get-NetTCPConnection -LocalPort 11434 | Select-Object OwningProcess
Get-Process -Id (Get-NetTCPConnection -LocalPort 11434).OwningProcess
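If lsof is not installed, bash's /dev/tcp pseudo-device gives a quick sketch of the same check (this is a bash feature, not POSIX sh, and it only tells you the port is taken, not by whom):

```shell
#!/usr/bin/env bash
# Succeeds only if something accepts a TCP connection on the port.
port_in_use() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "port $1: in use"
  else
    echo "port $1: free"
  fi
}

port_in_use 11434
```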
Fix 1: Kill the existing process.
# Linux/Mac
pkill -f ollama
# Wait 2 seconds, then start again
ollama serve
# Windows
taskkill /F /IM ollama.exe
ollama serve
Fix 2: Run on a different port.
# Linux/Mac
OLLAMA_HOST=127.0.0.1:11435 ollama serve
# Then tell the client about the new port
export OLLAMA_HOST=127.0.0.1:11435
ollama list
Server starts then immediately exits
Check the logs for the actual error:
# Linux (systemd)
journalctl -u ollama -n 50 --no-pager
# Mac (Homebrew service)
cat ~/.ollama/logs/server.log
# Windows
type "%LOCALAPPDATA%\Ollama\server.log"
Common causes: corrupted model files (delete ~/.ollama/models and re-pull), or NVIDIA driver mismatch on Linux (covered in GPU section below).
Download and Network Issues {#download-network-issues}
Pull hangs or times out
# Test basic connectivity
curl -v https://registry.ollama.ai/v2/
# If behind a proxy, set the environment variables
export HTTP_PROXY=http://your-proxy:8080
export HTTPS_PROXY=http://your-proxy:8080
export NO_PROXY=localhost,127.0.0.1
# Then retry
ollama pull llama3.2
Download speed is extremely slow
# Rough check — the registry response is tiny, so this measures latency more than real bandwidth
curl -o /dev/null -w "Speed: %{speed_download} bytes/sec\n" https://registry.ollama.ai/v2/
# Try a different DNS
# Linux:
sudo bash -c 'echo "nameserver 1.1.1.1" > /etc/resolv.conf'
# Mac:
sudo networksetup -setdnsservers Wi-Fi 1.1.1.1 8.8.8.8
Corporate firewall blocks the registry
Your IT department may block registry.ollama.ai. Two workarounds:
Option 1: Download the model on a personal network, then copy it:
# On unrestricted machine
ollama pull llama3.2
# Copy the model blob
# Archive the model store with relative paths (-C) so it extracts cleanly
tar -czf llama3.2-model.tar.gz -C ~ .ollama/models
# Transfer to restricted machine and extract into the home directory
tar -xzf llama3.2-model.tar.gz -C ~/
Option 2: Import a GGUF file directly (download from HuggingFace, which may not be blocked):
# Create a Modelfile
echo 'FROM ./llama-3.2-3b.Q4_K_M.gguf' > Modelfile
ollama create my-llama -f Modelfile
For ongoing issues, check Ollama's GitHub issues — connectivity problems sometimes stem from registry outages.
Storage Problems {#storage-problems}
"not enough disk space" or download fails partway
Models are larger than you think. Here is what you need:
| Model | Download Size | Disk After Install |
|---|---|---|
| llama3.2:3b | 2.0 GB | 2.0 GB |
| llama3.1:8b | 4.7 GB | 4.7 GB |
| llama3.3:70b | 40 GB | 40 GB |
| mixtral:8x7b | 26 GB | 26 GB |
| deepseek-r1:32b | 19 GB | 19 GB |
Check your available space:
# Linux/Mac
df -h ~/.ollama
# Windows (PowerShell)
Get-PSDrive C | Select-Object Used, Free
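Putting the size table and the space check together, here is a sketch that fails fast before a large pull. The 40 GB threshold is the llama3.3:70b figure from the table; substitute the size of whatever model you intend to pull:

```shell
#!/usr/bin/env sh
# Compare free space on the filesystem holding the home directory
# (where ~/.ollama lives by default) against a model's download size.
need_gb=40   # llama3.3:70b, from the table above
free_kb=$(df -Pk "$HOME" | awk 'NR==2 {print $4}')
free_gb=$((free_kb / 1024 / 1024))
if [ "$free_gb" -lt "$need_gb" ]; then
  echo "NOT ENOUGH SPACE: need ${need_gb} GB, have ${free_gb} GB free"
else
  echo "OK: ${free_gb} GB free for a ${need_gb} GB pull"
fi
```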
Move the model directory to a bigger drive:
# Linux/Mac — move to external or secondary drive
mv ~/.ollama /mnt/big-drive/ollama
ln -s /mnt/big-drive/ollama ~/.ollama
# Windows — set environment variable
# System Settings → Environment Variables → New
# Variable: OLLAMA_MODELS
# Value: D:\OllamaModels
Clean up old models
# List all models with sizes
ollama list
# Remove models you no longer need
ollama rm mixtral:8x7b
ollama rm codellama:34b
# Check recovered space
du -sh ~/.ollama/models/
Memory Errors {#memory-errors}
"model requires more system memory"
This is the single most common error. The model needs more RAM than your system has available. Not total RAM — available RAM.
Check available memory:
# Linux
free -h | grep Mem
# Mac — vm_stat reports pages; page size is 4096 on Intel, 16384 on Apple Silicon
vm_stat | perl -ne '/Pages free:\s+(\d+)/ && printf "Free: %.1f GB\n", $1*4096/1073741824'
# Windows (PowerShell) — FreePhysicalMemory is reported in KB, so this prints GB
(Get-CimInstance Win32_OperatingSystem).FreePhysicalMemory / 1MB
Fix 1: Close applications. Browsers are the biggest offenders. Chrome with 20 tabs can consume 4-8 GB. Close it. Close Slack, Teams, Docker Desktop — anything you are not actively using.
Fix 2: Use a smaller model. This is not a workaround — it is the correct solution if your hardware cannot support the model. Consult our system requirements guide for exact RAM-to-model mappings.
| Available RAM | Maximum Model Size |
|---|---|
| 4 GB | 1B-3B (phi3:mini, llama3.2:1b) |
| 8 GB | 3B-7B (llama3.2:3b, gemma2:2b) |
| 16 GB | 7B-13B (llama3.1:8b, mistral) |
| 32 GB | 13B-30B (gemma2:27b, deepseek-r1:14b) |
| 64 GB | 30B-70B (llama3.3:70b, mixtral:8x7b) |
Fix 3: Use a more aggressively quantized version. Q4_0 uses roughly half the memory of Q8_0:
# Instead of the default (usually Q4_K_M), pull an explicit quantization.
# Exact tag names vary by model — check the Tags page on ollama.com
ollama pull llama3.1:8b-instruct-q4_0
# Or even smaller — Q2 is lossy but runs on tight systems
ollama pull llama3.1:8b-instruct-q2_K
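The RAM table above follows from a back-of-envelope rule: weight memory in GB is roughly parameters (in billions) times bits per weight, divided by 8, with KV cache and runtime overhead on top. A quick sketch of the arithmetic:

```shell
#!/usr/bin/env sh
# Rough weight-memory estimate: params (billions) x bits per weight / 8 = GB.
# This also shows why Q4 needs about half the memory of Q8.
model_gb() { awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'; }

model_gb 8 4    # 8B model at 4-bit quantization -> prints 4.0
model_gb 8 8    # same model at 8-bit           -> prints 8.0
model_gb 70 4   # 70B model at 4-bit            -> prints 35.0
```

The estimate covers weights only; budget extra headroom for context and the OS.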
Fix 4: Reduce context length. Larger context windows eat RAM. The default 2048 tokens is usually fine. Set it from inside an interactive session:
ollama run llama3.2
# Then, at the >>> prompt:
/set parameter num_ctx 1024
Model Load Failures {#model-load-failures}
"failed to load model"
This error has at least five different root causes. Work through them in order.
Cause 1: Corrupted download. Delete and re-pull.
ollama rm llama3.2
ollama pull llama3.2
Cause 2: Incompatible GGUF file. If you created the model from a manually downloaded GGUF, the file may use a quantization format your Ollama version does not support. Update Ollama:
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Mac
brew upgrade ollama
# Windows — download latest installer from ollama.com
Cause 3: Permission denied. The model directory has wrong ownership (common after running Ollama as root, then as a regular user):
# Linux/Mac
sudo chown -R $(whoami) ~/.ollama
chmod -R 755 ~/.ollama/models
Cause 4: Disk full mid-download. The model file is incomplete. Remove it and re-download:
# Nuclear option — clear all models and re-download what you need
rm -rf ~/.ollama/models/blobs/*
ollama pull llama3.2
Cause 5: CUDA/ROCm library mismatch (GPU systems only). See the GPU section below.
GPU Detection Issues {#gpu-detection-issues}
"GPU not found" / "no compatible GPUs were discovered"
NVIDIA on Linux — the driver dance:
# Check if NVIDIA driver is loaded
nvidia-smi
# If "command not found" — install the driver
sudo apt install -y nvidia-driver-550
sudo reboot
# If nvidia-smi works but Ollama ignores GPU — check CUDA
ls /usr/local/cuda/lib64/libcudart*
# Ollama bundles its own CUDA runtime since v0.3.0
# But driver version must be >= 525.60.13
nvidia-smi | grep "Driver Version"
AMD on Linux:
# ROCm must be installed
rocm-smi
# If missing, install ROCm 6.x
# See: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/
# Verify Ollama sees the GPU
OLLAMA_DEBUG=1 ollama serve 2>&1 | grep -i "gpu\|rocm\|amd"
Windows — CUDA version mismatch:
The most common Windows GPU issue is having an old NVIDIA driver. Ollama requires driver 525+ for CUDA 12.x support.
# Check driver version
nvidia-smi
# If driver is < 525, update from nvidia.com/drivers
# After updating, restart your machine (not just Ollama)
Mac — no discrete GPU needed. Apple Silicon uses Metal automatically. If you see GPU errors on Mac, you are likely on an Intel Mac without a supported GPU. Ollama will fall back to CPU, which is fine:
# Force CPU mode if Metal is causing issues
OLLAMA_METAL=0 ollama serve
"CUDA out of memory"
Your GPU does not have enough VRAM for the model. Unlike system RAM, you cannot use swap space for VRAM.
# Check GPU VRAM
nvidia-smi --query-gpu=memory.total,memory.used,memory.free --format=csv
# Offload some layers to CPU (slower but works).
# num_gpu is a session parameter, set from inside an interactive run:
ollama run llama3.2
# Then, at the >>> prompt:
/set parameter num_gpu 20
# Reduce the number until it fits. 0 = full CPU.
Context Length Errors {#context-length-errors}
"context length exceeded" / "requested context length is too large"
You asked for more context than the model supports or your hardware can handle.
# Check a model's context length (listed in the output)
ollama show llama3.2
# Set an explicit context length from inside a session:
ollama run llama3.2
# Then, at the >>> prompt:
/set parameter num_ctx 4096
# For longer context, you need more RAM:
# 4096 tokens ≈ +0.5 GB RAM
# 8192 tokens ≈ +1 GB RAM
# 32768 tokens ≈ +4 GB RAM
# 131072 tokens ≈ +16 GB RAM
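Those rules of thumb work out to roughly 0.125 MB of extra RAM per token, so a quick estimate for any context size is tokens divided by 8 (approximate, and it varies by model architecture):

```shell
#!/usr/bin/env sh
# ~0.5 GB per 4096 tokens => roughly tokens/8 MB of extra RAM.
ctx_ram_mb() { echo $(( $1 / 8 )); }

ctx_ram_mb 4096     # prints 512   (~0.5 GB)
ctx_ram_mb 32768    # prints 4096  (~4 GB)
ctx_ram_mb 131072   # prints 16384 (~16 GB)
```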
If you routinely need long context, pick a model designed for it. Llama 3.2 supports up to 128K context, but you need the RAM to back it up.
Bad Output Quality {#bad-output-quality}
Model returns gibberish or incoherent text
Cause 1: Wrong quantization. Ultra-low quantization (Q2_K, IQ1) degrades output quality significantly. Step up to Q4_K_M or higher:
ollama rm llama3.1:8b-instruct-q2_K
ollama pull llama3.1:8b-instruct-q4_K_M
Cause 2: Model too small for the task. A 1B parameter model will not write coherent long-form text. It is not broken — it is limited. Move to 7B or larger.
Cause 3: Corrupted model file. If output was previously fine and suddenly degraded, the model file may be partially corrupted:
ollama rm llama3.2
ollama pull llama3.2
Cause 4: Bad system prompt in Modelfile. If you created a custom model, a poorly written system prompt can derail output. Test with the base model first to isolate the issue:
# Test base model (no custom Modelfile)
ollama run llama3.2 "Summarize the benefits of exercise"
# If base model works fine, the problem is your Modelfile
Model ignores instructions or gives wrong language
This usually means you are using a model that was not trained for your task. Some models are English-only, some are code-only. Check the model card on the Ollama library.
Performance Problems {#performance-problems}
Tokens per second is painfully slow
If inference runs but at 2-3 tokens/second when you expected 20+, the problem is almost always one of these:
Problem 1: Model is running on CPU instead of GPU. Verify GPU is being used:
# Check during inference
# Linux NVIDIA:
watch -n 1 nvidia-smi
# Mac:
sudo powermetrics --samplers gpu_power -i 1000 -n 3
# If GPU utilization is 0% during inference, GPU is not being used
# Restart Ollama with debug logging:
OLLAMA_DEBUG=1 ollama serve
Problem 2: Model is too large and spilling to CPU. When a model does not fully fit in VRAM, Ollama splits layers between GPU and CPU. The CPU layers are dramatically slower. Either use a smaller model or reduce layers on GPU until throughput stabilizes:
# See how many layers loaded to GPU vs CPU
OLLAMA_DEBUG=1 ollama run llama3.2 "test" 2>&1 | grep -i "layer\|gpu\|cpu"
Problem 3: Other processes hogging GPU. Common culprit: a web browser using hardware acceleration, or a stuck CUDA process:
# Linux/Windows — find GPU-hogging processes
nvidia-smi
# Kill stuck processes
sudo kill -9 <PID>
Problem 4: Thermal throttling. Laptops throttle GPU and CPU when they get hot. Check temperatures:
# Linux
sensors | grep -i temp
# Mac
sudo powermetrics --samplers thermal -i 1000 -n 1
For a deep dive on every performance optimization, see our dedicated slow local LLM fix guide.
Platform-Specific Issues {#platform-specific-issues}
Windows-Specific Problems
WSL2 vs Native: If you installed Ollama inside WSL2, it may not see your GPU. You need the NVIDIA CUDA driver for WSL:
# Check if WSL sees GPU
wsl nvidia-smi
# If not, install CUDA WSL driver from:
# https://developer.nvidia.com/cuda/wsl
Windows Defender real-time scanning slows model loading dramatically. Exclude the Ollama directory:
Add-MpPreference -ExclusionPath "C:\Users\$env:USERNAME\.ollama"
Add-MpPreference -ExclusionProcess "ollama.exe"
Long path names: Windows has a 260-character path limit by default. If model paths are deep, enable long paths:
# Run as Administrator
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force
Mac-Specific Problems
Rosetta 2 overhead on M1: If you installed the x86 version of Ollama through an x86 Homebrew installation, it runs through Rosetta and loses 30-40% performance:
# Check architecture
file $(which ollama)
# Should say "arm64" not "x86_64"
# If x86, reinstall with native Homebrew
arch -arm64 brew reinstall ollama
macOS memory pressure: When macOS reports memory pressure as "yellow" or "red," Ollama gets throttled by the OS before it runs out of actual RAM:
# Check memory pressure
memory_pressure -S
# If "WARNING" — close apps and reduce model size
# Activity Monitor → Memory tab → sort by Memory to find hogs
Linux-Specific Problems
SELinux blocking Ollama:
# Check for SELinux denials (needs root)
sudo ausearch -m avc -ts recent | grep ollama
# Quick fix (permissive for Ollama only)
sudo semanage permissive -a ollama_t
# Or disable SELinux temporarily for testing
sudo setenforce 0
systemd service fails:
# Check service status
systemctl status ollama
# Common fix: the ollama user does not have GPU access
sudo usermod -aG video ollama
sudo usermod -aG render ollama
sudo systemctl restart ollama
NVIDIA driver mismatch after kernel update:
# After a kernel update, NVIDIA modules need rebuilding
sudo apt install --reinstall nvidia-driver-550
# Or if using DKMS:
sudo dkms autoinstall
sudo reboot
Environment Variable Reference {#env-var-reference}
When nothing else works, these environment variables control Ollama's behavior at a granular level. Set them before running ollama serve:
| Variable | Purpose | Example |
|---|---|---|
| OLLAMA_HOST | Bind address and port | 127.0.0.1:11435 |
| OLLAMA_MODELS | Custom model storage path | /mnt/ssd/models |
| OLLAMA_NUM_PARALLEL | Concurrent request limit | 2 |
| OLLAMA_MAX_LOADED_MODELS | Models kept in memory | 1 |
| OLLAMA_DEBUG | Verbose logging | 1 |
| OLLAMA_METAL | Enable/disable Metal (Mac) | 0 or 1 |
| CUDA_VISIBLE_DEVICES | Select specific GPU | 0 or 0,1 |
| OLLAMA_KEEP_ALIVE | Model unload timeout | 5m or 0 |
| OLLAMA_FLASH_ATTENTION | Enable flash attention | 1 |
# Example: start Ollama with debug logging, custom port, single model loaded
OLLAMA_DEBUG=1 OLLAMA_HOST=0.0.0.0:11435 OLLAMA_MAX_LOADED_MODELS=1 ollama serve
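On Linux systemd installs, variables exported in your shell never reach the service. Persist them with a service override instead; the path and timeout values below are examples:

```shell
# Open an override file for the Ollama service:
sudo systemctl edit ollama
# In the editor that opens, add lines like these:
#   [Service]
#   Environment="OLLAMA_MODELS=/mnt/ssd/models"
#   Environment="OLLAMA_KEEP_ALIVE=10m"
# Then apply the change:
sudo systemctl daemon-reload
sudo systemctl restart ollama
```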
Nuclear Options: Full Reset {#nuclear-options}
If nothing above works and you want to start completely clean:
# 1. Stop Ollama
pkill -f ollama # Linux/Mac
taskkill /F /IM ollama.exe # Windows
# 2. Remove all data
rm -rf ~/.ollama # Linux/Mac
# Windows:
# Delete: %LOCALAPPDATA%\Ollama
# Delete: %USERPROFILE%\.ollama
# 3. Uninstall
# Linux:
sudo rm /usr/local/bin/ollama
# Mac:
brew uninstall ollama
# Windows:
# Use Add/Remove Programs
# 4. Reinstall fresh
# Linux:
curl -fsSL https://ollama.com/install.sh | sh
# Mac:
brew install ollama
# Windows:
# Download from ollama.com/download/windows
# 5. Test
ollama --version
ollama pull llama3.2:3b
ollama run llama3.2:3b "Hello, is this working?"
Getting More Help
If you have worked through every section above and your issue persists:
- Collect debug logs: Run OLLAMA_DEBUG=1 ollama serve and capture the full output.
- Check GitHub issues: Search github.com/ollama/ollama/issues for your specific error message.
- Post a bug report: Include your OS, Ollama version (ollama --version), hardware specs, and the exact error message with debug logs.
Most issues boil down to one of three things: not enough RAM, wrong GPU drivers, or network problems. Fix those and Ollama runs reliably for months at a time.
Running into performance problems specifically? Our local LLM slow fix guide covers every optimization technique from quantization selection to GPU layer splitting. For hardware planning, check the Ollama system requirements page.