Want to go deeper than this article?
The AI Learning Path covers this topic and more — hands-on chapters across 10 courses.
Ollama Version History & Changelog: Every Release Explained
Published on April 10, 2026 — 18 min read
Quick check: Run ollama --version right now. If it says anything below 0.6, you are missing critical performance improvements and model support. This page tracks every Ollama release so you can decide whether to update and what to expect when you do.
What you will find here:
- Current stable version and what shipped in it
- Full release timeline from v0.1.0 to present
- Breaking changes that might affect your setup
- How to check, update, and roll back versions
- Version comparison table with key features
Ollama moves fast. The project has shipped over 40 releases since its first public beta, and the pace has only accelerated. Some releases add model support. Others overhaul the inference engine or change API behavior. Knowing what changed — and what broke — saves you hours of debugging.
If you are setting up Ollama for the first time, start with our complete Ollama guide instead. This page is for people who already run Ollama and want to understand the release cadence.
Table of Contents
- How to Check Your Ollama Version
- Current Stable Release
- How to Update Ollama
- Complete Version Timeline
- Breaking Changes by Version
- Version Comparison Table
- How to Roll Back to a Previous Version
- Release Cadence and Roadmap
- Troubleshooting Update Issues
- FAQ
How to Check Your Ollama Version {#check-version}
Three ways to find your current version:
Command Line
# Primary method
ollama --version
# Output: ollama version is 0.6.2
# Alternative — the short flag
ollama -v
API Endpoint
# If Ollama is running as a service
curl http://localhost:11434/api/version
# Output: {"version":"0.6.2"}
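If a script needs the bare version string, you can strip the JSON wrapper without jq. A minimal sketch, assuming the endpoint returns the `{"version":"..."}` shape shown above (the hardcoded string stands in for the live curl call):

```shell
# Extract the bare version string from the /api/version JSON response.
# The hardcoded string stands in for: curl -s http://localhost:11434/api/version
json='{"version":"0.6.2"}'
version=$(printf '%s' "$json" | sed -E 's/.*"version":"([^"]+)".*/\1/')
echo "$version"   # prints 0.6.2
```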
System-Specific Locations
# macOS — check the app bundle
mdls -name kMDItemVersion /Applications/Ollama.app
# Linux — check the binary directly
ollama --version
# Windows — PowerShell
ollama --version
# Or check: Settings → Apps → Ollama
If ollama --version returns nothing or errors out, your installation is older than v0.1.17 (the version that added the flag) or the binary is not on your PATH. See the troubleshooting section below.
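Scripts that depend on newer features can gate on a minimum version. A sketch using `sort -V` for semantic-version ordering — `version_at_least` is a hypothetical helper, not an Ollama command:

```shell
# Succeeds when version $1 is at least $2, using sort -V semver ordering.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example: gate on the v0.6.0 feature set. Falls back to 0.0.0 when the
# ollama binary is missing or prints nothing parseable.
installed=$(ollama --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' || echo "0.0.0")
if version_at_least "$installed" "0.6.0"; then
  echo "v0.6 features available"
else
  echo "update recommended"
fi
```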
Current Stable Release {#current-release}
Ollama v0.6.2 (March 2026)
The latest stable release as of April 2026. Key additions:
New model support:
- Llama 4 Scout and Maverick (Meta's latest mixture-of-experts family)
- Qwen 3 (all sizes from 0.6B to 235B)
- Gemma 3 QAT variants with improved quantization
- Command A (Cohere's 111B parameter model)
Performance improvements:
- 18% faster prompt processing on NVIDIA GPUs via improved KV cache management
- Flash Attention v2.7 integration for AMD ROCm 6.3+
- Apple Metal 3 optimizations for M4 Pro/Max/Ultra chips
- Reduced cold-start time by 400ms on average
API changes:
- New `/api/embed` endpoint for batch embedding (up to 512 texts per call)
- Structured output via `format: "json"` now supports JSON Schema constraints
- Streaming responses include token-level timing metadata
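To batch your own texts against the new endpoint, the request body can be assembled in plain shell. A sketch — the `{"model": ..., "input": [...]}` payload shape is an assumption based on these release notes, not verified API documentation:

```shell
# Build a JSON body for a batch embedding call (payload shape assumed).
build_embed_body() {
  model="$1"; shift
  inputs=$(printf '"%s",' "$@")       # naive quoting; real texts need escaping
  printf '{"model":"%s","input":[%s]}' "$model" "${inputs%,}"
}

body=$(build_embed_body "nomic-embed-text" "first text" "second text")
echo "$body"
# prints {"model":"nomic-embed-text","input":["first text","second text"]}

# Send it (assuming a local server):
# curl -s http://localhost:11434/api/embed -d "$body"
```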
Bug fixes:
- Fixed memory leak in multi-model concurrent serving
- Resolved CUDA 12.6 compatibility issue on RTX 5090
- Fixed model import failing silently for GGUF files over 20GB
# Update to v0.6.2
# macOS (Homebrew)
brew upgrade ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows — download installer from ollama.com/download
How to Update Ollama {#update-ollama}
macOS
# Homebrew (recommended)
brew update && brew upgrade ollama
# Verify update
ollama --version
# If using the .app bundle, it auto-updates on launch.
# Force a manual check:
open -a Ollama
# Click the menu bar icon → Check for Updates
Linux
# Official install script (always fetches latest)
curl -fsSL https://ollama.com/install.sh | sh
# If you installed via snap
sudo snap refresh ollama
# systemd users — restart the service after update
sudo systemctl restart ollama
Windows
# Download latest installer
# https://ollama.com/download/windows
# Or via winget
winget upgrade Ollama.Ollama
# Restart the Ollama service
net stop ollama && net start ollama
Docker
# Pull latest image
docker pull ollama/ollama:latest
# Stop and remove old container
docker stop ollama && docker rm ollama
# Start with new image
docker run -d --gpus all -v ollama:/root/.ollama \
-p 11434:11434 --name ollama ollama/ollama:latest
Important: Your models survive updates. Ollama stores models in ~/.ollama/models (Linux/macOS) or C:\Users\<you>\.ollama\models (Windows). Updating the binary does not delete them. Docker users should mount a volume as shown above.
Complete Version Timeline {#version-timeline}
2026 Releases
v0.6.2 — March 2026: Llama 4 support, batch embedding API, Flash Attention v2.7, M4 Metal 3 optimizations. See current release for full details.
v0.6.1 — February 2026
- DeepSeek R1 distilled variants support
- Fixed AMD ROCm 6.2 regression on RX 7900 XTX
- New `ollama cp` command to duplicate models locally
- Reduced disk usage during model pulls by ~30%
v0.6.0 — January 2026
- Major change: Switched default quantization from Q4_0 to Q4_K_M for new pulls
- Vision model support expanded (LLaVA 1.7, Qwen2-VL, InternVL2.5)
- Tool calling / function calling for compatible models
- Parallel request handling (up to 4 concurrent by default)
- ARM64 Windows builds (Snapdragon X Elite support)
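The concurrency default can be raised via the server environment. A configuration sketch — `OLLAMA_NUM_PARALLEL` is the variable recent builds read for this, but treat the value as illustrative and confirm against `ollama serve --help` on your build:

```shell
# Allow 8 concurrent requests instead of the v0.6.0 default of 4.
export OLLAMA_NUM_PARALLEL=8
ollama serve

# systemd users set it on the service instead:
#   sudo systemctl edit ollama
#   [Service]
#   Environment="OLLAMA_NUM_PARALLEL=8"
# then: sudo systemctl restart ollama
```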
2025 Releases
v0.5.x Series (July–December 2025)
- v0.5.11: NVIDIA RTX 5090 support (Day 1)
- v0.5.9: Structured JSON output via `format` parameter
- v0.5.7: Speculative decoding for 2x faster generation on multi-GPU
- v0.5.4: AMD ROCm 6.1 full support, including RX 7600
- v0.5.0: Breaking change — API response format changed for `/api/chat`. The `context` field was removed in favor of server-side session management.
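The v0.5.0 migration is mechanical: instead of round-tripping the opaque `context` token array, resend the running conversation in a `messages` array. A sketch of the post-v0.5.0 request body — model name and message content are illustrative:

```shell
# Post-v0.5.0 /api/chat body: history lives in "messages", not "context".
body=$(cat <<'EOF'
{"model":"llama3.2","messages":[
  {"role":"user","content":"What is Ollama?"},
  {"role":"assistant","content":"A local LLM runtime."},
  {"role":"user","content":"How do I update it?"}
]}
EOF
)
echo "$body"
# Send it (assuming a local server):
# curl -s http://localhost:11434/api/chat -d "$body"
```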
v0.4.x Series (March–June 2025)
- v0.4.7: Apple M3 Ultra optimized Metal shaders
- v0.4.5: `ollama show` command for model metadata inspection
- v0.4.2: Multimodal model support (LLaVA, BakLLaVA)
- v0.4.0: Breaking change — Model storage format migrated from blob-based to content-addressable. First run after update triggers automatic migration (can take 5–15 minutes depending on model count).
v0.3.x Series (October 2024–February 2025)
- v0.3.12: GGUF v3 format support
- v0.3.9: `ollama create` from Safetensors (no manual conversion needed)
- v0.3.6: AMD ROCm 5.7 support, Radeon RX 7900 XTX validated
- v0.3.0: Breaking change — Modelfile syntax updated. The `ADAPTER` command replaced the `FROM ... ADAPTER` pattern.
v0.2.x Series (May–September 2024)
- v0.2.8: GPU layer offloading with `num_gpu` parameter
- v0.2.5: OpenAI-compatible API endpoint at `/v1/chat/completions`
- v0.2.0: Custom model creation via Modelfiles. SYSTEM, TEMPLATE, and PARAMETER directives introduced.
v0.1.x Series (Initial Release–April 2024)
- v0.1.29: First Windows release
- v0.1.17: Added `ollama --version` flag
- v0.1.0: Initial public release. macOS and Linux only. Supported GGUF models via llama.cpp backend.
Breaking Changes by Version {#breaking-changes}
These are the releases where something in your workflow might stop working after an update. I maintain this list because the official release notes sometimes bury breaking changes in minor bullet points.
| Version | What Broke | Migration |
|---|---|---|
| v0.6.0 | Default quantization changed to Q4_K_M | Existing models unaffected. New pulls use Q4_K_M. Force old behavior: ollama pull model:q4_0 |
| v0.5.0 | context field removed from chat API | Use conversation history instead of passing context tokens. See migration guide. |
| v0.4.0 | Model storage format migration | Automatic on first run. Back up ~/.ollama before updating if you have custom models. |
| v0.3.0 | Modelfile syntax change | Replace FROM base ADAPTER lora.gguf with separate FROM and ADAPTER lines. |
| v0.2.0 | CLI argument changes | --model flag replaced by positional argument. Old: ollama run --model llama2. New: ollama run llama2. |
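The v0.3.0 Modelfile change in the table above is easy to script. A sketch using GNU sed (which allows `\n` in the replacement); the file contents are hypothetical:

```shell
# Rewrite the pre-v0.3.0 one-line "FROM base ADAPTER lora.gguf" form into
# the separate FROM / ADAPTER lines required from v0.3.0 onward.
migrate_modelfile() {
  sed -E 's/^FROM[[:space:]]+([^[:space:]]+)[[:space:]]+ADAPTER[[:space:]]+(.+)$/FROM \1\nADAPTER \2/'
}

printf 'FROM llama2 ADAPTER lora.gguf\n' | migrate_modelfile
# prints:
# FROM llama2
# ADAPTER lora.gguf
```

Lines without the old pattern pass through untouched, so it is safe to run over a whole Modelfile.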
If you run Ollama behind an application (like Open WebUI or Continue.dev), check that your client version supports the Ollama version you are upgrading to. Open WebUI v0.5+ works with Ollama v0.5+.
Version Comparison Table {#version-comparison}
This table covers features across major version milestones. Use it to decide what minimum version you need.
| Feature | v0.1 | v0.2 | v0.3 | v0.4 | v0.5 | v0.6 |
|---|---|---|---|---|---|---|
| macOS support | Yes | Yes | Yes | Yes | Yes | Yes |
| Linux support | Yes | Yes | Yes | Yes | Yes | Yes |
| Windows support | v0.1.29+ | Yes | Yes | Yes | Yes | Yes |
| Custom Modelfiles | No | Yes | Yes | Yes | Yes | Yes |
| GPU offloading | No | v0.2.8+ | Yes | Yes | Yes | Yes |
| OpenAI-compatible API | No | v0.2.5+ | Yes | Yes | Yes | Yes |
| AMD ROCm support | No | No | v0.3.6+ | Yes | Yes | Yes |
| Vision models | No | No | No | v0.4.2+ | Yes | Yes |
| Structured JSON output | No | No | No | No | v0.5.9+ | Yes |
| Tool / function calling | No | No | No | No | No | v0.6.0+ |
| Batch embedding API | No | No | No | No | No | v0.6.2+ |
| Concurrent requests | 1 | 1 | 1 | 1 | 2 | 4 |
| Max GGUF version | v1 | v2 | v3 | v3 | v3 | v3 |
How to Roll Back to a Previous Version {#rollback}
Sometimes an update breaks your workflow. Here is how to downgrade safely.
macOS (Homebrew)
# List available versions
brew search ollama
# Install a specific version (example: v0.5.11)
brew install ollama@0.5.11
# If that formula does not exist, install from the GitHub release:
curl -L https://github.com/ollama/ollama/releases/download/v0.5.11/Ollama-darwin.zip \
-o ~/Downloads/Ollama-0.5.11.zip
unzip ~/Downloads/Ollama-0.5.11.zip -d /Applications/
Linux
# Download a specific version binary (sudo needed to write to /usr/local/bin)
sudo curl -L https://github.com/ollama/ollama/releases/download/v0.5.11/ollama-linux-amd64 \
  -o /usr/local/bin/ollama
sudo chmod +x /usr/local/bin/ollama
# Restart the service
sudo systemctl restart ollama
# Verify
ollama --version
Docker
# Use a specific tag instead of :latest
docker pull ollama/ollama:0.5.11
docker stop ollama && docker rm ollama
docker run -d --gpus all -v ollama:/root/.ollama \
-p 11434:11434 --name ollama ollama/ollama:0.5.11
Pinning a Version (Preventing Auto-Updates)
# macOS — pin the Homebrew formula
brew pin ollama
# Linux — hold the package if installed via apt
sudo apt-mark hold ollama
# Docker — always use a specific tag, never :latest
Warning: Rolling back from v0.4.0+ to v0.3.x requires restoring ~/.ollama from a backup made before the upgrade, because the storage format migration in v0.4.0 is one-way. Always back up before major version jumps.
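That backup can be scripted so it happens before every major jump. A sketch — `backup_ollama` is a hypothetical helper; pass your real data directory (default `~/.ollama`):

```shell
# Archive an Ollama data directory into a dated tarball in the current dir.
backup_ollama() {
  dir="$1"
  out="ollama-backup-$(date +%Y%m%d).tar.gz"
  tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")" && echo "$out"
}

# Usage before a major version jump:
# backup_ollama "$HOME/.ollama"
```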
Release Cadence and Roadmap {#release-cadence}
Ollama follows a roughly 2-week release cycle for minor versions and a 2-3 month cycle for major versions. The project does not publish a public roadmap, but based on GitHub issues and contributor discussions, here is what is likely coming:
Expected in v0.7 (estimated Q3 2026):
- Native support for GGUF v4 format
- Multi-node distributed inference (running one model across multiple machines)
- Built-in model quantization (convert FP16 to GGUF without external tools)
- Improved Windows ARM64 performance
Community-requested features with active PRs:
- Model sharding across CPU + GPU (partial offload improvements)
- Native LoRA adapter hot-swapping without rebuilding the model
- gRPC API alongside REST for lower-latency applications
You can track development at github.com/ollama/ollama/releases and the official Ollama website.
Troubleshooting Update Issues {#troubleshooting}
"ollama: command not found" after update
# macOS — Homebrew may have changed the symlink
brew unlink ollama && brew link ollama
# Linux — verify the binary path
which ollama
# If empty, re-run the install script
curl -fsSL https://ollama.com/install.sh | sh
# Windows — restart your terminal or reboot
# The installer adds Ollama to PATH, but existing terminals
# do not pick it up until reopened
Models disappear after update
Models should survive updates. If they are gone:
# Check if the models directory still exists
ls -la ~/.ollama/models/
# If it is empty, the update may have changed OLLAMA_MODELS path
# Check your environment
echo $OLLAMA_MODELS
# Re-pull missing models
ollama pull llama3.2
Service will not start after update
# Linux — check systemd logs
journalctl -u ollama -n 50
# macOS — check launchd logs
log show --predicate 'process == "ollama"' --last 5m
# Common fix: port conflict from old process
lsof -i :11434
kill -9 <PID>
ollama serve
CUDA / ROCm errors after update
# Verify your GPU driver version
nvidia-smi # NVIDIA
rocm-smi # AMD
# Ollama v0.6+ requires:
# NVIDIA: Driver 535+ (CUDA 12.2+)
# AMD: ROCm 6.0+
# If your driver is too old, either:
# 1. Update your GPU driver
# 2. Roll back Ollama to a version that supports your driver
If you hit issues not covered here, check the general Ollama troubleshooting guide or the project's GitHub issues page. For platform-specific installation help, see our Windows installation guide or Mac setup guide.
Staying Current Without Breaking Things
My recommendation: update monthly, not on release day. Let the community shake out bugs for a week or two before you upgrade. Pin your version in production environments and always back up ~/.ollama before major version jumps.
Set a reminder to check your version:
# Add to your .bashrc or .zshrc
alias ollama-check='echo "Installed: $(ollama --version)" && echo "Latest: check https://github.com/ollama/ollama/releases"'
Ollama's release velocity is a strength — they ship features fast and respond to bugs quickly. But that pace means every update deserves a quick test before you trust it with production workloads.
Running Ollama for the first time? Start with the complete Ollama guide for setup instructions, or compare the best Ollama models for your hardware.