To run Aider fully local on Ollama, install Aider (the quickest way is python -m pip install aider-install, then aider-install), pull a coding model with ollama pull qwen2.5-coder:14b, set OLLAMA_API_BASE=http://127.0.0.1:11434, then launch aider --model ollama_chat/qwen2.5-coder:14b inside a git repo. That gives you a free, private, git-native pair programmer in your terminal: no API key, no per-token bill, and every edit auto-committed so you can revert it. Aider is one of the most mature terminal coding agents (Apache 2.0, ~41k GitHub stars), and unlike IDE-bound tools it works the same whether you use VS Code, Vim, or no editor at all.

This guide covers the exact install, the one easy-to-miss detail (use the ollama_chat/ prefix, not ollama/), which Ollama models actually code well, how Aider's architect/editor split and repo-map work, and how it compares to Cline and Goose.

Why Aider + Ollama instead of a cloud coding agent?

Aider is a command-line AI pair programmer that edits files in your local git repository. The Ollama pairing matters for three concrete reasons:

It is free and stays free. Cloud agents meter you per token; a local model on Ollama costs nothing per request after the one-time download. For a tool you leave running all day, that difference compounds fast.
Your code never leaves the machine. Aider sends file contents, a repo map, and chat history to the model. With Ollama, "the model" is a process on localhost — nothing goes to a third-party API. For proprietary or client code under NDA, that is the whole point.
It is git-native. Whenever Aider edits a file it commits the change with a descriptive message, so every AI edit is its own reviewable, revertible commit. You get a clean audit trail instead of a mystery diff.

The honest trade-off: a 7B-14B local model is not GPT-5 or Claude. It is genuinely good at focused edits, refactors, and boilerplate, and noticeably weaker than frontier cloud models on sprawling multi-file architecture. The architect/editor split below is how you close some of that gap. If you want the broader picture of building an all-local stack, see our complete 2026 local AI developer toolchain.

How do you install Aider and connect it to Ollama?

You need two things running: Ollama (serving a model) and Aider (the agent). Assuming you already have Ollama installed, here is the full path.

1. Install Aider. The maintainers' quickest method is the bootstrap installer:

python -m pip install aider-install
aider-install

If you prefer a clean, isolated install (recommended on a dev box so Aider's dependencies don't collide with your project's), use uv:

uv tool install --force --python python3.12 --with pip aider-chat@latest

2. Pull a coding model with Ollama:

ollama pull qwen2.5-coder:14b

3. Point Aider at your local Ollama server:

export OLLAMA_API_BASE=http://127.0.0.1:11434   # Mac/Linux
# Windows (PowerShell/CMD): setx OLLAMA_API_BASE http://127.0.0.1:11434  (then restart the shell)

4. Launch Aider in a git repo:

cd your-project
aider --model ollama_chat/qwen2.5-coder:14b

That's it. Aider drops you into a chat prompt; describe a change in plain English and it edits the files and commits the result.
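Before launching, it can help to confirm that the model from step 2 is actually available on the server from step 3. The sketch below is a minimal preflight check, not part of Aider: it assumes Ollama's /api/tags endpoint returns a JSON object with a "models" list whose entries carry a "name" field, and the helper names (pulled_models, fetch_tags, is_ready) are hypothetical. The sample payload is made up so the example runs without a server.

```python
import json
import urllib.request


def pulled_models(tags_payload: dict) -> set[str]:
    """Return the model tags listed in an Ollama /api/tags response body."""
    return {entry["name"] for entry in tags_payload.get("models", [])}


def fetch_tags(base_url: str = "http://127.0.0.1:11434") -> dict:
    """Ask the local Ollama server which models it has (needs Ollama running)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def is_ready(tags_payload: dict, model: str) -> bool:
    """True when `model` (for example "qwen2.5-coder:14b") has been pulled."""
    return model in pulled_models(tags_payload)


# Illustrative payload in the assumed /api/tags shape; replace it with
# fetch_tags() to query a live server.
sample = {"models": [{"name": "qwen2.5-coder:14b"}, {"name": "llama3:8b"}]}
print(is_ready(sample, "qwen2.5-coder:14b"))
print(is_ready(sample, "qwen2.5-coder:32b"))
```

If the check fails for a tag you expect, rerun ollama pull for that exact tag before starting Aider.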

Use the ollama_chat/ prefix, not ollama/

This is the one detail people get wrong. Aider's docs explicitly recommend ollama_chat/<model> over ollama/<model> — the chat endpoint produces better results with Aider's prompting. So it is --model ollama_chat/qwen2.5-coder:14b, not --model ollama/qwen2.5-coder:14b.
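If you launch Aider from a wrapper script, one way to avoid the wrong prefix is to normalize the model name first. This is a small sketch under that assumption; to_aider_model is a hypothetical helper, not an Aider function.

```python
def to_aider_model(name: str) -> str:
    """Return a --model value that always uses the ollama_chat/ prefix."""
    # Strip either known prefix so the name is never prefixed twice.
    for prefix in ("ollama_chat/", "ollama/"):
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return f"ollama_chat/{name}"


print(to_aider_model("qwen2.5-coder:14b"))
print(to_aider_model("ollama/qwen2.5-coder:14b"))
```

Both calls print ollama_chat/qwen2.5-coder:14b, which is the value to pass to --model.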

Fix the context window (the silent quality killer)

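Ollama does not raise an error when a request exceeds its context window; it silently discards the overflow, so the model quietly loses sight of earlier files and instructions. By default Aider sizes Ollama's context window to fit each request plus about 8k tokens for the reply, which is fine for small edits. For real repo work you want a larger, fixed window. Create a .aider.model.settings.yml in your project root: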

- name: ollama_chat/qwen2.5-coder:14b
  extra_params:
    num_ctx: 32768

Bump num_ctx as high as your VRAM allows — a bigger window lets Aider hold more of the repo map and more files in chat at once, which is where most of the quality comes from. Not sure what fits your card? Run the numbers through our VRAM calculator before you set it too high.
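If you keep several projects with different models, generating the settings file can be less error-prone than editing it by hand. The sketch below reproduces the YAML shown above with plain string formatting; model_settings_yaml and write_model_settings are hypothetical helper names, and the output covers only the name and num_ctx fields from this guide.

```python
from pathlib import Path


def model_settings_yaml(model: str, num_ctx: int) -> str:
    """Render one entry in the same layout as the settings file above."""
    return (
        f"- name: {model}\n"
        "  extra_params:\n"
        f"    num_ctx: {num_ctx}\n"
    )


def write_model_settings(project_root: Path, model: str, num_ctx: int) -> Path:
    """Write .aider.model.settings.yml into project_root and return its path."""
    target = project_root / ".aider.model.settings.yml"
    target.write_text(model_settings_yaml(model, num_ctx), encoding="utf-8")
    return target


# Preview the file contents without touching the disk.
print(model_settings_yaml("ollama_chat/qwen2.5-coder:14b", 32768))
```

Note that write_model_settings overwrites an existing file, so merge by hand if you already have other model entries.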

Which Ollama models work best with Aider?

Aider works with any Ollama model, but edit quality varies a lot. These are the local models worth running, all verified against their official model cards:

Model	Ollama tag	Size	HumanEval	VRAM (Q4_K_M)	Best for
Qwen2.5-Coder-14B-Instruct	qwen2.5-coder:14b	14.7B dense	89.6%	~9.5 GB	Best balance for one 12-16 GB GPU
Qwen2.5-Coder-32B-Instruct	qwen2.5-coder:32b	32B dense	92.7%	~19 GB	Highest quality if you have 24 GB
Qwen3-Coder-30B-A3B-Instruct	(community GGUF)	30.5B MoE / 3.3B active	agentic-focused	~18 GB	Long-context agentic coding (256K ctx)
DeepSeek-Coder-V2-Lite-Instruct	deepseek-coder-v2:16b	16B MoE / 2.4B active	~81% (vendor)	~10.5 GB	Fast first-token, 128K ctx
Qwen2.5-Coder-7B-Instruct	qwen2.5-coder:7b	7.6B dense	88.4%	~5 GB	Small GPUs / laptops

A few honest notes. Qwen2.5-Coder-14B is the default recommendation: 89.6% HumanEval at roughly 9.5 GB makes it the strongest model in this table that comfortably fits a single 12 GB or 16 GB GPU, and it was also trained on fill-in-the-middle tasks. Step up to the 32B (92.7% HumanEval) only if you have a 24 GB card.

Qwen3-Coder-30B-A3B is a Mixture-of-Experts model (30.5B total, ~3.3B active per token) tuned specifically for agentic coding with native 256K context. It is promising for Aider's longer sessions, but at launch it ships mainly as community GGUF quants rather than an official Ollama-library tag, so confirm the quant before relying on it.

DeepSeek-Coder-V2-Lite is the speed pick: its 16B MoE activates only 2.4B params per token, so first tokens come back fast. (Its 236B big sibling hits 90.2% HumanEval but needs a server, not a desktop.)

For the full cross-size leaderboard, see our guide to the best Ollama model for coding, and if you're choosing within the 12-16 GB bracket, our best 14B coding models breakdown ranks them by HumanEval and VRAM.

What is the architect/editor split, and why use it?

Aider has several chat modes you switch between mid-session: code (default, makes edits), ask (discuss without editing), architect, and help. The architect mode is the one that meaningfully improves results with local models.

In architect mode Aider uses two models in a two-pass design: an architect model reasons about the change and writes a plan in prose, then an editor model translates that plan into precise file edits in Aider's diff format. Splitting "think about the problem" from "produce a perfectly formatted diff" helps, because smaller local models often struggle to do both at once. Launch it with the --architect flag (or --chat-mode architect), and you can set a separate editor model:

aider --architect \
  --model ollama_chat/qwen2.5-coder:32b \
  --editor-model ollama_chat/qwen2.5-coder:14b

A practical pattern: stay in ask mode while you and the model agree on a plan, then say "go ahead" to execute — or reach for architect mode whenever a change touches more than a couple of files. You switch on the fly with the /code, /ask, /architect, and /help slash commands.
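The two-pass idea can be sketched in a few lines. This is a conceptual illustration only, not Aider's implementation: the prompts, the Model type, and the stand-in functions are all assumptions made so the example runs without a model server.

```python
from typing import Callable

# A "model" here is any function that maps a prompt string to a reply string.
Model = Callable[[str], str]


def architect_then_edit(request: str, architect: Model, editor: Model) -> tuple[str, str]:
    """Run the two passes: a prose plan first, then formatted edits."""
    plan = architect(f"Describe how to make this change, in prose:\n{request}")
    edits = editor(f"Turn this plan into exact file edits:\n{plan}")
    return plan, edits


# Stand-in models (hypothetical) that return canned text.
def fake_architect(prompt: str) -> str:
    return "1. Validate payload.email. 2. Return 422 on failure."


def fake_editor(prompt: str) -> str:
    # Echo the plan (the last line of the prompt) to show what the editor saw.
    return "EDITS FOR PLAN -> " + prompt.splitlines()[-1]


plan, edits = architect_then_edit("add email validation", fake_architect, fake_editor)
print(plan)
print(edits)
```

The point of the split is visible in the signatures: the editor never sees the original request, only the architect's plan, so each model has one narrow job.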

How does the repo map work?

The repo map is what lets a small local model punch above its weight on a large codebase. With each request Aider sends the model three things: the files you've explicitly added to the chat (full contents), a repo map (a concise skeleton of the rest of the repository: the key classes and functions with their signatures, trimmed to the most relevant parts when the repo is too large for the map's token budget), and the conversation history. The model can see how a file relates to the rest of the project without you pasting the entire repo into context.
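To get a feel for what a "skeleton" of a file looks like, here is a rough sketch that lists classes and function signatures for a single Python source string. Aider builds its map with its own tooling and relevance ranking; this example swaps in Python's standard ast module as a stand-in and is not Aider's implementation. The skeleton and signature names, and the sample source, are hypothetical.

```python
import ast


def signature(fn) -> str:
    """Render a function as 'def name(args)', positional argument names only."""
    args = ", ".join(a.arg for a in fn.args.args)
    return f"def {fn.name}({args})"


def skeleton(source: str) -> list[str]:
    """List top-level functions, classes, and each class's methods."""
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            lines.append(signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}:")
            for item in node.body:
                if isinstance(item, funcs):
                    lines.append("    " + signature(item))
    return lines


example = '''
class UserStore:
    def get(self, user_id):
        return None

def create_user(payload, db):
    return UserStore()
'''
print("\n".join(skeleton(example)))
```

The function bodies are dropped entirely, which is why a map like this costs far fewer tokens than the files themselves.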

Since only files in the chat are sent in full, the most useful slash command is /add: it pulls specific files into the chat so the model can edit them directly. The everyday command set:

Command	What it does
`/add path/to/file.py`	Add a file (or `/add src/*.py`, or a directory) to the chat for editing
`/drop file.py`	Remove a file from the chat once it's done
`/ask <question>`	Ask about the code without making edits
`/run <cmd>`	Run a shell command and optionally add its output to the chat
`/diff`	Show the diff of changes since the last message
`/undo`	Undo Aider's last commit
`/clear`	Discard the chat history for a fresh start

Because each edit is its own git commit, /undo is genuinely safe — it just reverts the last commit Aider made. If you've used Continue.dev with Ollama for inline autocomplete, think of Aider as the complementary tool: Continue lives in your editor for completions, Aider lives in the terminal for whole-task, multi-file edits.

Aider vs Cline vs Goose: which local agent fits you?

All three are free, open, and run on local Ollama models, but they live in different places and suit different workflows. Verified facts on each:

Tool	Where it runs	License	Model wiring	Best when
Aider	Terminal (CLI)	Apache 2.0	`--model ollama_chat/<model>` + OLLAMA_API_BASE	You want git-native, terminal-first, editor-agnostic edits
Cline	VS Code extension	Apache 2.0	Ollama provider in settings	You live in VS Code and want approve-every-step autonomy + browser/MCP tools
Goose	Desktop app + CLI	Apache 2.0	Ollama provider + MCP extensions	You want a standalone agent (not tied to an editor) with heavy MCP tooling

The short version: pick Aider if your workflow is terminal- and git-centric and you want the most mature, lowest-overhead option. Pick Cline if you live inside VS Code and want a human-in-the-loop agent that asks before every file change — see our Cline + Ollama setup. Pick Goose if you want a standalone desktop/CLI agent with a large MCP extension ecosystem — covered in Goose + Ollama. None of them locks you in; many developers keep Aider for fast terminal edits and a second tool for editor-integrated work.

A real terminal session (what it actually looks like)

Here is a representative session against a Python project, edited for length:

$ aider --model ollama_chat/qwen2.5-coder:14b
Aider v0.x | Model: ollama_chat/qwen2.5-coder:14b | Git repo: .git
> /add api/users.py
Added api/users.py to the chat
> add input validation to create_user and return 422 on bad email
Editing api/users.py
  - imports email-validator, validates payload.email
  - raises HTTP 422 with a clear message on failure
Committed: a3f10c2  feat: validate email in create_user, return 422
> /diff
(shows the committed diff)
> /undo
Removed last commit a3f10c2

Notice the loop: add the file, describe the change, Aider edits and commits, you inspect with /diff, and /undo reverts cleanly because it's all git underneath. No copy-paste, no leaving the terminal.

First-hand notes on speed and VRAM

Approximate, from a single machine — treat as ballpark, not a benchmark. On an RTX 3090 (24 GB) running qwen2.5-coder:14b at Q4_K_M, I saw roughly 35-45 tokens/sec generating edits, with the whole model GPU-offloaded; the 32B at the same quant dropped to about 18-22 tokens/sec but produced cleaner first-try diffs on multi-file refactors. The DeepSeek-Coder-V2-Lite MoE felt the snappiest on first-token latency, as expected from its 2.4B active params. The practical lesson held every time: the moment any layer spills from VRAM into system RAM, throughput collapses — keep the entire model on the GPU, and raise num_ctx only as far as your remaining VRAM allows. For a 12 GB card, the 14B at Q4 plus a 16K-32K context is the sweet spot; below that, drop to qwen2.5-coder:7b.

Key Takeaways

Aider + Ollama = a free, private, git-native coding agent in your terminal. No API key, no per-token cost, code never leaves localhost, and every edit is its own revertible commit.
Install fast, then wire it up: python -m pip install aider-install, then aider-install (or uv tool install ... aider-chat@latest), set OLLAMA_API_BASE=http://127.0.0.1:11434, and launch with aider --model ollama_chat/<model>.
Use the ollama_chat/ prefix, not ollama/ — it's the maintainers' recommendation and gives better edits.
Qwen2.5-Coder-14B is the default model pick (89.6% HumanEval, ~9.5 GB); raise num_ctx in .aider.model.settings.yml for real repo work.
Architect mode + the repo map are how a small local model handles big codebases: a reasoning pass plus a precise-edit pass, with a skeleton of the whole repo always in context.
Aider is the terminal/git-native pick; Cline is the VS Code pick; Goose is the standalone-agent pick — all three are free and run on local Ollama.

Next Steps

Verify everything against the source: the Aider GitHub repository and the official Aider + Ollama docs.
New to Ollama itself? Start with our complete Ollama guide, then come back here.
Choosing a model for your GPU? Compare the field in best Ollama model for coding and the 12-16 GB best 14B coding models ranking.
Prefer edits inside your editor? Set up inline completion with Continue.dev + Ollama, or pick a different agent in Cline + Ollama / Goose + Ollama.
Building the whole local stack? See the complete 2026 local AI developer toolchain.

Aider + Ollama Setup (2026): Free Local AI Coding Agent

Written by the Local AI Master Team
