
Aider + Ollama Setup (2026): Free Local AI Coding Agent

June 20, 2026
12 min read
Local AI Master Research Team


To run Aider fully local on Ollama, install Aider (the quickest way is python -m pip install aider-install then aider-install), pull a coding model with ollama pull qwen2.5-coder:14b, set OLLAMA_API_BASE=http://127.0.0.1:11434, then launch aider --model ollama_chat/qwen2.5-coder:14b inside a git repo. That gives you a free, private, git-native pair programmer in your terminal — no API key, no per-token bill, and every edit auto-committed so you can undo anything. Aider is the most mature terminal coding agent (Apache 2.0, ~41k GitHub stars), and unlike IDE-bound tools it works the same whether you use VS Code, Vim, or no editor at all.

This guide covers the exact install, the one easy-to-miss detail (use the ollama_chat/ prefix, not ollama/), which Ollama models actually code well, how Aider's architect/editor split and repo-map work, and how it compares to Cline and Goose.

Why Aider + Ollama instead of a cloud coding agent?

Aider is a command-line AI pair programmer that edits files in your local git repository. The Ollama pairing matters for three concrete reasons:

  • It is free and stays free. Cloud agents meter you per token; a local model on Ollama costs nothing per request after the one-time download. For a tool you leave running all day, that difference compounds fast.
  • Your code never leaves the machine. Aider sends file contents, a repo map, and chat history to the model. With Ollama, "the model" is a process on localhost — nothing goes to a third-party API. For proprietary or client code under NDA, that is the whole point.
  • It is git-native. Whenever Aider edits a file it commits the change with a descriptive message, so every AI edit is its own reviewable, revertible commit. You get a clean audit trail instead of a mystery diff.

The honest trade-off: a 7B-14B local model is not GPT-5 or Claude. It is genuinely good at focused edits, refactors, and boilerplate, and noticeably weaker than frontier cloud models on sprawling multi-file architecture. The architect/editor split below is how you close some of that gap. If you want the broader picture of building an all-local stack, see our complete 2026 local AI developer toolchain.


How do you install Aider and connect it to Ollama?

You need two things running: Ollama (serving a model) and Aider (the agent). Assuming you already have Ollama installed, here is the full path.

1. Install Aider. The maintainers' quickest method is the bootstrap installer:

python -m pip install aider-install
aider-install

If you prefer a clean, isolated install (recommended on a dev box so Aider's dependencies don't collide with your project's), use uv. If uv is not installed yet, run python -m pip install uv first, then:

uv tool install --force --python python3.12 --with pip aider-chat@latest

2. Pull a coding model with Ollama:

ollama pull qwen2.5-coder:14b

3. Point Aider at your local Ollama server:

export OLLAMA_API_BASE=http://127.0.0.1:11434   # Mac/Linux
# Windows (PowerShell/CMD): setx OLLAMA_API_BASE http://127.0.0.1:11434  (then restart the shell)

4. Launch Aider in a git repo:

cd your-project
aider --model ollama_chat/qwen2.5-coder:14b

That's it. Aider drops you into a chat prompt; describe a change in plain English and it edits the files and commits the result.
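Before launching, it can help to confirm that the Ollama server is reachable and that the model from step 2 is actually pulled. The following is a minimal Python sketch, assuming Ollama's /api/tags endpoint (which lists locally available models) and the default address from step 3. The helper names are hypothetical and not part of Aider or Ollama.

```python
import json
import urllib.request

OLLAMA_API_BASE = "http://127.0.0.1:11434"  # same value as the env var in step 3


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from the JSON body that /api/tags returns."""
    return {m["name"] for m in tags_payload.get("models", [])}


def fetch_tags(base_url: str = OLLAMA_API_BASE, timeout: float = 5.0) -> dict:
    """Ask the local Ollama server which models are pulled (server must be running)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def preflight(model_tag: str, tags_payload: dict) -> str:
    """Return a short status line for the given Ollama model tag."""
    if model_tag in installed_models(tags_payload):
        return f"ok: {model_tag} is available"
    return f"missing: run `ollama pull {model_tag}` first"
```

With the server running, print(preflight("qwen2.5-coder:14b", fetch_tags())) reports whether the pull completed. Keeping the parsing separate from the network call makes the check easy to test without a live server.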

Use the ollama_chat/ prefix, not ollama/

This is the one detail people get wrong. Aider's docs explicitly recommend ollama_chat/<model> over ollama/<model> — the chat endpoint produces better results with Aider's prompting. So it is --model ollama_chat/qwen2.5-coder:14b, not --model ollama/qwen2.5-coder:14b.
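If you launch Aider from scripts, a small guard can keep the wrong prefix from slipping in. This is a sketch only; to_ollama_chat is a hypothetical helper name, and it assumes the input is either a bare Ollama tag or an already-prefixed Aider model name.

```python
def to_ollama_chat(model: str) -> str:
    """Normalize a model name to the ollama_chat/ prefix recommended for Aider."""
    if model.startswith("ollama_chat/"):
        return model  # already correct
    if model.startswith("ollama/"):
        return "ollama_chat/" + model[len("ollama/"):]  # swap the prefix
    return "ollama_chat/" + model  # bare Ollama tag, e.g. "qwen2.5-coder:14b"


print(to_ollama_chat("ollama/qwen2.5-coder:14b"))  # ollama_chat/qwen2.5-coder:14b
```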

Fix the context window (the silent quality killer)

By default Aider sizes Ollama's context window to fit each request plus about 8k tokens for the reply, which is fine for small edits. For real repo work you want a larger, fixed window. Create a .aider.model.settings.yml in your project root:

- name: ollama_chat/qwen2.5-coder:14b
  extra_params:
    num_ctx: 32768

Bump num_ctx as high as your VRAM allows — a bigger window lets Aider hold more of the repo map and more files in chat at once, which is where most of the quality comes from. Not sure what fits your card? Run the numbers through our VRAM calculator before you set it too high.
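A rough way to see why num_ctx costs VRAM: the key/value cache grows linearly with the context length. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes the standard transformer KV-cache formula, an fp16 cache (2 bytes per element), and architecture numbers for Qwen2.5-14B (48 layers, 8 KV heads, head dimension 128) that should be treated as assumptions here. Real usage also depends on Ollama's runtime overhead and on any KV-cache quantization you enable.

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV-cache size in GiB for a given context length."""
    # Factor 2 accounts for storing both keys and values in every layer.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx
    return total_bytes / 2**30


# Assumed architecture values for Qwen2.5-14B (not stated in this article).
QWEN25_14B = dict(n_layers=48, n_kv_heads=8, head_dim=128)

for n_ctx in (8192, 16384, 32768):
    print(n_ctx, kv_cache_gib(n_ctx=n_ctx, **QWEN25_14B))
# 8192 1.5
# 16384 3.0
# 32768 6.0
```

Under these assumptions, each doubling of num_ctx doubles the cache, and that memory comes on top of the model weights themselves.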

Which Ollama models work best with Aider?

Aider works with any Ollama model, but edit quality varies a lot. These are the local models worth running, all verified against their official model cards:

| Model | Ollama tag | Size | HumanEval | VRAM (Q4_K_M) | Best for |
| --- | --- | --- | --- | --- | --- |
| Qwen2.5-Coder-14B-Instruct | qwen2.5-coder:14b | 14.7B dense | 89.6% | ~9.5 GB | Best balance for one 12-16 GB GPU |
| Qwen2.5-Coder-32B-Instruct | qwen2.5-coder:32b | 32B dense | 92.7% | ~19 GB | Highest quality if you have 24 GB |
| Qwen3-Coder-30B-A3B-Instruct | (community GGUF) | 30.5B MoE / 3.3B active | agentic-focused | ~18 GB | Long-context agentic coding (256K ctx) |
| DeepSeek-Coder-V2-Lite-Instruct | deepseek-coder-v2:16b | 16B MoE / 2.4B active | ~81% (vendor) | ~10.5 GB | Fast first-token, 128K ctx |
| Qwen2.5-Coder-7B-Instruct | qwen2.5-coder:7b | 7.6B dense | 88.4% | ~5 GB | Small GPUs / laptops |

A few honest notes. Qwen2.5-Coder-14B is the default recommendation — 89.6% HumanEval at roughly 9.5 GB makes it the strongest model that comfortably fits a single 12 GB or 16 GB GPU, and it was trained for fill-in-the-middle so it edits cleanly. Step up to the 32B (92.7% HumanEval) only if you have a 24 GB card. Qwen3-Coder-30B-A3B is a Mixture-of-Experts model (30.5B total, ~3.3B active per token) tuned specifically for agentic coding with native 256K context — promising for Aider's longer sessions, but at launch it ships mainly as community GGUF quants rather than an official Ollama-library tag, so confirm the quant before relying on it. DeepSeek-Coder-V2-Lite is the speed pick: its 16B MoE activates only 2.4B params per token, so first tokens come back fast. (Its 236B big sibling hits 90.2% HumanEval but needs a server, not a desktop.) For the full cross-size leaderboard, see our guide to the best Ollama model for coding, and if you're choosing within the 12-16 GB bracket, our best 14B coding models breakdown ranks them by HumanEval and VRAM.
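The selection logic above can be sketched as a tiny lookup. The HumanEval and VRAM figures are copied from the table in this section; the 2 GB headroom for context is an illustrative assumption, and pick_model is a hypothetical helper. Qwen3-Coder-30B-A3B is left out because the table gives no HumanEval number for it.

```python
from typing import Optional

# (Ollama tag, HumanEval %, approx. VRAM in GB at Q4_K_M) from the table in this section
MODELS = [
    ("qwen2.5-coder:7b", 88.4, 5.0),
    ("qwen2.5-coder:14b", 89.6, 9.5),
    ("deepseek-coder-v2:16b", 81.0, 10.5),
    ("qwen2.5-coder:32b", 92.7, 19.0),
]


def pick_model(vram_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the highest-HumanEval tag whose weights plus headroom fit in VRAM."""
    fitting = [m for m in MODELS if m[2] + headroom_gb <= vram_gb]
    if not fitting:
        return None  # nothing fits entirely on the GPU
    return max(fitting, key=lambda m: m[1])[0]


for vram in (8, 12, 16, 24):
    print(vram, pick_model(vram))
# 8 qwen2.5-coder:7b
# 12 qwen2.5-coder:14b
# 16 qwen2.5-coder:14b
# 24 qwen2.5-coder:32b
```

Ranking by HumanEval alone ignores speed, so treat the result as a starting point and adjust the headroom to match the num_ctx you plan to use.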

What is the architect/editor split, and why use it?

Aider has several chat modes you switch between mid-session: code (default, makes edits), ask (discuss without editing), architect, and help. The architect mode is the one that meaningfully improves results with local models.

In architect mode Aider uses two models in a two-pass design: an architect model reasons about the change and writes a plan in prose, then an editor model translates that plan into precise file edits in Aider's diff format. Splitting "think about the problem" from "produce a perfectly formatted diff" helps, because smaller local models often struggle to do both at once. Launch it with the --architect flag (or --chat-mode architect), and you can set a separate editor model:

aider --architect \
  --model ollama_chat/qwen2.5-coder:32b \
  --editor-model ollama_chat/qwen2.5-coder:14b

A practical pattern: stay in ask mode while you and the model agree on a plan, then say "go ahead" to execute — or reach for architect mode whenever a change touches more than a couple of files. You switch on the fly with the /code, /ask, /architect, and /help slash commands.
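The "more than a couple of files" rule of thumb can be encoded in a small launcher. This sketch only builds the command line from flags already shown in this section (--model, --architect, --editor-model) plus file paths as positional arguments. The function name and the threshold of 2 are assumptions for illustration.

```python
def aider_argv(files, model, editor_model=None, architect_threshold=2):
    """Build an aider command, switching to architect mode for wider changes."""
    argv = ["aider", "--model", model]
    if len(files) > architect_threshold:
        argv.append("--architect")
        if editor_model:
            # A separate, usually smaller, model turns the plan into edits.
            argv += ["--editor-model", editor_model]
    return argv + list(files)


cmd = aider_argv(
    ["api/users.py", "api/models.py", "tests/test_users.py"],
    model="ollama_chat/qwen2.5-coder:32b",
    editor_model="ollama_chat/qwen2.5-coder:14b",
)
print(" ".join(cmd))
```

Pass the resulting list to subprocess.run from inside your git repo to start the session.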


How does the repo map work?

The repo map is what lets a small local model punch above its weight on a large codebase. Aider always sends the model three things: the files you've explicitly added to the chat (full contents), a repo map (a concise skeleton of every other file — the key classes and functions with their signatures), and the conversation history. The model can see how a file relates to the rest of the project without you pasting the entire repo into context.

That makes /add the most useful slash command: it pulls specific files into the chat so the model can edit them directly. The everyday command set:

| Command | What it does |
| --- | --- |
| /add path/to/file.py | Add a file (or /add src/*.py, or a directory) to the chat for editing |
| /drop file.py | Remove a file from the chat once it's done |
| /ask <question> | Ask about the code without making edits |
| /run <cmd> | Run a shell command and optionally add its output to the chat |
| /diff | Show the diff of changes since the last message |
| /undo | Undo Aider's last commit |
| /clear | Discard the chat history for a fresh start |

Because each edit is its own git commit, /undo is genuinely safe — it just reverts the last commit Aider made. If you've used Continue.dev with Ollama for inline autocomplete, think of Aider as the complementary tool: Continue lives in your editor for completions, Aider lives in the terminal for whole-task, multi-file edits.
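To make the repo-map idea from this section concrete, here is a toy skeleton extractor for a single Python file. It is an illustration of the concept only and not how Aider builds its map; it uses Python's ast module, handles only plain (non-async) functions, and shows positional parameter names without defaults or type hints.

```python
import ast


def _signature(node: ast.FunctionDef) -> str:
    """Render 'def name(arg1, arg2)' from positional parameter names only."""
    params = ", ".join(a.arg for a in node.args.args)
    return f"def {node.name}({params})"


def skeleton(source: str) -> list[str]:
    """List top-level classes and functions (plus methods) without their bodies."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            lines.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            lines.extend(
                "    " + _signature(m)
                for m in node.body
                if isinstance(m, ast.FunctionDef)
            )
    return lines


SAMPLE = '''
class User:
    def save(self, db):
        db.write(self)

def create_user(payload, notify=True):
    return User()
'''

print("\n".join(skeleton(SAMPLE)))
# class User
#     def save(self, db)
# def create_user(payload, notify)
```

A skeleton like this is far smaller than the file it summarizes, which is why a map of the whole repo can fit in a local model's context alongside the files you /add.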

Aider vs Cline vs Goose: which local agent fits you?

All three are free, open, and run on local Ollama models, but they live in different places and suit different workflows. Verified facts on each:

| Tool | Where it runs | License | Model wiring | Best when |
| --- | --- | --- | --- | --- |
| Aider | Terminal (CLI) | Apache 2.0 | --model ollama_chat/<model> + OLLAMA_API_BASE | You want git-native, terminal-first, editor-agnostic edits |
| Cline | VS Code extension | Apache 2.0 | Ollama provider in settings | You live in VS Code and want approve-every-step autonomy + browser/MCP tools |
| Goose | Desktop app + CLI | Apache 2.0 | Ollama provider + MCP extensions | You want a standalone agent (not tied to an editor) with heavy MCP tooling |

The short version: pick Aider if your workflow is terminal- and git-centric and you want the most mature, lowest-overhead option. Pick Cline if you live inside VS Code and want a human-in-the-loop agent that asks before every file change — see our Cline + Ollama setup. Pick Goose if you want a standalone desktop/CLI agent with a large MCP extension ecosystem — covered in Goose + Ollama. None of them locks you in; many developers keep Aider for fast terminal edits and a second tool for editor-integrated work.

A real terminal session (what it actually looks like)

Here is a representative session against a Python project, edited for length:

$ aider --model ollama_chat/qwen2.5-coder:14b
Aider v0.x | Model: ollama_chat/qwen2.5-coder:14b | Git repo: .git
> /add api/users.py
Added api/users.py to the chat
> add input validation to create_user and return 422 on bad email
Editing api/users.py
  - imports email-validator, validates payload.email
  - raises HTTP 422 with a clear message on failure
Committed: a3f10c2  feat: validate email in create_user, return 422
> /diff
(shows the committed diff)
> /undo
Removed last commit a3f10c2

Notice the loop: add the file, describe the change, Aider edits and commits, you inspect with /diff, and /undo reverts cleanly because it's all git underneath. No copy-paste, no leaving the terminal.

First-hand notes on speed and VRAM

Approximate, from a single machine — treat as ballpark, not a benchmark. On an RTX 3090 (24 GB) running qwen2.5-coder:14b at Q4_K_M, I saw roughly 35-45 tokens/sec generating edits, with the whole model GPU-offloaded; the 32B at the same quant dropped to about 18-22 tokens/sec but produced cleaner first-try diffs on multi-file refactors. The DeepSeek-Coder-V2-Lite MoE felt the snappiest on first-token latency, as expected from its 2.4B active params. The practical lesson held every time: the moment any layer spills from VRAM into system RAM, throughput collapses — keep the entire model on the GPU, and raise num_ctx only as far as your remaining VRAM allows. For a 12 GB card, the 14B at Q4 plus a 16K-32K context is the sweet spot; below that, drop to qwen2.5-coder:7b.
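To translate those throughput ranges into waiting time, a quick calculation helps. The tokens-per-second figures are the ballpark numbers quoted above; the 600-token edit size is a hypothetical value chosen only for illustration.

```python
def reply_seconds(n_tokens, tok_per_sec_low, tok_per_sec_high):
    """Return (best_case, worst_case) seconds to generate n_tokens."""
    best = round(n_tokens / tok_per_sec_high, 1)
    worst = round(n_tokens / tok_per_sec_low, 1)
    return (best, worst)


EDIT_TOKENS = 600  # hypothetical size of one generated edit

print("14B:", reply_seconds(EDIT_TOKENS, 35, 45))  # 14B: (13.3, 17.1)
print("32B:", reply_seconds(EDIT_TOKENS, 18, 22))  # 32B: (27.3, 33.3)
```

On these numbers the 32B takes roughly twice as long per edit, so its cleaner first-try diffs only pay off when they save you a retry.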

Key Takeaways

  1. Aider + Ollama = a free, private, git-native coding agent in your terminal. No API key, no per-token cost, code never leaves localhost, and every edit is its own revertible commit.
  2. Install fast, then wire it up: python -m pip install aider-install && aider-install (or uv tool install ... aider-chat@latest), set OLLAMA_API_BASE=http://127.0.0.1:11434, and launch with aider --model ollama_chat/<model> inside a git repo.
  3. Use the ollama_chat/ prefix, not ollama/ — it's the maintainers' recommendation and gives better edits.
  4. Qwen2.5-Coder-14B is the default model pick (89.6% HumanEval, ~9.5 GB); raise num_ctx in .aider.model.settings.yml for real repo work.
  5. Architect mode + the repo map are how a small local model handles big codebases: a reasoning pass plus a precise-edit pass, with a skeleton of the whole repo always in context.
  6. Aider is the terminal/git-native pick; Cline is the VS Code pick; Goose is the standalone-agent pick — all three are free and run on local Ollama.


Local AI Master Research Team

Creator of Local AI Master. I've built datasets with over 77,000 examples and trained AI models from scratch. Now I help people achieve AI independence through local AI mastery.


Published: June 20, 2026 | Last Updated: June 20, 2026 | Manually Reviewed
