n8n + Ollama: Self-Hosted AI Automation Guide
n8n + Ollama: Build AI Automation Without Paying for APIs
Published on April 10, 2026 — 20 min read
Most AI automation advice starts with "connect to the OpenAI API" and ends with a $500/month bill. There is a better path: run n8n (an open-source workflow engine) alongside Ollama (a local model runtime) on the same machine, and you get unlimited AI automation for the cost of electricity.
I replaced a Zapier Pro + OpenAI API setup that cost $170/month with n8n + Ollama running on a $300 used workstation. Same workflows. Same results. Zero recurring cost. The switch took an afternoon.
This guide shows you how to set up the stack and build three real workflows that handle actual work.
What you will build:
- n8n + Ollama running together in Docker
- Workflow 1: Automatic email summarizer (IMAP trigger → Ollama → Slack)
- Workflow 2: Document processor (webhook → file parse → Ollama → database)
- Workflow 3: Customer support chatbot (webhook → Ollama with context → response)
- Cost comparison and migration strategy from cloud tools
Prerequisites:
- A machine with Docker installed (Linux, macOS, or Windows with WSL2)
- 8GB+ RAM (16GB recommended for running 7B models alongside n8n)
- Basic understanding of REST APIs and JSON
For local AI model setup, see the free local AI models guide. For AI agent patterns that pair well with n8n, check the AI agents local guide.
Table of Contents
- What is n8n and Why Pair It with Ollama
- Docker Setup for n8n + Ollama
- Connecting n8n to Ollama
- Workflow 1: Email Summarizer
- Workflow 2: Document Processor
- Workflow 3: Support Chatbot
- Triggers, Scheduling, and Webhooks
- Cost Comparison: n8n + Ollama vs Cloud
- Performance and Limitations
- Troubleshooting
What is n8n and Why Pair It with Ollama {#what-is-n8n}
n8n is an open-source workflow automation platform. Think Zapier or Make.com, but you host it yourself and there are no per-execution limits. It has a visual editor where you drag and drop nodes — triggers, actions, conditions, loops — to build automation workflows without writing code.
n8n ships with 400+ integrations: Gmail, Slack, PostgreSQL, HTTP webhooks, cron schedules, Google Sheets, Notion, and hundreds more. What makes it powerful for AI automation is its native Ollama node, added in n8n v1.25.
Why this combination works:
| Problem with cloud AI automation | n8n + Ollama solution |
|---|---|
| OpenAI API costs $0.01-0.06 per 1K tokens | Local models: $0 per token |
| Zapier Pro: $49-89/mo for 2,000-5,000 tasks | n8n self-hosted: unlimited tasks |
| Data sent to third-party servers | All data stays on your machine |
| Rate limits during peak usage | No rate limits except your hardware |
| API key management and rotation | No API keys needed |
The tradeoff: local models are slower than GPT-4o and less capable at complex reasoning. For 80% of automation tasks — summarization, classification, extraction, reformatting — a local 7B or 13B model handles it fine. The 20% where you genuinely need GPT-4-level intelligence can still use a cloud API through n8n's OpenAI node.
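That 80/20 split can be implemented directly in a workflow: a Code node tags each item with a target model, and an IF node branches to the Ollama node or the OpenAI node. The task categories and the `routeTask` helper below are illustrative sketches, not part of n8n:

```javascript
// Sketch: decide per item whether a local model is enough or a cloud
// model is warranted. The task types here are illustrative.
const LOCAL_TASKS = new Set(['summarize', 'classify', 'extract', 'reformat']);

function routeTask(task) {
  // Default to the cloud only for task types a 7B-13B model handles poorly.
  return LOCAL_TASKS.has(task.type) ? 'local' : 'cloud';
}

// In an n8n Code node you would return one item per input, tagged with
// the route, then branch on {{ $json.route }} in an IF node:
const items = [
  { type: 'summarize', text: 'Long email...' },
  { type: 'multi_step_reasoning', text: 'Plan a migration...' },
];
const routed = items.map(t => ({ json: { ...t, route: routeTask(t) } }));
console.log(routed.map(i => i.json.route)); // [ 'local', 'cloud' ]
```

The point is that "local by default, cloud by exception" becomes a one-node decision rather than a separate workflow.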
Docker Setup for n8n + Ollama {#docker-setup}
Docker Compose File
mkdir -p ~/n8n-ollama && cd ~/n8n-ollama
# docker-compose.yml
version: "3.8"

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - OLLAMA_NUM_PARALLEL=2
      - OLLAMA_FLASH_ATTENTION=1

  n8n:
    image: docker.n8n.io/n8nio/n8n:latest
    container_name: n8n
    restart: unless-stopped
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    environment:
      - N8N_HOST=0.0.0.0
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
      - WEBHOOK_URL=http://localhost:5678
      - N8N_DIAGNOSTICS_ENABLED=false
      - N8N_HIRING_BANNER_ENABLED=false
    depends_on:
      - ollama

volumes:
  ollama_data:
  n8n_data:
Launch and Pull Models
# Start both services
docker compose up -d
# Wait 10 seconds for Ollama to initialize, then pull models
docker exec ollama ollama pull llama3.2
docker exec ollama ollama pull qwen3:8b
# Verify both are running
docker compose ps
# n8n is at http://localhost:5678
# Ollama API is at http://localhost:11434
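n8n's Ollama node talks to the same REST API you can also call yourself, for example from an HTTP Request node. The endpoint and fields below (POST /api/generate with model, prompt, stream) come from Ollama's API; the helper function name is my own:

```javascript
// Build the request an HTTP Request node (or curl) would send to Ollama.
function buildGenerateRequest(baseUrl, model, prompt) {
  return {
    url: `${baseUrl}/api/generate`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // stream: false makes Ollama return one JSON object instead of
    // newline-delimited chunks, which is easier to handle in a workflow.
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

const req = buildGenerateRequest('http://ollama:11434', 'llama3.2', 'Say hello');
console.log(req.url); // http://ollama:11434/api/generate
```

Knowing the raw request shape also helps when debugging: you can replay the exact same call with curl from inside the n8n container.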
CPU-Only Setup
If you do not have an NVIDIA GPU, remove the deploy block from the Ollama service. Pull a smaller model:
docker exec ollama ollama pull phi3:mini
docker exec ollama ollama pull gemma:2b
Expect ~5-10 tokens/second on CPU for a 3B model. That is fast enough for background automation tasks where response time is not critical.
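To judge whether CPU speed is acceptable for your use case, the arithmetic is simple: generation time is roughly output tokens divided by tokens per second (ignoring prompt processing, which adds a bit more). A quick sketch:

```javascript
// Back-of-envelope latency estimate for generation time only.
function secondsPerItem(outputTokens, tokensPerSecond) {
  return outputTokens / tokensPerSecond;
}

// A ~100-token email summary at 7 tok/s on CPU:
console.log(secondsPerItem(100, 7).toFixed(1)); // 14.3
```

Fifteen seconds per email is fine for a background summarizer; it would be painful for an interactive chatbot.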
Connecting n8n to Ollama {#connecting}
Step 1: Create Ollama Credentials in n8n
- Open n8n at http://localhost:5678
- Create your admin account (first-time setup)
- Go to Settings → Credentials → Add Credential
- Search for Ollama
- Set the Base URL to http://ollama:11434, using ollama (the Docker service name), not localhost
- Click Save
Step 2: Test the Connection
Create a quick test workflow:
- Click Add Workflow → Add first step
- Add a Manual Trigger node (just to test)
- Add an Ollama Chat Model node
- Configure it:
  - Credential: select the Ollama credential you created
  - Model: llama3.2
- Add a Basic LLM Chain node
- Connect: Manual Trigger → Basic LLM Chain (with Ollama Chat Model as the AI model)
- Set the prompt to: Summarize in one sentence: The quick brown fox jumped over the lazy dog.
- Click Test Workflow
If you get a response, the connection works. If not, check the troubleshooting section.
Workflow 1: Email Summarizer {#email-summarizer}
This workflow checks your inbox every 5 minutes, summarizes new emails with Ollama, and posts the summaries to Slack. Actual time savings: ~15 minutes per day if you get 30+ emails.
Workflow Structure
[IMAP Trigger] → [Filter: skip newsletters] → [Ollama: summarize] → [Slack: post to channel]
Node Configuration
1. IMAP Email Trigger
- Mailbox: INBOX
- Poll interval: 5 minutes
- Credential: Your email (Gmail, Outlook, or any IMAP server)
2. IF Node (Filter)
- Condition: {{ $json.from }} does not contain "newsletter" AND does not contain "noreply"
- This skips marketing emails and only processes real messages
3. Ollama Chat Model + Basic LLM Chain
- Model: llama3.2 (fast enough for summarization)
- System prompt:
You are an email summarizer. For each email, produce:
1. SENDER: who sent it
2. URGENCY: high/medium/low
3. SUMMARY: 2-3 sentences max
4. ACTION NEEDED: yes/no, and what action
Be concise. No filler.
- User prompt:
Summarize this email:\nFrom: {{ $json.from }}\nSubject: {{ $json.subject }}\nBody: {{ $json.text.substring(0, 3000) }}
4. Slack Node
- Channel: #email-summaries
- Message:
*{{ $('IMAP').item.json.subject }}*\n{{ $json.text }}
Performance
With Llama 3.2 3B on an RTX 3060 (12GB), each email takes 3-8 seconds to summarize. A batch of 10 emails processes in under a minute. On CPU, expect 15-30 seconds per email.
Workflow 2: Document Processor {#document-processor}
This workflow accepts PDF uploads via webhook, extracts text, chunks it, sends each chunk to Ollama for analysis, and stores structured output in a database.
Workflow Structure
[Webhook: POST /process] → [Extract PDF text] → [Split into chunks] → [Ollama: extract data] → [PostgreSQL: insert]
Node Configuration
1. Webhook Node
- Method: POST
- Path: /process
- Response mode: Last node (returns result to caller)
2. Extract from File Node
- Operation: Extract text from PDF
- Input: Binary data from webhook
3. Code Node (Text Splitter)
// Split the extracted text into overlapping chunks for the model.
const text = $input.first().json.data;
const chunkSize = 2000;
const overlap = 200;
const chunks = [];
for (let i = 0; i < text.length; i += chunkSize - overlap) {
  chunks.push({
    json: {
      chunk: text.substring(i, i + chunkSize),
      index: chunks.length,
      total: Math.ceil(text.length / (chunkSize - overlap))
    }
  });
}
return chunks;
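You can sanity-check the chunking logic outside n8n with plain Node. This standalone version mirrors the Code node's loop; the sample sizes are arbitrary:

```javascript
// Same chunking logic as the n8n Code node, runnable standalone.
function splitText(text, chunkSize = 2000, overlap = 200) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize - overlap) {
    chunks.push(text.substring(i, i + chunkSize));
  }
  return chunks;
}

const sample = 'x'.repeat(5000);
const chunks = splitText(sample);
console.log(chunks.length);    // 3 chunks for 5,000 chars at 2000/200
console.log(chunks[1].length); // 2000
```

Each chunk starts 1,800 characters after the previous one, so every boundary sentence appears in two chunks and is never cut in half for the model.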
4. Ollama Chat Model + Basic LLM Chain
- Model: qwen3:8b (good at structured extraction)
- System prompt:
Extract structured data from this document chunk. Return JSON:
{
  "entities": ["list of people, companies, products mentioned"],
  "dates": ["any dates found"],
  "amounts": ["any monetary amounts"],
  "key_facts": ["2-3 important facts"],
  "category": "one of: legal, financial, technical, correspondence, other"
}
Return ONLY valid JSON. No explanation.
- User prompt:
{{ $json.chunk }}
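Local models occasionally wrap their output in a code fence or add commentary despite the "ONLY valid JSON" instruction. A defensive Code node between the LLM chain and the PostgreSQL insert can salvage most of these responses. This is a sketch of my own; the parse_error fallback shape is a convention, not an n8n feature:

```javascript
// Parse the model's output defensively before the database insert.
function parseModelJson(raw) {
  // Strip markdown code fences the model may add despite instructions.
  const cleaned = raw.replace(new RegExp('`{3}(?:json)?', 'g'), '').trim();
  // Grab the first {...} block in case the model added commentary.
  const match = cleaned.match(/\{[\s\S]*\}/);
  if (!match) return { parse_error: true, raw };
  try {
    return JSON.parse(match[0]);
  } catch (e) {
    return { parse_error: true, raw };
  }
}

// A fenced response still parses cleanly:
const fenced = '`'.repeat(3) + 'json\n{"category": "legal", "dates": []}\n' + '`'.repeat(3);
const ok = parseModelJson(fenced);
console.log(ok.category); // legal
```

Rows flagged with parse_error can be routed to a retry branch or logged for review instead of corrupting the table.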
5. PostgreSQL Node
- Operation: Insert
- Table: document_extractions
- Columns: chunk_index, entities, dates, amounts, key_facts, category, processed_at
Triggering the Workflow
# Upload a PDF for processing
curl -X POST http://localhost:5678/webhook/process \
-F "file=@contract.pdf"
Workflow 3: Support Chatbot {#support-chatbot}
A webhook-based chatbot that answers questions using your documentation. This is a lightweight RAG setup without a vector database — suitable for small knowledge bases (under 50 pages).
Workflow Structure
[Webhook: POST /chat] → [Load context docs] → [Build prompt] → [Ollama: answer] → [Respond to webhook]
Node Configuration
1. Webhook Node
- Path: /chat
- Method: POST
- Expected body:
{ "question": "How do I reset my password?" }
2. Read Binary Files Node
- Read from: /home/node/.n8n/knowledge-base/
- Pattern: *.txt
- This loads your documentation files as context
3. Code Node (Build Prompt)
const question = $('Webhook').first().json.body.question;
const docs = $input.all().map(item => item.json.data).join('\n---\n');
return [{
  json: {
    // Note the ${...} template-literal interpolation syntax.
    prompt: `Answer the user's question using ONLY the context below. If the context doesn't contain the answer, say "I don't have information about that."
CONTEXT:
${docs.substring(0, 6000)}
QUESTION: ${question}
ANSWER:`
  }
}];
4. Ollama Chat Model + Basic LLM Chain
- Model: llama3.2
- Temperature: 0.3 (lower = more factual, less creative)
- User prompt:
{{ $json.prompt }}
5. Respond to Webhook Node
- Response body:
{ "answer": "{{ $json.text }}" }
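Pasting {{ $json.text }} into a JSON string breaks if the answer contains quotes or newlines, since nothing escapes them. A safer pattern is to build the response as an object in a Code node and let serialization handle escaping; the arrangement below is a suggestion, not the only way:

```javascript
// Build the response as an object; serializing it escapes quotes and
// newlines correctly, with no manual string templating.
function buildAnswer(text) {
  return { answer: text };
}

const body = JSON.stringify(buildAnswer('Our hours are 9-5.\nSay "hi" anytime.'));
// The awkward characters round-trip intact:
console.log(JSON.parse(body).answer.includes('"hi"')); // true
```

In n8n, returning the object from a Code node and pointing Respond to Webhook at it achieves the same thing.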
Testing
# Ask a question
curl -X POST http://localhost:5678/webhook/chat \
-H "Content-Type: application/json" \
-d '{"question": "What are your business hours?"}'
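As the knowledge base grows past what fits in the context window, concatenating every file stops working. Before reaching for a vector database, a simple keyword-overlap score in the prompt-building Code node can select only the most relevant documents. This scoring scheme is a naive sketch of my own, not an n8n feature:

```javascript
// Score each document by how many of the question's words it contains,
// then keep only the top-k for the prompt context.
function topKDocs(question, docs, k = 3) {
  const words = question.toLowerCase().match(/[a-z]{3,}/g) || [];
  return docs
    .map(doc => ({
      doc,
      score: words.filter(w => doc.toLowerCase().includes(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(s => s.doc);
}

const docs = [
  'Password resets are handled on the account page.',
  'Our office is in Berlin.',
  'Billing questions go to billing@example.com.',
];
const top = topKDocs('How do I reset my password?', docs, 1);
console.log(top[0]); // the password-reset document
```

This keeps the prompt small and usually survives to a few hundred pages before real embedding-based retrieval becomes necessary.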
For a more sophisticated RAG setup with vector search and embedding, see the RAG local setup guide.
Triggers, Scheduling, and Webhooks {#triggers}
n8n supports multiple ways to start a workflow:
Cron / Schedule Trigger
Every 5 minutes: */5 * * * *
Every hour: 0 * * * *
Daily at 9 AM: 0 9 * * *
Weekdays at 8 AM: 0 8 * * 1-5
Webhook Trigger
Any external service can POST to your n8n webhook URL. Useful for:
- GitHub push events → AI code review
- Stripe payment events → AI receipt generation
- Form submissions → AI classification and routing
App-Specific Triggers
n8n has native triggers for:
- Gmail / IMAP: New email received
- Slack: New message in channel
- GitHub: Pull request opened, issue created
- Google Sheets: Row added or updated
- Telegram: New message to bot
- RSS: New feed item
Polling Triggers
For services without webhooks, n8n polls on a schedule:
- Check an API every N minutes
- Watch a folder for new files
- Monitor a database table for new rows
See the full integration list at n8n.io/integrations.
Cost Comparison: n8n + Ollama vs Cloud {#cost-comparison}
Real numbers from my migration. These are monthly costs for a small business running 5,000 AI-powered automations per month.
Cloud Stack (Before)
| Service | Monthly Cost |
|---|---|
| Zapier Professional (5,000 tasks) | $89 |
| OpenAI API (~2M tokens/month) | $60-80 |
| Make.com for backup workflows | $16 |
| Total | $165-185/mo |
Self-Hosted Stack (After)
| Item | Monthly Cost |
|---|---|
| Electricity (dedicated workstation) | ~$12 |
| n8n software | $0 (open source) |
| Ollama + models | $0 (open source) |
| Total | ~$12/mo |
Hardware investment: I bought a used Dell Precision T5820 with a Xeon W-2145, 64GB RAM, and an RTX 3060 12GB for $300 on eBay. At the $165/month cloud savings, it paid for itself in under 2 months.
Where Cloud Still Wins
Be honest about the limitations:
- GPT-4o-level reasoning: If your workflow requires complex multi-step reasoning, creative writing, or nuanced understanding, GPT-4o still outperforms local 7B-13B models significantly.
- Zero maintenance: Cloud services handle uptime, scaling, and updates. Self-hosting means you are the ops team.
- First 10 minutes of setup: Zapier + OpenAI works in 10 minutes. This guide takes 30-60 minutes.
My approach: use n8n + Ollama for the 80% of workflows that involve straightforward tasks (summarization, classification, extraction, formatting). Route the remaining 20% through n8n's OpenAI node when you genuinely need it.
Performance and Limitations {#limitations}
Throughput Benchmarks
Tested on an RTX 3060 12GB with Llama 3.2 3B (Q4_K_M):
| Task | Tokens/sec | Time per item | Items/hour |
|---|---|---|---|
| Email summary (300 words in, 100 out) | 32 tok/s | 4-6 seconds | ~700 |
| Document extraction (2K chunk) | 32 tok/s | 8-12 seconds | ~350 |
| Classification (short text) | 32 tok/s | 2-3 seconds | ~1,400 |
| Chatbot response (with context) | 32 tok/s | 6-10 seconds | ~450 |
Bottlenecks
- Model loading: The first request after idle takes 5-15 seconds while the model loads into GPU memory. Set OLLAMA_KEEP_ALIVE=24h to keep models loaded.
- Concurrency: With OLLAMA_NUM_PARALLEL=2, two workflows can process simultaneously. More parallel requests queue. A 24GB GPU can handle OLLAMA_NUM_PARALLEL=4 comfortably.
- Context length: Most local models max out at 8K-32K tokens. If your document chunks are too large, the model truncates or produces garbage at the end. Keep prompts under 4K tokens for consistent results.
- No streaming in workflows: Unlike a chatbot interface, n8n waits for the complete Ollama response before passing it to the next node, so total workflow time includes full generation time.
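A cheap guard against the context-length problem is the common rule of thumb of roughly 4 characters per token for English text. It is a heuristic, not a real tokenizer, but it is good enough for a Code node that caps prompts before they reach Ollama:

```javascript
// Rough token estimate: ~4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Truncate a prompt so it stays under a token budget.
function capToTokens(text, maxTokens) {
  return text.substring(0, maxTokens * 4);
}

const prompt = 'x'.repeat(20000);
console.log(estimateTokens(prompt));                    // 5000
console.log(estimateTokens(capToTokens(prompt, 4000))); // 4000
```

Dropping this in front of the LLM chain turns silent mid-prompt truncation by the model into an explicit, controlled cut.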
Setting OLLAMA_KEEP_ALIVE
# In docker-compose.yml, under ollama environment:
environment:
- OLLAMA_KEEP_ALIVE=24h # Keep model loaded for 24 hours
- OLLAMA_NUM_PARALLEL=2
This eliminates cold-start latency at the cost of keeping GPU memory occupied.
Troubleshooting {#troubleshooting}
n8n cannot find the Ollama credential type
You need n8n v1.25 or newer. Check your version:
docker exec n8n n8n --version
# If below 1.25, update:
docker compose pull n8n
docker compose up -d n8n
"Connection refused" when n8n connects to Ollama
# The Ollama URL in n8n must use the Docker service name, not localhost
# Correct: http://ollama:11434
# Wrong: http://localhost:11434
# Test from inside the n8n container
docker exec n8n curl -s http://ollama:11434/api/version
Ollama returns empty or garbled responses
# Check if the model is actually loaded
docker exec ollama ollama list
# Test the model directly
docker exec ollama ollama run llama3.2 "Say hello"
# If it works directly but not through n8n, the prompt may be too long
# Reduce chunk sizes or use a model with larger context window
n8n workflow times out
Default timeout for HTTP nodes in n8n is 60 seconds. Ollama can take longer for large prompts on CPU.
# Increase n8n timeout
environment:
- N8N_DEFAULT_TIMEOUT=300
High memory usage
# Check what is consuming memory
docker stats
# n8n typically uses 200-500MB
# Ollama uses 4-12GB depending on the loaded model
# If running out of memory, use a smaller model
docker exec ollama ollama pull phi3:mini # Only ~2GB in VRAM
Workflows stop running after restart
Make sure n8n workflows are set to Active (the toggle in the top-right of the workflow editor). Only active workflows run automatically. Manual triggers require you to click "Test" each time.
Migration from Zapier / Make.com
If you are currently using cloud automation tools, here is a practical migration path:
- Audit your workflows: List every Zap or Scenario. Tag each as "simple AI task" or "needs GPT-4."
- Recreate in n8n: Start with the highest-volume simple workflows. n8n has import guides for Zapier workflows.
- Test side-by-side: Run both for a week. Compare output quality.
- Cut over gradually: Disable cloud workflows one at a time as you validate the n8n replacements.
- Keep a cloud fallback: Keep one OpenAI API node in n8n for tasks that genuinely need GPT-4o. Most workflows will not need it.
Building more complex AI agents? See the AI agents local guide for multi-step reasoning patterns. For a visual flow builder alternative, check out Flowise + Ollama.