Starter Kit · One-Time Purchase

Ollama Docker Templates

10 one-command Docker Compose stacks for local AI

$19$5one-time · yours to keep

Worth $19 — you pay $5 today.

Buy now — $5

Instant delivery — download from your in-app library right after checkout. Linked to your account; sign in any time to re-download. All sales final.

Overview

Ollama Docker Templates is a set of 13 production-leaning, security-hardened Docker Compose stacks for running local AI with Ollama. Each one stands up a real piece of private AI infrastructure with a couple of commands: a ChatGPT-style chat UI, a working document-Q&A (RAG) API with source citations, an OpenAI-compatible gateway, workflow automation, observability, and more. Every image tag is pinned (no surprise `:latest` breakage), secrets are required not defaulted, services bind to localhost by default, and each stack ships an `.env.example` plus a one-line GPU override.

The real value isn't the Compose files themselves — those are a commodity. The value is what they let you sell: private, on-premise AI for businesses that legally or contractually can't send their data to OpenAI, Anthropic, or Google. Law firms, clinics, accountants, and agencies under NDA all have this problem, and most agencies only know how to plug into the cloud — which is exactly the thing these clients aren't allowed to do.

This kit removes the build barrier. You can stand up a working private RAG chatbot in an afternoon instead of a week. It does not remove the work — discovery, the client's messy documents, hardware sizing, security sign-off, staff training, and support are still yours. There's no passive income here. There is a real, fundable service with very little local competition, and the included MONETIZE.md hands you the pitch, pricing, and a fill-in-the-blanks proposal to sell it.

What's included

  • 13 ready-to-run Docker Compose stacks, each with docker-compose.yml, an .env.example, and (for 12 of them) a one-line docker-compose.gpu.yml NVIDIA override
  • The flagship RAG stack: a real, working 312-line Python/FastAPI document Q&A service (Ollama + ChromaDB) with /ingest, /ask, /stats, /reset endpoints that returns answers WITH source citations and similarity scores
  • Open WebUI stack — a ChatGPT-style web UI for local models, the fastest path to a working demo
  • LiteLLM gateway stack — an OpenAI-compatible API in front of Ollama so existing client tools/scripts 'call OpenAI' but actually hit local models, with per-team keys and budgets
  • Qdrant RAG stack — a faster, scalable vector-store alternative for larger private knowledge bases
  • SearXNG web-search stack — private, web-augmented chat (a self-hosted Perplexity-style setup)
  • n8n and Flowise stacks for local-LLM workflow automation and visual agent/chain building
  • Langfuse v3 observability stack (web + worker + Postgres + ClickHouse + Redis + MinIO) for tracing and measuring prompt quality
  • Jupyter, VS Code (code-server), LibreChat, Dify-connector, and dual-GPU nginx-load-balanced multi-GPU stacks
  • Helper scripts: list-stacks.sh, start-stack.sh (with --gpu / --build flags), stop-stack.sh (with --volumes)
  • Security defaults baked in: pinned image tags, required secrets via ${VAR:?...}, localhost binding, memory limits + no-new-privileges, RAG API that fails closed if the key is unset
  • docs/SECURITY.md and docs/TROUBLESHOOTING.md, plus current verified-June-2026 model recommendations (qwen3:8b, llama3.3:70b, nomic-embed-text, and more)
  • MONETIZE.md — the full service playbook: the exact pitch language, buyer segments, realistic 2026 pricing table, a copy-paste Statement of Work, six no-cold-outreach client channels, and a delivery/handoff checklist

Who it's for

  • Freelancers and small dev shops who want to sell private, on-prem AI setups but don't want to spend a week building the infra from scratch
  • IT consultants, MSPs, and fractional CTOs already serving regulated clients (law, medical, accounting) who get asked 'what about AI?' and need a compliant answer
  • Developers comfortable with Docker who want a proven, hardened starting point instead of stitching together random GitHub gists
  • Agencies under NDA / no-third-party-AI clauses who need an internal assistant they can actually show clients
  • Solo technical founders validating a private-AI service before committing to a heavier build

Use cases

  • Build a private document Q&A assistant over a client's case files, contracts, SOPs, or handbooks — answers grounded in their docs, with citations, nothing sent to the cloud
  • Deploy a ChatGPT-style team chatbot on a client's own hardware so staff stop secretly pasting sensitive text into public ChatGPT ('shadow AI' is the pain you're fixing)
  • Drop an OpenAI-compatible gateway in front of local models so a client's existing scripts and tools keep working without cloud API bills
  • Add local-LLM steps to a client's automations: summarize inbound email, classify support tickets, draft replies — all on-prem via n8n or Flowise
  • Stand up a paid pilot/proof-of-concept in an afternoon to win the full build
  • Self-host a private Perplexity-style web-augmented chat, or run local AI notebooks / a browser IDE with a local model backend
Turn it into income

Sell 'AI you're allowed to use' to businesses that can't legally touch the cloud

The service

Private, on-premise AI setup and support: a private RAG document-Q&A assistant or team chatbot that runs entirely on the client's own hardware, with no data sent to any third-party AI service. You deliver the build (provision + harden the server, ingest their documents, configure auth and backups, give them a one-page 'no external AI calls' compliance doc, and train their staff), then keep the relationship on a maintenance retainer.

What to charge

Paid pilot / proof-of-concept: $1,500–$3,000. Production private RAG or team chatbot (server, auth, backups, training): $3,500–$8,000 one-time + $200–$800/mo retainer. Multi-stack rollout (chatbot + RAG + gateway + an automation): $8,000–$20,000 + $500–$2,000/mo. Maintenance-only: $150–$600/mo. Hardware is the client’s cost — spec it and pass it through, or mark up 10–20%. Quote fixed price for fixed scope; bill the outcome, not hours.

How to find clients

  • Write the one hyper-specific post that ranks: 'Private ChatGPT alternative for [law firms / dental offices / accountants] that keeps client data in-house.' People in regulated fields literally search this — one honest, niche article beats 100 generic ones
  • Partner with people who already serve the niche — local IT/MSP shops, bookkeepers, fractional CTOs, compliance consultants. They have the clients and get asked 'what about AI?' Offer a 10–20% referral cut or white-label the build. One good MSP relationship can feed you steadily (the single highest-yield channel)
  • Record a 5-minute screen demo of the RAG stack answering questions over a sample 'company handbook' with the browser network tab open showing zero external calls. The 'watch it never call the cloud' moment is the whole sell — post it on YouTube/LinkedIn titled for one niche
  • Speak where they gather: a 20-minute 'AI without the compliance risk' talk at a bar-association lunch, dental study club, chamber-of-commerce event, or regional accounting meetup. You'll be the only one who can say 'and none of it leaves your office'
  • Productize a tripwire: a flat-fee 'Private AI Readiness Assessment' ($300–$750) where you review their documents and constraints and deliver a short recommendation plus a hardware spec. Low-risk for them, near-automatic lead into the full build

The delivery steps

  1. Run discovery: confirm the real compliance constraint that rules out cloud AI, pick the single most valuable use case, get a representative sample of their documents, and size hardware to the model (agree who buys it)
  2. Send a fixed-price Statement of Work (use the included SOW template) — paid pilot first, 50% to start / 50% on delivery, with the maintenance retainer attached
  3. Build the stack: cp .env.example .env, generate every secret with openssl rand -hex 32, pull a current model + embedding model, bring the stack up, and confirm healthchecks are green
  4. Ingest their real documents, spot-check that answers and citations are correct, enable auth, lock down registration, keep Ollama's 11434 off the public internet, and put TLS in front if it's accessed beyond one machine
  5. Configure and test-restore automated backups of the volumes so you’ve proven recovery before handoff
  6. Hand off: deliver the one-page 'no external AI calls' compliance doc, run a 90-minute staff training plus a written quickstart, document how to update models/stack, and get the maintenance agreement signed (or the decline on record)

How to market it

  • Lead with the compliance angle, never the tech. Your headline is 'your client data never leaves the building,' not 'I deploy Docker.' Sell only to people who genuinely need that edge — don't try to out-feature OpenAI
  • Niche hard in your content. A page titled for ONE regulated vertical ('private AI for dental practices') outranks and out-converts generic 'local AI' content because the searcher's intent is exact
  • Use the live-demo proof. A short screen recording showing the network tab with zero outbound calls while it answers questions over private docs is more convincing than any slide deck — repurpose one demo across LinkedIn, YouTube, and your sales page
  • Build referral relationships with adjacent service providers (MSPs, compliance consultants, bookkeepers) — they already have the trust and the clients; you supply the capability and a cut
  • Answer the question publicly where it's already being asked: niche LinkedIn, industry Slack/Discord, and forums where 'can we use AI with client data?' comes up. Be the genuinely helpful expert, not a pitch
  • Sell a low-risk tripwire (Readiness Assessment) as your top-of-funnel offer instead of asking for a big commitment up front — it filters tire-kickers and converts to builds at a high rate

Frequently asked questions

Do I need to be a Docker expert to use this?

You need to be comfortable on a command line and able to run `docker compose up`. Each stack ships with an .env.example, helper scripts, and a README, and the docs include a TROUBLESHOOTING guide. If you can clone a repo and set an environment variable, you can stand up a stack. The harder work — client discovery, document wrangling, and support — is the part you’re actually paid for, and the MONETIZE.md playbook walks you through it.

Is this just GitHub files I could find for free?

The Compose files are a commodity and the README even says so — the value is the hardening (pinned tags, required secrets, localhost binding, fail-closed RAG API), the real working 312-line RAG service with citations, and especially the MONETIZE.md playbook: the exact pitch, realistic 2026 pricing, a fill-in-the-blanks Statement of Work, six no-cold-outreach client channels, and a delivery checklist. You’re buying the assembled, sell-ready package and the business model, not just YAML.

Will I really make money with this, or is it hype?

There’s no passive income and no guarantee — that’s stated plainly in the materials. The kit removes the build barrier so you can deliver a private RAG setup in an afternoon instead of a week, but you still do discovery, scoping, support, and sales. The earning potential is real freelance-rate work ($1,500–$8,000 builds + retainers) selling to clients who genuinely can’t use cloud AI. Whether you earn depends on you finding and serving those clients.

What hardware do my clients need?

It depends on the model. A small model like qwen3:8b runs comfortably on 16 GB RAM; larger models like llama3.3:70b want a GPU or a lot of RAM/VRAM. A ~$1,500 mini-PC runs the 8B-class models well; bigger workloads need a workstation or GPU box ($2,000–$6,000). Part of your job (and the delivery checklist) is sizing hardware to the use case honestly — and hardware is the client’s cost, not yours to float.

Are local models as good as ChatGPT / GPT-5?

No, and you should never claim they are — the kit's honesty guardrails are explicit about this. Local models are strong but not frontier-cloud level, and RAG reduces hallucination and cites sources but doesn't eliminate the need for human review. Your edge is not 'better than OpenAI,' it's 'your data stays home.' Sell to clients who need that edge, show real outputs before they pay, and set expectations in writing.

What exactly do I get when I download it?

A single zip with all 13 hardened stacks (each with docker-compose.yml, .env.example, and 12 with a GPU override), the working FastAPI RAG application code, helper scripts (list/start/stop), docs/SECURITY.md and docs/TROUBLESHOOTING.md, the verified-June-2026 model recommendations, and MONETIZE.md with the full service playbook, pricing, and proposal template. It’s a one-time download — yours to keep and adapt for client work under the MIT license.

Ready to get started?

One-time purchase · instant download · yours to keep.

Buy now — $5

After you buy

Purchases are linked to your account — sign in and head to your product library to download anytime. Bought without an account? Check your email for the download link and a one-click way to set a password.

← Back to all kits, tools & codebases
Free Tools & Calculators