Starter Kit · One-Time Purchase
AI Agent Starter Kit
Ready-to-adapt AI agents with native tool calling
Worth $49 — you pay $19 today.
Buy now — $19Instant delivery — download from your in-app library right after checkout. Linked to your account; sign in any time to re-download. All sales final.
Overview
The AI Agent Starter Kit is a working Python codebase for building advanced AI agents that run 100% locally on Ollama — no API keys, no cloud, no data ever leaving the machine. It ships with five runnable agents (Research, Code Review, Data Analysis, Knowledge Base/RAG, and a multi-agent Orchestrator), a reusable core runtime that does native Ollama tool-calling, conversation memory, structured JSON output, a multi-step planner, and multi-agent handoff — plus Model Context Protocol (MCP) support so you can bridge in external tools. It's tuned for current 2026 models (Qwen3, Qwen3-Coder, Llama 3.3) and changes models with a single .env line.
Why it's valuable: building a reliable tool-calling, RAG-capable local agent from scratch is roughly two weeks of fiddly work — getting tool_call_id threading right, keeping memory from blowing the context window, making retrieval cite sources and refuse instead of hallucinate. This kit hands you that foundation working on day one, so you start at the part that actually pays: wiring an agent to one business's documents and workflow.
Be clear-eyed: this is a head start, not a money machine. Nobody pays for a git clone. The kit removes the build barrier — you still do the consulting work of understanding a client, indexing their data, validating answers, and supporting it. That work is exactly what's worth $3k–$8k per install plus monthly retainers, and the included MONETIZE.md spells out how to package and sell it.
What's included
- Five runnable agent CLIs you can use today: research_agent (plans, web-searches, fetches full pages, writes a cited report), code_review_agent (AST analysis + severity-tagged review using qwen3-coder), data_analysis_agent (loads CSV/JSON, runs pandas, does math, saves charts), knowledge_agent (private RAG over a local folder with citations, fully offline), and orchestrator_agent (delegates sub-tasks to the specialists via handoff)
- core/agent.py — a reusable native tool-calling loop (uses Ollama's tool API directly, not brittle ReAct text-parsing) with correct tool_call_id/name result threading, a plan_first multi-step planner, and agent.register_handoff() for multi-agent delegation
- core/memory.py — bounded, self-summarizing conversation memory so long runs don’t blow the context window (MEMORY_MAX_MESSAGES configurable)
- core/ollama_client.py — a thin, dependency-light Ollama HTTP client (works with plain requests; official ollama SDK optional)
- core/mcp_client.py — a Model Context Protocol bridge that turns any MCP server's tools (filesystem, GitHub, Postgres, web, …) into callable agent tools, namespaced server__tool; strictly additive and skipped cleanly if mcp isn't installed
- Eight pluggable BaseTool implementations in tools/: web_search, web_fetch, file_reader, code_analyzer, data_tools, knowledge_base (local embeddings + on-disk JSON index, no vector DB to run), and python_exec — plus a base.py so you can add a custom client-specific tool in a few lines
- Private RAG that runs with zero infrastructure: local embeddings via nomic-embed-text into a plain on-disk JSON index — no Pinecone, Chroma, or separate vector database to host
- Structured output support — pass a JSON schema and get validated JSON back, for building reliable downstream automations
- config/ with settings.py and prompts.py, plus .env.example — swap models (OLLAMA_MODEL, OLLAMA_CODE_MODEL, OLLAMA_EMBED_MODEL), iteration limits, and temperature with no code changes
- Bundled example data (examples/sales.csv and examples/docs/) so the data and knowledge agents run out of the box for an instant demo
- requirements.txt with current, version-floored deps verified for Python 3.10–3.14 (requests, python-dotenv, ddgs, pandas, matplotlib) and a one-command quickstart in the README
- MONETIZE.md (10KB) — the full playbook: the exact service to sell, realistic 2026 freelance price bands, a copy-and-send Statement of Work, six no-cold-outreach lead sources, and a delivery/validation checklist
- MIT license — use it in client work and commercial projects without restriction
Who it's for
- Freelancers and consultants who want to sell private, offline AI to privacy-bound businesses (law, medical, accounting, defense, education) without building an agent framework first
- Python developers who can run Ollama and read code, but don't want to spend two weeks getting tool-calling, memory, and RAG working reliably
- MSPs and IT shops that already have SMB client relationships but don’t yet offer AI — this is the technical backbone for an add-on service line
- Indie hackers and small agencies building internal automations or productized 'chat with your documents' offerings on local models
- Learners who want a real, non-toy agent codebase to study — native tool calling, planners, and multi-agent handoff done correctly
Use cases
- Install a private 'chat-with-your-documents' assistant on a client's own PC that answers staff questions from their policies, contracts, SOPs, and manuals — with citations and an explicit 'not in the documents' refusal, fully offline
- Build a document/data triage tool that summarizes spreadsheets, flags anomalies, and drafts the weekly numbers email for a small business
- Stand up an internal research and drafting assistant (web disabled for air-gapped clients) for first-pass summaries, intake forms, and response drafts
- Add an automated code-review pass to a dev team's workflow using qwen3-coder, with AST analysis and severity tags
- Bridge MCP servers (filesystem, GitHub, Postgres) into an agent to automate a specific internal workflow for a client
- Prototype your own custom agent in ~5 lines by composing the core Agent with the included tools, then add one client-specific BaseTool as the billable custom work
- Run a live 90-second offline demo on a folder of a prospect's sample documents to close consulting deals
Sell 'private AI that never leaves the building' to firms locked out of cloud AI
The service
A private, offline AI document assistant installed on hardware the client owns. You spec/confirm hardware, install Ollama + this kit, index their documents, tune the prompts and one or two custom tools to their workflow, validate against a 10-question acceptance test, train staff, deliver a runbook — then sell a recurring care plan for re-indexing, updates, and support. The wedge is businesses that legally or contractually cannot send data to ChatGPT/Claude/Gemini: law firms, clinics, accountants, defense subcontractors, schools, NDA-bound agencies, GDPR-nervous EU firms.
What to charge
Paid pilot/proof-of-concept $750–$2,000 (filters tire-kickers, credits toward the install). Standard single-machine install $3,000–$8,000. Larger/multi-seat $8,000–$25,000. The real prize is the recurring care plan at $200–$1,500/month. Ad-hoc/scoping $90–$200/hr. Start at the low end of each band and raise after 2–3 wins. Honest note: these are normal SMB IT-consulting rates, not AI-hype numbers — and it's freelance work you perform, not passive income.
How to find clients
- Tell ten people in your existing network exactly what you now do — you almost certainly already know an accountant, lawyer, clinic manager, or someone at an agency, and one warm intro beats 500 cold emails
- Pick one vertical (e.g. small law firms in your city) and publish one genuinely useful piece — 'Can a law firm use AI without breaking confidentiality? Yes, here's how' — where that vertical already reads
- Run a free 20-minute 'lunch & learn' at where the vertical gathers (local bar association, dental society, chamber of commerce, accountants' meetups) and demo the kit live on a folder of their own sample docs — the offline demo sells itself
- Partner with existing IT providers/MSPs as their AI subcontractor: they bring the trusted client relationship, you do the work, you split the fee
- Be helpful in public — answer 'is it safe to use ChatGPT for X?' questions in industry forums, LinkedIn, and Q&A sites; this is the slow, compounding lead engine that becomes most of your pipeline by year two
- Mine your care-plan retainers for referrals — every happy recurring client is a reference and an intro to their peers
The delivery steps
- Before quoting: see a sample of their real documents, confirm hardware honestly (qwen3 8B runs on a decent recent CPU/iGPU; a 16GB+ VRAM GPU is much faster), and write down their #1 use case plus 10 acceptance-test questions
- Sell a paid pilot first ($750–$2k) — index one document set on their machine and demo offline Q&A in a 1-hour walkthrough; this filters real buyers and credits toward the full job
- Send the Statement of Work from MONETIZE.md (edit the brackets): scope, out-of-scope, an explicit privacy clause, deliverables, 50/50 payment terms, and the honest 'local models aren't a replacement for professional judgment' limitation that wins more deals than it loses
- Install: pull the model + nomic-embed-text, deploy the kit, set .env, index their docs, and tune the system prompt to their domain and citation/refusal rules
- Validate (the billable trust step): run the 10 acceptance questions until answers are correct and cited, deliberately ask 2–3 questions NOT in the docs to prove it refuses instead of hallucinating, and show them it still works with the network cable unplugged
- Hand off with a written runbook, train the actual staff (not just the boss), set the care-plan start date and first re-index reminder, then check in at day 7 and day 30 to capture a testimonial and ask for one intro
How to market it
- Lead with the privacy wedge, not the tech: 'none of your data ever leaves the building' beats any feature list for law/medical/accounting buyers who've already been told no by cloud AI
- Keep a demo laptop with Ollama + this kit pre-indexed on sample docs so you can show it working offline in 90 seconds, anywhere — the live demo converts better than any deck or pitch
- Productize a named offer ('Private AI Document Assistant') with the three price tiers visible so prospects self-select; published bands reduce back-and-forth and anchor the care plan as the obvious next step
- Publish one vertical-specific explainer per quarter answering the exact fear your buyers have ('Is AI safe for HIPAA/confidential/ITAR data?') — content that ranks and gets shared inside the vertical
- Co-market with MSPs and bookkeepers: offer them a referral or revenue split so their existing clients become your pipeline without any cold outreach
- Use the honest written limitation as a selling point — being the vendor who caps liability and refuses to overhype is what privacy-driven clients actually trust and refer
- Turn every install into proof: capture a short testimonial and a before/after ('staff stopped digging through 400 PDFs') and put it on a simple one-page site so referrals have somewhere to land
Frequently asked questions
Do I need to be a machine-learning expert to use this?
No. You need to run Ollama, point the kit at a folder, read Python, and understand one business’s workflow. This is consulting work, not ML research. The hard agent-engineering (tool calling, memory, RAG, planning) is already built and working.
Does it really run fully offline with no API keys?
Yes. Everything runs against a local Ollama instance — the LLM, the embeddings for RAG, and the on-disk index. No cloud API, no keys, and you can prove it to a client by unplugging the network cable mid-demo. The only exception is the optional web_search tool, which you disable for air-gapped clients.
What hardware and models does it need?
The default qwen3 (8B) is a reliable tool-caller on a decent recent CPU/iGPU and is fine for document Q&A. A GPU with 16GB+ VRAM is much faster and lets you run qwen3:14b/30b or llama3.3. You swap models with a single .env line (OLLAMA_MODEL / OLLAMA_CODE_MODEL / OLLAMA_EMBED_MODEL) — no code changes. The README lists verified June 2026 models.
Is this passive income?
No, and the kit says so plainly. Nobody pays for a git clone. The kit removes the two-week build barrier so you can start selling working installs and care plans — but you still do the consulting: scoping, indexing, validation, training, and support. That human work is exactly what’s worth $3k–$8k plus a monthly retainer.
Can I use it in paid client work commercially?
Yes. It’s MIT-licensed, so you can adapt it, ship it on client hardware, and build a commercial service on top of it without restriction. Never commit a client’s .env or their documents.
What’s the difference between this and the RAG Starter Kit?
This kit is about agents — five runnable agents plus a reusable runtime with tool calling, a planner, multi-agent handoff, and MCP, with RAG as one of the included tools. The RAG Starter Kit is a deeper, dedicated document-retrieval pipeline (hybrid search + reranking). Use this when the deliverable is an agent that takes actions; reach for the RAG kit when retrieval quality on large/messy document sets is the whole job.
What exactly do I download?
A zip with the full Python codebase — agents/, the core/ runtime, eight tools/, config/, example data, .env.example, requirements.txt, the README quickstart, and the 10KB MONETIZE.md sales playbook (service definition, price bands, a copy-and-send Statement of Work, lead sources, and a delivery checklist).
After you buy
Purchases are linked to your account — sign in and head to your product library to download anytime. Bought without an account? Check your email for the download link and a one-click way to set a password.
← Back to all kits, tools & codebases