Build a Local AI Slack & Discord Bot with Ollama (Full Tutorial)
Published on April 23, 2026 • 19 min read
The pitch deck for cloud chatbots is always the same: connect your team chat, get an AI assistant. The fine print is also always the same: the assistant reads your messages, your DMs, your customer data, your internal docs, and the vendor stores it long enough to "improve the service." For most engineering teams that is a non-starter.
I built the first version of this bot for a startup whose CTO had a simple request: "Give my team an AI in Slack that doesn't leak our roadmap." The first iteration took an afternoon. The version this guide describes — with RAG over a private docs folder, per-channel rate limiting, threaded replies, and slash commands — took about two days. It has been running on a $40/month Hetzner box for nine months and processed roughly 180,000 messages.
This is the production blueprint. Both Slack and Discord. Real Python code that you can drop into a repo and run today.
Quick Start: 8 Minutes to a Working Bot {#quick-start}
If you just want to see a bot reply in your channel:
# Prerequisites: Python 3.11+, Ollama already installed
ollama pull llama3.1:8b
# Slack version
pip install slack-bolt ollama python-dotenv
export SLACK_BOT_TOKEN=xoxb-...
export SLACK_APP_TOKEN=xapp-...
python slack_bot.py
# Or Discord version
pip install discord.py ollama python-dotenv
export DISCORD_TOKEN=...
python discord_bot.py
The minimal Slack bot (full version below) is 35 lines of Python. By the end of this guide you will have something a real team can use without you babysitting it.
Table of Contents
- Why Self-Host Your Team AI Bot
- Architecture Overview
- Hardware & Hosting
- Ollama Setup for Bot Workloads
- Slack Bot — Full Implementation
- Discord Bot — Full Implementation
- Adding RAG over Team Docs
- Slash Commands & Tools
- Rate Limiting & Cost Control
- Production Deployment
- Pitfalls We Hit (So You Do Not)
- FAQs
Why Self-Host Your Team AI Bot {#why-self-host}
Three concrete reasons that came up in customer interviews:
1. Channel content is sensitive by default. Engineering, security, finance, and exec channels routinely contain credentials, customer names, and unannounced product details. Sending them to OpenAI or Anthropic — even with their enterprise privacy promises — creates an audit trail you do not want.
2. Cloud per-message pricing punishes adoption. A 200-person team that adopts an AI bot easily generates 50,000 messages a month routed through the LLM. At GPT-4o pricing that is $300-800/month. A self-hosted bot on a $40 GPU VPS handles the same traffic for the cost of electricity.
3. RAG over private docs needs to stay private. The most useful team bots answer "what does our pricing tier contain?" or "where is the runbook for X service?" That requires indexing internal docs. Cloud RAG means uploading your wiki to a third party. Local RAG keeps it on your hardware.
A well-built local bot is also faster: 200-400ms latency to first token versus 600-1200ms for cloud APIs because the network round trip drops out.
Architecture Overview {#architecture}
┌─────────┐    websocket    ┌────────────┐   HTTP    ┌────────┐
│  Slack  │◄───────────────►│  Bot Proc  │──────────►│ Ollama │
│ Discord │   socket mode   │  (Python)  │  :11434   │  LLM   │
└─────────┘                 └────────────┘           └────────┘
                                  │
                                  ▼
                           ┌─────────────┐
                           │  ChromaDB   │ ← team docs RAG
                           │   (local)   │
                           └─────────────┘
Three components:
- Bot process — Python event loop that listens to Slack/Discord events and orchestrates responses.
- Ollama — Local LLM server. Same machine or a different one on your network.
- ChromaDB (optional) — Vector store for RAG. Docker container on the same host.
No public ingress required. Slack uses Socket Mode and Discord uses websockets, so your bot connects out — no inbound port exposure.
Hardware & Hosting {#hardware}
What you actually need depends on team size:
| Team Size | Avg msgs/day | Concurrent | Hardware | Model |
|---|---|---|---|---|
| 1-10 | <500 | 1-2 | 16 GB RAM, integrated GPU | llama3.1:8b |
| 10-50 | 1,000-5,000 | 3-5 | 32 GB RAM, RTX 3060 12 GB | qwen2.5:7b |
| 50-200 | 5,000-20,000 | 5-15 | RTX 4060 Ti 16 GB | qwen2.5:14b |
| 200-1000 | 20,000+ | 15-40 | RTX 4090 or 2× RTX 3090 | llama3.3:70b-q4 |
Concrete VPS picks that work:
- Hetzner GEX44 (RTX 4000 Ada, 16 GB VRAM, 64 GB RAM): €184/month — fits 50-200 user teams
- Vast.ai RTX 3090: $0.20-0.40/hr on-demand
- Self-host on a NUC + eGPU: ~$1,500 one-time, runs forever
For team deployments behind your firewall, see Ollama production deployment.
Ollama Setup for Bot Workloads {#ollama-setup}
Default Ollama settings are tuned for single-user laptops. For a bot serving multiple concurrent users, change three things:
# /etc/systemd/system/ollama.service.d/override.conf
# (systemd does not support trailing comments, so each note gets its own line)
[Service]
# allow 4 concurrent requests
Environment="OLLAMA_NUM_PARALLEL=4"
# keep 2 models hot
Environment="OLLAMA_MAX_LOADED_MODELS=2"
# don't unload between messages
Environment="OLLAMA_KEEP_ALIVE=24h"
# needed if the bot runs on a different host
Environment="OLLAMA_HOST=0.0.0.0:11434"
Then:
sudo systemctl daemon-reload
sudo systemctl restart ollama
ollama pull llama3.1:8b
ollama pull nomic-embed-text # for RAG
Verify it's serving (stream disabled so you get a single JSON reply):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Say hello in 5 words",
  "stream": false
}'
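The bots below talk to Ollama through the Python client, so it's worth a smoke test from Python too. A minimal sketch, assuming the ollama package from the Quick Start:

# smoke_test.py (quick check that the Python client can reach the server)
import ollama

client = ollama.Client(host="http://localhost:11434")
resp = client.chat(model="llama3.1:8b",
                   messages=[{"role": "user", "content": "ping"}])
print(resp["message"]["content"])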
Slack Bot — Full Implementation {#slack-bot}
Step 1: Create the Slack App
- Go to api.slack.com/apps → Create New App → From scratch
- Socket Mode: enable it (you don't need a public URL)
- OAuth Scopes (Bot Token Scopes): app_mentions:read, chat:write, channels:history, im:history, im:write, commands
- Event Subscriptions → enable → subscribe to app_mention, message.im
- Slash Commands → create /ai with description "Ask the local AI"
- Install to Workspace — copy the Bot Token (starts with xoxb-)
- Basic Information → App-Level Tokens → generate one with connections:write scope (starts with xapp-)
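With the app installed, put the tokens in a .env file next to the bot code. A minimal sketch; the variable names match what the code below reads, and the Discord token is included so one file covers both bots:

# .env (placeholder values)
SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...
DISCORD_TOKEN=...
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b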
Step 2: The Bot Code
# slack_bot.py
import os
import re
import logging
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler
import ollama
from dotenv import load_dotenv

load_dotenv()  # pick up tokens from a local .env file if present
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("slack_bot")
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
MODEL = os.environ.get("OLLAMA_MODEL", "llama3.1:8b")
app = App(token=os.environ["SLACK_BOT_TOKEN"])
ollama_client = ollama.Client(host=OLLAMA_HOST)  # named to avoid clashing with Bolt's injected Slack "client"
SYSTEM_PROMPT = (
"You are a helpful assistant for an engineering team in Slack. "
"Be concise. Format code in fenced blocks. "
"If you do not know, say so. Do not invent facts about the team."
)
def strip_mention(text: str) -> str:
return re.sub(r"<@[A-Z0-9]+>", "", text).strip()
def ask_ollama(prompt: str, history: list[dict]) -> str:
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
messages.extend(history[-10:]) # keep last 10 turns
messages.append({"role": "user", "content": prompt})
    resp = ollama_client.chat(model=MODEL, messages=messages,
                              options={"temperature": 0.4, "num_predict": 800})
return resp["message"]["content"].strip()
# Per-thread conversation memory (production: use Redis)
THREAD_HISTORY: dict[str, list[dict]] = {}
@app.event("app_mention")
def on_mention(event, say, client):  # Bolt injects the Slack WebClient by the name "client"
text = strip_mention(event["text"])
thread_ts = event.get("thread_ts") or event["ts"]
history = THREAD_HISTORY.setdefault(thread_ts, [])
# Show "thinking" reaction
    client.reactions_add(channel=event["channel"], name="hourglass_flowing_sand", timestamp=event["ts"])
try:
answer = ask_ollama(text, history)
history.append({"role": "user", "content": text})
history.append({"role": "assistant", "content": answer})
say(text=answer, thread_ts=thread_ts)
except Exception as e:
log.exception("ollama error")
say(text=f"Sorry — backend error: {e}", thread_ts=thread_ts)
finally:
        client.reactions_remove(channel=event["channel"], name="hourglass_flowing_sand", timestamp=event["ts"])
@app.command("/ai")
def slash_ai(ack, respond, command):
ack()
prompt = command["text"]
if not prompt:
respond("Usage: /ai <your question>")
return
answer = ask_ollama(prompt, [])
respond(text=answer, response_type="in_channel")
@app.event("message")
def on_dm(event, say):
# Only respond to DMs, ignore channel messages (handled by app_mention)
if event.get("channel_type") != "im":
return
if event.get("subtype") == "bot_message":
return
history = THREAD_HISTORY.setdefault(event["channel"], [])
answer = ask_ollama(event["text"], history)
history.append({"role": "user", "content": event["text"]})
history.append({"role": "assistant", "content": answer})
say(answer)
if __name__ == "__main__":
handler = SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"])
log.info("Slack bot starting...")
handler.start()
That's a fully functional Slack bot. @bot what does our deploy script do? works in any channel where the bot is added, /ai slash command works anywhere, and DMs work too.
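One caveat flagged in the code: THREAD_HISTORY lives in process memory, so it vanishes on restart and grows without bound. Swapping in Redis is a small change. This is a sketch, assuming a local Redis and the redis-py package; get_history and append_history are hypothetical helpers you would call in place of the dict operations:

# redis_history.py (sketch: durable per-thread memory with a 24h TTL)
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
HISTORY_TTL = 24 * 3600  # expire idle threads after a day

def get_history(thread_ts: str) -> list[dict]:
    raw = r.get(f"history:{thread_ts}")
    return json.loads(raw) if raw else []

def append_history(thread_ts: str, user_msg: str, assistant_msg: str) -> None:
    history = get_history(thread_ts)
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": assistant_msg})
    # keep the last 20 turns and refresh the TTL on every write
    r.set(f"history:{thread_ts}", json.dumps(history[-20:]), ex=HISTORY_TTL)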
Discord Bot — Full Implementation {#discord-bot}
Step 1: Create the Discord App
- Discord Developer Portal → New Application
- Bot tab → Reset Token → copy it (this is your DISCORD_TOKEN)
- Bot tab → enable Message Content Intent (required to read message text)
- OAuth2 → URL Generator → scopes: bot, applications.commands → permissions: Send Messages, Read Messages, Add Reactions, Use Slash Commands → invite the bot to your server
Step 2: The Bot Code
# discord_bot.py
import os
import logging
import asyncio
import discord
from discord import app_commands
import ollama
from dotenv import load_dotenv

load_dotenv()  # pick up the token from a local .env file if present
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("discord_bot")
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
MODEL = os.environ.get("OLLAMA_MODEL", "llama3.1:8b")
intents = discord.Intents.default()
intents.message_content = True
client_d = discord.Client(intents=intents)
tree = app_commands.CommandTree(client_d)
ollama_client = ollama.Client(host=OLLAMA_HOST)
SYSTEM_PROMPT = "You are a helpful assistant in Discord. Be concise. Use markdown."
CHANNEL_HISTORY: dict[int, list[dict]] = {}
async def ask_ollama_async(prompt: str, history: list[dict]) -> str:
def _call():
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
messages.extend(history[-10:])
messages.append({"role": "user", "content": prompt})
return ollama_client.chat(model=MODEL, messages=messages,
options={"temperature": 0.4, "num_predict": 800})
resp = await asyncio.to_thread(_call)
return resp["message"]["content"].strip()
@client_d.event
async def on_ready():
await tree.sync()
log.info(f"Logged in as {client_d.user}")
@client_d.event
async def on_message(message: discord.Message):
if message.author == client_d.user or message.author.bot:
return
# Respond on mention or DM
is_dm = isinstance(message.channel, discord.DMChannel)
is_mention = client_d.user in message.mentions
if not (is_dm or is_mention):
return
prompt = message.content.replace(f"<@{client_d.user.id}>", "").strip()
if not prompt:
return
history = CHANNEL_HISTORY.setdefault(message.channel.id, [])
async with message.channel.typing():
try:
answer = await ask_ollama_async(prompt, history)
history.append({"role": "user", "content": prompt})
history.append({"role": "assistant", "content": answer})
# Discord max message length is 2000 chars
for i in range(0, len(answer), 1900):
await message.reply(answer[i:i+1900], mention_author=False)
except Exception as e:
log.exception("ollama error")
await message.reply(f"Backend error: {e}")
@tree.command(name="ai", description="Ask the local AI a question")
async def slash_ai(interaction: discord.Interaction, prompt: str):
await interaction.response.defer()
answer = await ask_ollama_async(prompt, [])
    # followup.send works the same for every chunk, so no special-casing needed
    for i in range(0, len(answer), 1900):
        await interaction.followup.send(answer[i:i+1900])
if __name__ == "__main__":
client_d.run(os.environ["DISCORD_TOKEN"])
Mention the bot or DM it — it replies. /ai prompt slash command works server-wide. The 1900-char chunking handles long responses (Discord limits messages to 2000 chars).
Adding RAG over Team Docs {#rag}
This is the killer feature. The bot answers questions from your internal docs, runbooks, and wikis instead of generic training data.
Step 1: Run ChromaDB
docker run -d -p 8000:8000 -v chroma-data:/chroma/chroma --name chroma chromadb/chroma:latest
Step 2: Index Your Docs
# index_docs.py
import os, glob, chromadb
import ollama
ollama_client = ollama.Client(host="http://localhost:11434")
chroma = chromadb.HttpClient(host="localhost", port=8000)
coll = chroma.get_or_create_collection(name="team_docs")
def embed(text: str):
return ollama_client.embeddings(model="nomic-embed-text", prompt=text)["embedding"]
def chunk(text: str, size: int = 800, overlap: int = 100):
chunks = []
for i in range(0, len(text), size - overlap):
chunks.append(text[i:i+size])
return chunks
for path in glob.glob("./docs/**/*.md", recursive=True):
with open(path) as f:
content = f.read()
for i, ch in enumerate(chunk(content)):
coll.upsert(
ids=[f"{path}:{i}"],
documents=[ch],
embeddings=[embed(ch)],
metadatas=[{"source": path, "chunk": i}],
)
print("Indexed all docs.")
Run python index_docs.py whenever your docs change. For automatic re-indexing on file changes, wrap it in watchdog, as sketched below.
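A sketch of that watcher, assuming the watchdog package. It naively re-runs the full indexer on any markdown change, which is fine for small docs trees; debounce it for large ones:

# watch_docs.py (sketch: re-index whenever a markdown file changes)
import subprocess
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class ReindexHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        if event.src_path.endswith(".md"):
            subprocess.run(["python", "index_docs.py"], check=False)

observer = Observer()
observer.schedule(ReindexHandler(), "./docs", recursive=True)
observer.start()
try:
    while True:
        time.sleep(60)
finally:
    observer.stop()
    observer.join()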
Step 3: RAG-Enabled Chat Function
Replace the ask_ollama function in either bot (the bot process also needs the ollama_client and coll setup from the indexing script):
def ask_with_rag(prompt: str, history: list[dict]) -> str:
q_embedding = ollama_client.embeddings(model="nomic-embed-text", prompt=prompt)["embedding"]
results = coll.query(query_embeddings=[q_embedding], n_results=5)
context = "\n\n---\n\n".join(results["documents"][0])
augmented_system = (
SYSTEM_PROMPT + "\n\n"
"Use the following context from internal team docs to answer. "
"If the answer is not in the context, say so plainly.\n\n"
f"CONTEXT:\n{context}"
)
messages = [{"role": "system", "content": augmented_system}]
messages.extend(history[-6:]) # shorter history when context is large
messages.append({"role": "user", "content": prompt})
resp = ollama_client.chat(model=MODEL, messages=messages,
options={"temperature": 0.2, "num_predict": 800})
return resp["message"]["content"].strip()
Now @bot what is our deploy procedure? returns answers grounded in your actual runbook. For deeper RAG tuning, see RAG local setup guide.
Slash Commands & Tools {#slash-commands}
Add structured commands beyond /ai:
# Slack
@app.command("/summarize")
def summarize_thread(ack, respond, command, client):
    ack()
    channel = command["channel_id"]
    # Fetch last 50 messages (the API returns newest first, so reverse for reading order)
    history = client.conversations_history(channel=channel, limit=50)
    text = "\n".join(m["text"] for m in reversed(history["messages"]) if "text" in m)
summary = ask_ollama(f"Summarize this Slack channel in 5 bullets:\n{text}", [])
respond(summary, response_type="in_channel")
@app.command("/translate")
def translate(ack, respond, command):
ack()
args = command["text"].split(" ", 1)
if len(args) != 2:
respond("Usage: /translate <lang_code> <text>")
return
lang, text = args
answer = ask_ollama(f"Translate to {lang}: {text}", [])
respond(answer)
The Discord equivalent uses the @tree.command decorator with the same logic.
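Here is /summarize for Discord as a sketch, reusing the discord_bot.py setup above. Note that channel.history() yields newest messages first, so the transcript is reversed before summarizing:

@tree.command(name="summarize", description="Summarize the last 50 messages")
async def slash_summarize(interaction: discord.Interaction):
    await interaction.response.defer()
    lines = []
    async for msg in interaction.channel.history(limit=50):
        if msg.content:
            lines.append(f"{msg.author.display_name}: {msg.content}")
    lines.reverse()  # oldest first reads better for the model
    text = "\n".join(lines)
    answer = await ask_ollama_async(
        f"Summarize this Discord channel in 5 bullets:\n{text}", [])
    await interaction.followup.send(answer[:1900])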
Useful slash commands seen in the wild:
- /summarize — collapse a long channel into bullets
- /translate — quick translation
- /sql — natural language to SQL
- /onboard — generate onboarding steps for a new team member
- /runbook — pull a runbook by name from RAG
Rate Limiting & Cost Control {#rate-limiting}
Without limits, one curious user will lock up the bot for everyone. The pattern:
import time
from collections import defaultdict, deque
USER_REQUESTS: dict[str, deque] = defaultdict(deque)
USER_RATE_LIMIT = 10 # max requests
USER_RATE_WINDOW = 60 # per 60 seconds
GLOBAL_QUEUE_SIZE = 4 # match OLLAMA_NUM_PARALLEL
def check_rate_limit(user_id: str) -> bool:
now = time.time()
q = USER_REQUESTS[user_id]
while q and q[0] < now - USER_RATE_WINDOW:
q.popleft()
if len(q) >= USER_RATE_LIMIT:
return False
q.append(now)
return True
In the message handler:
if not check_rate_limit(event["user"]):
say("You're sending too fast — try again in a minute.", thread_ts=thread_ts)
return
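GLOBAL_QUEUE_SIZE above is the other half: the per-user window does not stop ten different users from piling on at once. A global cap matching OLLAMA_NUM_PARALLEL keeps the overflow waiting in the bot instead of stacking up inside Ollama. A sketch for the Slack bot (Bolt runs listeners in threads; use asyncio.Semaphore in the Discord bot):

import threading

# at most GLOBAL_QUEUE_SIZE requests in flight against Ollama
OLLAMA_SLOTS = threading.BoundedSemaphore(GLOBAL_QUEUE_SIZE)

def ask_ollama_limited(prompt: str, history: list[dict]) -> str:
    with OLLAMA_SLOTS:  # blocks until a slot frees up
        return ask_ollama(prompt, history)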
For team-wide cost monitoring, track tokens per user in Redis or Postgres. Most teams flag any user exceeding 50,000 tokens/day for review.
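A minimal version of that tracking, as a sketch assuming redis-py; Ollama's chat responses report token counts in the eval_count and prompt_eval_count fields:

import datetime
import redis

r = redis.Redis(host="localhost", port=6379)

def record_usage(user_id: str, resp) -> int:
    # prompt + completion tokens for this response
    tokens = resp.get("eval_count", 0) + resp.get("prompt_eval_count", 0)
    key = f"tokens:{user_id}:{datetime.date.today().isoformat()}"
    total = r.incrby(key, tokens)
    r.expire(key, 3 * 24 * 3600)  # keep a few days of history for review
    return total  # compare against the 50,000/day flag threshold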
For multi-user rate-limiting at the Ollama layer itself, see Ollama rate limiting for multi-user setups.
Production Deployment {#deployment}
systemd unit (Linux)
# /etc/systemd/system/ai-bot.service
[Unit]
Description=Local AI Slack/Discord bot
After=network.target ollama.service
[Service]
Type=simple
User=botuser
WorkingDirectory=/opt/ai-bot
EnvironmentFile=/opt/ai-bot/.env
ExecStart=/opt/ai-bot/venv/bin/python slack_bot.py
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now ai-bot
sudo journalctl -u ai-bot -f
Docker
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "slack_bot.py"]
docker build -t ai-bot .
docker run -d --restart always --env-file .env --name ai-bot --network host ai-bot
Monitoring
Bare minimum: log every request, response length, and latency. Plug into Prometheus for real metrics:
from prometheus_client import Counter, Histogram, start_http_server
REQ = Counter("bot_requests_total", "Total requests", ["channel_type"])
LAT = Histogram("bot_latency_seconds", "End-to-end latency")
start_http_server(9100)
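Then route requests through the metrics. A sketch wrapping the existing helper:

def ask_ollama_instrumented(prompt: str, history: list[dict],
                            channel_type: str = "channel") -> str:
    REQ.labels(channel_type=channel_type).inc()
    with LAT.time():  # records end-to-end latency into the histogram
        return ask_ollama(prompt, history)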
For full observability, see Ollama Prometheus + Grafana.
Pitfalls We Hit (So You Do Not) {#pitfalls}
- Slack rate limits are per-method, not global. If you call reactions_add on every message, you will hit 429s under load. Cache per-channel reaction state.
- Discord intents must be enabled in the developer portal AND in code. Forgetting one or the other causes silent message-ignore behavior with no error.
- Streaming responses break Slack. Slack does not support edit-as-you-stream. Buffer the full response then post once.
- OLLAMA_KEEP_ALIVE matters. Default is 5 minutes; once unloaded, first message after a quiet period takes 8-15 seconds to respond. Set it to 24h.
- Thread history grows unbounded in memory. Use Redis with a TTL of 24 hours. Otherwise expect to restart the bot weekly.
- The model will roleplay as your CEO if asked. Add a system prompt rule: "Never impersonate specific people. Never claim to be a human."
- Bot owners get DM'd weird stuff. Add an admin command /audit that lets you see anonymized log samples to spot abuse patterns.
Wrap-Up
A self-hosted team chat bot is one of the highest-leverage pieces of software you can build for your company in 2026. It costs roughly nothing to run, gives non-technical employees an AI helper without leaking proprietary data, and turns your internal documentation into something people actually read. The Slack version above is in production at three companies I know of — including the original startup that asked for "an AI in Slack that doesn't leak our roadmap."
Start with the Quick Start. Get a basic mention working. Then add RAG over your handbook. Then add the two slash commands your team uses most. By month three, your team will treat the bot like a colleague. The fact that nobody outside the company can see anything it does is the part you sell to your security team.
Want a deeper integration story? Read Ollama function calling and tool use for action-taking bots, or private OpenAI-compatible API to expose your bot's brain to other apps.