Run Llama 3.1 Locally in 10 Minutes
Llama 3.1 running on your own laptop in 10 minutes. No API key, no usage limits, no data leaving your machine.
1. Install Ollama
Linux: `curl -fsSL https://ollama.com/install.sh | sh`. macOS and Windows: download the installer from ollama.com (on macOS, `brew install ollama` also works). Ollama is a small daemon that handles model downloads, GPU offloading, and a local API on localhost:11434, including an OpenAI-compatible endpoint. It runs out of the box on M-series Macs and NVIDIA GPUs, and falls back to CPU on anything else.
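To confirm the daemon is actually up before pulling anything, run `ollama --version`, or hit the root endpoint from Python; Ollama answers with a plain-text status line. A quick sketch, assuming `pip install requests`:

```python
# Quick health check: the root endpoint returns the plain-text
# status "Ollama is running" when the daemon is listening.
import requests

print(requests.get("http://localhost:11434/").text)
```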
2. Pull the model that fits your hardware
Pick by how much memory your machine has:
- 8 GB RAM: `ollama pull llama3.1:8b-instruct-q4_K_M` fits in ~5 GB and runs at roughly 30 tokens/sec on an M2 or RTX 3060.
- 16 GB+: `ollama pull llama3.1:8b-instruct-q8_0` is ~8.5 GB with noticeably higher quality. (The default `llama3.1:8b` tag is also a 4-bit quantization, so pull an explicit higher quantization if you want the quality bump.)
- 48 GB+: `ollama pull llama3.1:70b-instruct-q3_K_M` if you want 70B running locally; the weights alone are ~34 GB, and it is slow without a high-end GPU, but workable.
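If you'd rather not eyeball it, here is a rough sketch that maps total RAM to the tags above. It assumes `pip install psutil`; the thresholds just mirror this step's guidance.

```python
# Rough model picker: maps total RAM to the tags recommended above.
import psutil

ram_gb = psutil.virtual_memory().total / 1e9
if ram_gb >= 48:
    tag = "llama3.1:70b-instruct-q3_K_M"
elif ram_gb >= 16:
    tag = "llama3.1:8b-instruct-q8_0"
else:
    tag = "llama3.1:8b-instruct-q4_K_M"
print(f"{ram_gb:.0f} GB RAM -> ollama pull {tag}")
```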
3. Your first chat
Run `ollama run llama3.1:8b` and type a question. That's it: you have a private LLM. Try `Explain transformers in 3 sentences` to verify the model is responding sensibly. Exit with `/bye`.
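The CLI is a thin client over a local HTTP API, so the same one-shot question works from any script. A minimal sketch against Ollama's native `/api/generate` endpoint, assuming `pip install requests` and the 8B model from step 2:

```python
# One-shot generation against Ollama's native REST API.
# Assumes the daemon is running and llama3.1:8b has been pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": "Explain transformers in 3 sentences",
        "stream": False,  # one JSON object instead of a token stream
    },
)
print(resp.json()["response"])
```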
4. Use it from Python
`pip install ollama`. Then `import ollama; print(ollama.chat(model="llama3.1:8b", messages=[{"role":"user","content":"hi"}])["message"]["content"])`. Ollama also exposes an OpenAI-compatible API (chat completions at `http://localhost:11434/v1/chat/completions`), so you can point any OpenAI SDK at `base_url="http://localhost:11434/v1"` and you're running locally.
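For example, here is the OpenAI SDK pointed at the local server; a minimal sketch assuming `pip install openai`. The `api_key` is required by the SDK but ignored by Ollama:

```python
# The same local model through the OpenAI SDK.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",  # required by the SDK, ignored by Ollama
)

resp = client.chat.completions.create(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "hi"}],
)
print(resp.choices[0].message.content)
```

Anything already built on the OpenAI SDK can usually be repointed this way by changing only the base URL and the model name.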
5. Make it useful
A bare LLM isn't the goal — it's the foundation. From here, you build retrieval (point it at your docs), tools (let it call functions), agents (multi-step reasoning), and fine-tuning (teach it your style or data). The What is AI course covers the entire path; the local setup you just did is chapter 1c.
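As a taste of the retrieval piece, here is a minimal sketch that embeds a couple of documents with a local embedding model and grounds an answer in the best match. It assumes `ollama pull nomic-embed-text` has been run; the documents and question are placeholders, not course material.

```python
# Toy retrieval: embed docs, pick the one closest to the question,
# and answer with that document as context.
import ollama

docs = [
    "Ollama serves models on localhost:11434.",
    "Llama 3.1 8B fits in about 5 GB at Q4_K_M quantization.",
]

def embed(text: str) -> list[float]:
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

question = "How much memory does the quantized 8B model need?"
q_vec = embed(question)
best = max(docs, key=lambda d: cosine(q_vec, embed(d)))

answer = ollama.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": f"Context: {best}\n\nQuestion: {question}"}],
)
print(answer["message"]["content"])
```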
Continue with the full What is AI course on Local AI Master.
This page is one chapter of a structured course covering everything from foundations to production. Try Pro free for 7 days: full access to all 264 chapters across 10 courses, no charge until day 8, cancel anytime.