Free Tool · No Signup

Can I Run Local AI? Yes/No Hardware Checker

Pick your operating system, how much RAM you have, your GPU type, and (for NVIDIA/AMD cards) your VRAM. You get an instant YES or NO, the runtime to install (Ollama or LM Studio), the biggest model class that realistically fits, and the GPU backend you'll set up (CUDA, ROCm/Vulkan, or Metal). This is an OS + runtime feasibility check — for exact gigabytes per model, use the VRAM Calculator.

Published: June 20, 2026 · Last Updated: June 20, 2026 · Manually Reviewed

How the checker decides

Running a local model is two questions stacked on top of each other. First: is there a runtime for my machine at all? That answer is almost always yes — both Ollama and LM Studio ship for Windows, macOS (Apple Silicon) and Linux, and both fall back to CPU when there's no usable GPU. The real gate is the second question: how much model can my memory hold?

That's why the tool keys off whichever memory number actually limits you. On a discrete NVIDIA or AMD card, that's VRAM. On Apple Silicon there is no separate VRAM: the GPU and CPU share one unified memory pool, so your RAM is the budget. With no GPU or only integrated graphics, the model runs on the CPU out of system RAM. We size against the community rule of thumb that a model needs roughly 0.6 GB per billion parameters at Q4_K_M quantization (the default most runtimes pull), plus headroom for context. For example, an 8B model works out to about 8 × 0.6 ≈ 4.8 GB of weights, in line with the ~5 GB listed for the 7–8B class. These figures are approximations, cross-checked against published specs and GGUF community reports; see the Ollama RAM/VRAM table for the per-model breakdown.
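The sizing rule can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the function names are hypothetical, and the 1.5 GB headroom figure is an assumption, since the text says only "plus headroom for context".

```python
# Rule of thumb from the text: roughly 0.6 GB per billion parameters at Q4_K_M.
GB_PER_BILLION_Q4 = 0.6

# ASSUMPTION: allowance for context (KV cache) and runtime overhead.
# The article does not give a number; 1.5 GB is a placeholder.
CONTEXT_HEADROOM_GB = 1.5


def estimate_q4_footprint_gb(params_billion: float,
                             headroom_gb: float = CONTEXT_HEADROOM_GB) -> float:
    """Approximate memory needed to load a Q4_K_M model, in GB."""
    if params_billion <= 0:
        raise ValueError("params_billion must be positive")
    return params_billion * GB_PER_BILLION_Q4 + headroom_gb


def fits(params_billion: float, usable_memory_gb: float) -> bool:
    """True if the estimated footprint fits within the usable memory."""
    return estimate_q4_footprint_gb(params_billion) <= usable_memory_gb


if __name__ == "__main__":
    for size in (8, 14, 32, 70):
        print(f"{size}B -> ~{estimate_q4_footprint_gb(size):.1f} GB")
```

With these assumptions a 14B model comes out just under 10 GB, which is why it fits on a 12 GB card while a 32B model does not. Longer context windows need more headroom than the placeholder used here.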

Model classes by usable memory (Q4_K_M, approx)

Usable memory	Biggest model class	Example models
< 4 GB	Tiny only (1–3B)	Gemma 3 1B, Phi-3.5 mini — slow/marginal
4–8 GB	7–8B (~5 GB)	Llama 3.1 8B, Qwen 2.5 7B, Mistral 7B
8–16 GB	13–14B (~8–10 GB)	Phi-4 14B, Qwen 2.5 14B, Codestral 22B (tight)
16–24 GB	~32B (~20 GB)	Qwen 2.5 32B, Gemma 2 27B
24–48 GB	32B comfortably / 70B partial	Qwen 2.5 32B (fast), Llama 3.3 70B (offloaded)
48 GB+	70B (~40 GB)	Llama 3.3 70B, Qwen 2.5 72B
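As a rough illustration, the table can be expressed as a lookup function. The function name is hypothetical, and the handling of boundary values is an assumption: the ranges in the table touch at their edges, so this sketch follows the worked examples, which place 8 GB in the 7–8B class and 24 GB in the ~32B class.

```python
def biggest_model_class(usable_memory_gb: float) -> str:
    """Map usable memory (GB) to the biggest model class from the table.

    ASSUMPTION: shared edges (8, 16, 24 GB) resolve to the lower row,
    matching the worked examples; 4 GB and 48 GB belong to the higher row
    because the table writes those rows as "< 4 GB" and "48 GB+".
    """
    if usable_memory_gb < 0:
        raise ValueError("usable_memory_gb must be non-negative")
    if usable_memory_gb < 4:
        return "Tiny only (1-3B)"
    if usable_memory_gb <= 8:
        return "7-8B"
    if usable_memory_gb <= 16:
        return "13-14B"
    if usable_memory_gb <= 24:
        return "~32B"
    if usable_memory_gb < 48:
        return "32B comfortably / 70B partial"
    return "70B"


if __name__ == "__main__":
    for gb in (3, 8, 12, 24, 32, 64):
        print(f"{gb} GB -> {biggest_model_class(gb)}")
```

The lookup only answers what fits in memory. It does not model the CPU-only case, where generation speed rather than capacity sets the practical ceiling.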

Worked examples

Windows · 16GB RAM · RTX 3060 · 12GB VRAM

YES. Runtime: LM Studio (or Ollama). Backend: CUDA. Biggest class: 13–14B at Q4. Comfortable, GPU-accelerated.

macOS · 8GB RAM · Apple Silicon (M2)

YES, with limits. Runtime: Ollama or LM Studio. Backend: Metal (unified memory). Biggest class: 7–8B at Q4 — keep other apps light. See the Mac setup guide.

Linux (Ubuntu) · 32GB RAM · no GPU

YES, but CPU-only. Runtime: Ollama. Backend: CPU. Biggest usable class: 7–8B (slow, single-digit tok/s). 32 GB of RAM can load larger models, but CPU generation speed makes them impractical. Read Can I run AI on Ubuntu?

Windows · 64GB RAM · RX 7900 XTX · 24GB VRAM

YES. Runtime: LM Studio (Vulkan) or Ollama (ROCm). Backend: ROCm/Vulkan. Biggest class: ~32B at Q4 on the GPU.

Setup backends in plain terms

CUDA — NVIDIA's GPU compute layer. Install a recent NVIDIA driver and Ollama/LM Studio detects the GPU automatically. The smoothest path.
ROCm / Vulkan — the AMD path. Ollama uses ROCm for Radeon RX / PRO discrete cards on Windows and Linux; LM Studio adds a Vulkan backend for broader AMD coverage, including integrated graphics.
Metal — Apple's GPU framework. On Apple Silicon it's used automatically with unified memory; nothing to install beyond the runtime.
CPU — the universal fallback. No GPU needed, but expect slow generation on anything above a 7–8B model.
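The backend choice described in the list above reduces to a small mapping. The sketch below is hypothetical (the `Gpu` enum and `pick_backend` name are not from the tool) and encodes only what the text states: CUDA for NVIDIA, ROCm under Ollama or Vulkan under LM Studio for discrete AMD cards, Metal for Apple Silicon, and CPU otherwise.

```python
from enum import Enum


class Gpu(Enum):
    NVIDIA = "nvidia"
    AMD = "amd"                      # discrete Radeon RX / PRO card
    APPLE_SILICON = "apple_silicon"
    NONE = "none"                    # no GPU, or integrated graphics only


def pick_backend(gpu: Gpu, runtime: str = "Ollama") -> str:
    """Return the backend named in the text for a GPU/runtime pair."""
    if gpu is Gpu.NVIDIA:
        return "CUDA"
    if gpu is Gpu.AMD:
        # Per the text: Ollama goes through ROCm, LM Studio adds Vulkan.
        return "Vulkan" if runtime == "LM Studio" else "ROCm"
    if gpu is Gpu.APPLE_SILICON:
        return "Metal"
    # Simplification: integrated graphics are treated as CPU here, although
    # the text notes LM Studio's Vulkan backend can cover some of them.
    return "CPU"


if __name__ == "__main__":
    print(pick_backend(Gpu.AMD, runtime="LM Studio"))
```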

Frequently asked questions

What does this checker actually tell me?

It answers one question first — yes or no, can your machine run local AI at all — and then tells you which runtime to install (Ollama or LM Studio), the biggest model class that realistically fits your memory, and the GPU backend you will be using (CUDA for NVIDIA, ROCm or Vulkan for AMD, or Metal for Apple Silicon). It is a feasibility check, not exact VRAM math. For the precise gigabyte budget of a specific model and quantization, use the VRAM Calculator.

Why does it ask for both RAM and VRAM?

On a machine with a discrete NVIDIA or AMD GPU, model speed is gated by VRAM — that is the memory the model has to fit into to run fast. On Apple Silicon there is no separate VRAM: the GPU shares one unified memory pool with the CPU, so your RAM number is what matters. On a machine with no GPU or only integrated graphics, the model runs on the CPU out of system RAM. So the checker uses whichever number actually limits you.
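A minimal sketch of that selection logic, assuming hypothetical GPU type labels ("nvidia", "amd", "apple_silicon", "integrated", "none") rather than the tool's real input values:

```python
from typing import Optional


def limiting_memory_gb(gpu_type: str, ram_gb: float,
                       vram_gb: Optional[float] = None) -> float:
    """Return the memory figure (GB) that bounds the model size."""
    if gpu_type in ("nvidia", "amd"):
        # Discrete card: the model has to fit in VRAM to run fast.
        if vram_gb is None:
            raise ValueError("vram_gb is required for a discrete GPU")
        return vram_gb
    if gpu_type in ("apple_silicon", "integrated", "none"):
        # Unified memory or CPU-only: system RAM is the budget.
        return ram_gb
    raise ValueError(f"unknown gpu_type: {gpu_type!r}")


if __name__ == "__main__":
    print(limiting_memory_gb("nvidia", ram_gb=16, vram_gb=12))  # VRAM bounds it
    print(limiting_memory_gb("apple_silicon", ram_gb=8))        # RAM bounds it
```

The four worked examples map onto this directly: 12 GB for the RTX 3060, 8 GB for the M2, 32 GB for the GPU-less Linux box, and 24 GB for the RX 7900 XTX.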

Is "yes" the same as "fast"?

No. A yes means the runtime will install and a model in that class will load and produce tokens. CPU-only and low-VRAM setups can be genuinely slow — single-digit tokens per second on bigger models. The checker flags when you are in usable-but-slow territory versus comfortable territory so you know what to expect before you download 5GB of model weights.

Ollama or LM Studio — which one does it recommend, and why?

Both run the same open-weight models locally and both work on Windows, macOS and Linux. The checker leans toward Ollama for a command-line / server-style setup (great for scripting, APIs and Linux), and toward LM Studio when you want a polished graphical app to browse, download and chat with models — which is the friendlier first step on Windows and Mac. Either is a valid answer; the recommendation is about fit, not capability.
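One way to picture that leaning is as a simple preference rule. This is a sketch, not the checker's actual logic: the `prefers_cli` flag is a hypothetical input standing in for "I want a command-line or server-style setup", which the tool itself may infer differently.

```python
def recommend_runtime(os_name: str, prefers_cli: bool = False) -> str:
    """Suggest a runtime based on OS and workflow preference.

    Both runtimes work on all three systems; this only encodes the
    'fit, not capability' leaning described in the text.
    """
    os_key = os_name.strip().lower()
    if os_key not in {"windows", "macos", "linux"}:
        raise ValueError(f"unsupported OS: {os_name!r}")
    if prefers_cli or os_key == "linux":
        return "Ollama"
    return "LM Studio"


if __name__ == "__main__":
    print(recommend_runtime("Windows"))
    print(recommend_runtime("macOS", prefers_cli=True))
```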

My GPU is AMD — can I still run local AI?

Yes. Ollama supports AMD Radeon RX / PRO discrete GPUs through ROCm on both Windows and Linux, and LM Studio adds a Vulkan backend that broadens AMD support further, including on Windows. The checker treats AMD VRAM roughly the same as NVIDIA VRAM for sizing, and notes the ROCm / Vulkan setup step. Integrated Radeon graphics on Ryzen APUs fall back to the CPU/Vulkan path and are sized like a no-GPU machine, because they draw on system RAM rather than dedicated VRAM.

Is this tool free?

Yes — free, no signup, no rate limits. The logic is deterministic, so the same inputs always produce the same verdict. Nothing you enter leaves your browser; the entire check runs client-side with no network calls.

Got a green light? Here's the install

Once you know you can run local AI, the next step is a clean install. Our complete Ollama guide walks through installing the runtime, pulling your first model, and running it — on Windows, macOS or Linux — in about ten minutes.

Read the complete Ollama guide →

Related tools & resources

→ VRAM Calculator — exact VRAM needs for any model + quantization
→ AI Model Finder — match your hardware to the right model
→ Can I run AI on Ubuntu? — Linux feasibility, step by step
→ Complete Ollama guide — install + first model
→ Ollama system requirements — exact RAM/GPU/CPU specs
→ Best local AI models for 8GB RAM — if you're memory-limited

Written by the Local AI Master Team

We build Local AI Master around practical, testable local AI workflows: model selection, hardware planning, RAG systems, agents, and MLOps. The goal is to turn scattered tutorials into a structured learning path you can follow on your own hardware.
