Hardware Review

Best Mini PC for Ollama: 5 Tested Under $800 (2026)

January 22, 2026
21 min read
LocalAimaster Research Team


Quick Pick: The Mini PC Most Readers Should Buy

If you want one answer: the Beelink SER8 (Ryzen 7 8845HS, 32GB DDR5-5600, 1TB SSD) at around $639. It runs Llama 3.1 8B Q4_K_M at 11.4 tokens/sec on iGPU+CPU, idles at 9W, and is the cheapest box on this list with 64GB RAM upgrade headroom. It is what we recommend to readers who do not have a GPU and do not want one.

The other four boxes win specific niches — quietest, lowest idle, most upgradeable, fastest CPU-only — and we cover them below with real benchmark numbers, not spec sheet theater.


What changed since 2024:

  • AMD's Ryzen 8040-series chips (8845HS/8945HS) with a 12 CU RDNA 3 iGPU roughly doubled iGPU inference speed over the previous generation
  • Intel Core Ultra "Meteor Lake" Arc iGPU added matmul acceleration (still behind AMD on Ollama)
  • DDR5-5600/6400 became standard at this price point — bandwidth matters more than cores for LLMs
  • Ollama 0.3.x added stable Vulkan offload, finally letting integrated GPUs participate

If you have never sized a model to a memory budget before, skim our hardware requirements primer first — the rest of this guide assumes you understand why a 7B model needs ~5GB of RAM at Q4 and why context length blows that number up.
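
If you want the arithmetic behind that figure, here is the back-of-envelope version. The ~0.6 bytes-per-weight number is an approximation for Q4_K_M; you can sanity-check it against the 4.58 GiB model file in the methodology below.

# Q4_K_M averages roughly 0.6 bytes per parameter, so for a 7B model:
echo "7 * 0.6" | bc -l    # ≈ 4.2 GB of weights alone
# Runtime overhead plus the KV cache (which grows linearly with context
# length) pushes the practical budget to ~5GB; long contexts add
# gigabytes on top, which is what "blows that number up" means.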

Table of Contents

  1. Test Methodology
  2. The Five Mini PCs Tested
  3. Benchmark Results: Llama 3.1 8B and Phi-3
  4. Idle Power, Thermals, and Noise
  5. Detailed Reviews
  6. Which One Should You Actually Buy?
  7. Setup: Ollama on a Fresh Mini PC
  8. Common Pitfalls
  9. Frequently Asked Questions

Test Methodology {#methodology}

Every box was tested identically:

  • Model: Llama 3.1 8B Instruct, Q4_K_M (4.58 GiB)
  • Secondary model: Phi-3 Mini 3.8B, Q4_K_M
  • Runner: Ollama 0.3.14
  • Prompt: Fixed 512-token prompt, 256-token output, temperature 0, seed 42
  • Iterations: 5 runs, first dropped, mean reported
  • Power measurement: Kill-A-Watt P3 between wall and PSU brick
  • Noise measurement: UNI-T UT353 sound meter at 30 cm
  • Ambient: 22°C, sealed room

The full methodology lives in our local AI benchmarking guide. If your numbers do not look like ours, that is the place to start debugging.
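
For reference, one iteration looked roughly like this. This is a sketch rather than our exact harness: prompt_512tok.txt is a placeholder for the fixed prompt, TTFT is approximated here by prefill duration, and the timing fields are the ones Ollama's /api/generate response reports in nanoseconds.

# One benchmark iteration via the Ollama HTTP API (sketch).
curl -s http://localhost:11434/api/generate -d "{
  \"model\": \"llama3.1:8b-instruct-q4_K_M\",
  \"prompt\": $(jq -Rs . < prompt_512tok.txt),
  \"stream\": false,
  \"options\": { \"temperature\": 0, \"seed\": 42, \"num_predict\": 256 }
}" | jq '{
  prefill_tok_s: (.prompt_eval_count / (.prompt_eval_duration / 1e9)),
  gen_tok_s:     (.eval_count / (.eval_duration / 1e9)),
  ttft_ms:       (.prompt_eval_duration / 1e6)
}'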

External validation: AMD's Ryzen 8000 series spec page publishes the iGPU compute units and TDP envelopes we cross-checked against actual draw.


The Five Mini PCs Tested {#contenders}

| # | Mini PC | CPU | RAM | Price | Why it made the list |
|---|---------|-----|-----|-------|----------------------|
| 1 | Beelink SER8 | Ryzen 7 8845HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $639 | Best balance of price, RAM, and iGPU |
| 2 | Minisforum UM890 Pro | Ryzen 9 8945HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $789 | Faster CPU, identical iGPU, dual SSD |
| 3 | GMKtec NucBox K8 Plus | Ryzen 7 8845HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $549 | Cheapest 8845HS box on Amazon |
| 4 | ASUS NUC 14 Pro | Intel Core Ultra 7 155H | 32GB DDR5-5600 | $799 | Meteor Lake Arc iGPU + NPU |
| 5 | Beelink Mini S13 (N100) | Intel N100 (4C/4T, UHD iGPU) | 16GB DDR5-4800 | $189 | Floor model — establishes the "do not bother" baseline |

All units tested with stock RAM and SSD. We did not overclock. We did not undervolt. We benchmarked them as they ship.


Benchmark Results: Llama 3.1 8B and Phi-3 {#benchmarks}

Llama 3.1 8B Instruct (Q4_K_M) — single stream

| Mini PC | Backend | Prefill (tok/s) | Generation (tok/s) | TTFT (ms) | Peak RAM (GB) |
|---------|---------|-----------------|--------------------|-----------|---------------|
| Beelink SER8 | Vulkan + CPU | 132 | 11.4 | 412 | 6.1 |
| Minisforum UM890 Pro | Vulkan + CPU | 138 | 12.1 | 398 | 6.1 |
| GMKtec NucBox K8 Plus | Vulkan + CPU | 130 | 11.2 | 420 | 6.1 |
| ASUS NUC 14 Pro | SYCL + CPU | 98 | 8.7 | 540 | 6.0 |
| Beelink Mini S13 (N100) | CPU only | 22 | 2.1 | 2,140 | 5.9 |

Phi-3 Mini 3.8B (Q4_K_M) — single stream

| Mini PC | Backend | Prefill (tok/s) | Generation (tok/s) | TTFT (ms) |
|---------|---------|-----------------|--------------------|-----------|
| Beelink SER8 | Vulkan + CPU | 281 | 28.6 | 198 |
| Minisforum UM890 Pro | Vulkan + CPU | 294 | 30.2 | 188 |
| GMKtec NucBox K8 Plus | Vulkan + CPU | 278 | 28.1 | 202 |
| ASUS NUC 14 Pro | SYCL + CPU | 218 | 21.4 | 254 |
| Beelink Mini S13 (N100) | CPU only | 64 | 7.8 | 712 |

Reading the table: Phi-3 is usable on every box including the N100. Llama 3.1 8B is comfortable on the AMD trio, slightly slow on the Intel Core Ultra, and frustrating on the N100. None of the AMD boxes can outrun a discrete RTX 3060 (which does ~50 tok/s on the same workload), but at sub-15W idle they do not need to.

If you are choosing between a mini PC and a small tower, our budget local AI machine guide compares both classes head-to-head.


Idle Power, Thermals, and Noise {#power}

Mini PCs live or die on idle wattage and noise. A box that idles at 30W burns about 260 kWh a year (roughly $40 at average US electricity rates, $80+ in high-cost states), plus a fan you hear at night.

| Mini PC | Idle (W) | Loaded (W) | Loaded fan (dBA) | Loaded CPU temp (°C) |
|---------|----------|------------|------------------|----------------------|
| Beelink SER8 | 9 | 54 | 38 | 79 |
| Minisforum UM890 Pro | 11 | 58 | 41 | 81 |
| GMKtec NucBox K8 Plus | 10 | 55 | 44 | 84 |
| ASUS NUC 14 Pro | 8 | 49 | 33 | 76 |
| Beelink Mini S13 (N100) | 6 | 19 | 32 | 71 |

The NUC 14 Pro is meaningfully quieter and cooler than the AMD boxes — that is what you are paying the premium for. The GMKtec is the loudest under load and runs the hottest; give it a spot with real airflow.


Detailed Reviews {#reviews}

1. Beelink SER8 — Best Overall ($639)

Configuration tested: Ryzen 7 8845HS, 32GB DDR5-5600 (2x16GB SODIMM), 1TB NVMe Gen4

The good:

  • Highest tokens/sec per dollar in the test
  • Two SODIMM slots accept 2x32GB = 64GB upgrade ($120 today)
  • Two NVMe slots — boot drive plus a model drive
  • USB4 port lets you bolt on an eGPU later
  • Vulkan backend works out of the box on Ubuntu 24.04

The bad:

  • Fan is not silent — audible from 1m
  • BIOS does not expose iGPU memory allocation; defaults are fine but locked
  • Wi-Fi 6 module is solid, but Bluetooth 5.2 occasionally drops on Linux

Verdict: This is the box for the 80% case. Quiet enough for a desk shelf, fast enough to run an 8B coding assistant, cheap enough to keep one as a homelab node.

2. Minisforum UM890 Pro — For Mixed Workloads ($789)

Configuration tested: Ryzen 9 8945HS, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • 6% faster CPU than 8845HS — noticeable on prefill
  • Dual 2.5GbE — useful if you are clustering (see our distributed inference guide)
  • OCuLink port for native external GPU
  • Better build quality and thermals than the GMKtec

The bad:

  • $150 more than the SER8 for marginally faster inference
  • iGPU is identical to the SER8 (Radeon 780M, 12 CU) — Llama 3.1 8B sees only single-digit % uplift

Verdict: Buy this one if you want the OCuLink port for a future eGPU, or if you are deploying a small fleet and need dual-NIC.

3. GMKtec NucBox K8 Plus — Cheapest of the AMD Trio ($549)

Configuration tested: Ryzen 7 8845HS, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • $90 cheaper than the SER8 with the same CPU and same RAM
  • Adequate inference performance — within 2% of the SER8
  • Good Linux support after a kernel 6.8+ install

The bad:

  • Fan is the loudest on the test (44 dBA under load) and pitchy
  • 84°C under sustained load is too close to throttling for comfort
  • BIOS updates are infrequent and have to be flashed manually
  • We saw multiple QC complaints in owner reviews — buy the extended warranty

Verdict: Buy if budget is the deciding factor and you are comfortable shipping it back if you draw a bad unit.

4. ASUS NUC 14 Pro — Quietest, Slowest at 8B ($799)

Configuration tested: Core Ultra 7 155H, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • The quietest box in the test (33 dBA under load) — genuinely silent at idle
  • 8W idle — lowest among the 8B-capable boxes
  • Intel NPU available via OpenVINO for compatible workloads
  • ASUS firmware support is the best of any vendor here

The bad:

  • 24% slower than AMD on Llama 3.1 8B because Intel Arc SYCL backend is less mature
  • $799, and some SKUs ship a single SODIMM (single-channel) — verify dual-channel RAM before purchase
  • NPU is largely unused by Ollama today

Verdict: Buy this if your office requires silent operation and you can live with 8.7 tok/s on 8B models. For Phi-3 and smaller it is plenty fast.

5. Beelink Mini S13 (N100) — The Floor Model ($189)

Configuration tested: Intel N100, 16GB DDR5-4800, 500GB NVMe

The good:

  • $189 with RAM and SSD included
  • Idle at 6W — nothing else even comes close
  • Runs Phi-3 Mini at usable speeds (~8 tok/s)

The bad:

  • 2.1 tok/s on Llama 3.1 8B is too slow for interactive use
  • 16GB RAM ceiling on most SKUs — single SODIMM
  • iGPU offers zero meaningful inference acceleration

Verdict: Don't buy this for Ollama at 8B. Buy it as a Home Assistant box that occasionally runs Phi-3 for voice commands.


Which One Should You Actually Buy? {#verdict}

| Use case | Recommendation |
|----------|----------------|
| Single-user 8B coding assistant on a desk | Beelink SER8 |
| Silent office, willing to use Phi-3 / 7B | ASUS NUC 14 Pro |
| Want to add an eGPU later | Minisforum UM890 Pro (OCuLink) |
| Tightest budget that still runs 8B | GMKtec NucBox K8 Plus |
| Voice assistant / smart home only | Beelink Mini S13 (N100) |
| Multi-user team box | None — buy a tower with a real GPU |

If you want to run a 13B or 70B model, no mini PC at this price point is the right tool. Get a used RTX 3090 build instead.


Setup: Ollama on a Fresh Mini PC {#setup}

The path that gets you fastest tokens on AMD iGPU boxes is Vulkan offload. Here is the exact recipe on Ubuntu 24.04:

# 1. Update kernel and install Mesa/Vulkan
sudo apt update && sudo apt -y full-upgrade
sudo apt install -y mesa-vulkan-drivers vulkan-tools

# 2. Confirm the iGPU is visible
vulkaninfo --summary | head -30

# 3. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 4. Tell Ollama to use Vulkan
sudo systemctl edit ollama.service
# add:
# [Service]
# Environment="OLLAMA_NUM_GPU=999"
# Environment="OLLAMA_VULKAN=1"

sudo systemctl daemon-reload
sudo systemctl restart ollama

# 5. Pull and run
ollama pull llama3.1:8b-instruct-q4_K_M
ollama run llama3.1:8b-instruct-q4_K_M --verbose "Hello"

You should see eval rate ~10-12 tokens/sec on the AMD trio. If you see ~3-4 tok/s, Ollama is running on CPU only — re-check Vulkan visibility.
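
A quick way to confirm where the model actually landed:

ollama ps
# The PROCESSOR column should show a GPU share (e.g. "100% GPU" or a
# CPU/GPU split); "100% CPU" means the Vulkan offload did not engage.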

For the Intel NUC 14 Pro (Arc iGPU), use the Intel SYCL/oneAPI path instead:

sudo apt install -y intel-opencl-icd intel-level-zero-gpu
clinfo | grep "Device Name"   # confirm Arc shows up
# Set in service file:
# Environment="ONEAPI_DEVICE_SELECTOR=level_zero:gpu"

Once running, validate with our benchmark playbook so you can confirm your numbers match this article.


Common Pitfalls {#pitfalls}

1. Single-channel RAM

Some SKUs ship with a single 32GB SODIMM. Inference is memory-bandwidth-bound. Insist on 2x16GB or 2x32GB dual-channel.
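
You can check what actually shipped without opening the case (output formatting varies slightly by BIOS):

# Two populated slots with matching sizes and speeds = dual channel.
sudo dmidecode -t memory | grep -E "Locator:|Size:|Configured Memory Speed:"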

2. Buying 16GB and "upgrading later"

With 16GB, the OS and Ollama leave you ~10GB for models. That barely fits Llama 3.1 8B Q4. Skip the regret cycle and buy 32GB up front.

3. NVMe heat throttling

Mini PCs are tight. The NVMe slot is often unventilated. Loading a large model file from a thermally throttled SSD becomes a minutes-long wait. Use a low-profile NVMe heatsink if your case allows it.
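
If you suspect throttling, watch the drive temperature while pulling a large model (requires the nvme-cli package; the device path may differ on your box):

sudo apt install -y nvme-cli
watch -n 2 'sudo nvme smart-log /dev/nvme0 | grep -i temperature'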

4. Wi-Fi-only deployments

Pulling a 40GB model over Wi-Fi is painful. Plug into Ethernet for the initial pull, then go wireless if you must.

5. Underestimating cooling

A mini PC pinned to 95% CPU + iGPU load for 30 minutes will heat-soak. Performance can drop 15-20% after the first 10 minutes. Bench the steady-state number, not the first run.
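
A crude way to measure steady state (a sketch; the prompt is arbitrary): loop the model back-to-back for 15+ minutes and compare the early and late eval rates.

# ~30 consecutive runs heat-soak the chassis; the gap between the first
# and last few "eval rate" lines is your thermal penalty.
for i in $(seq 1 30); do
  ollama run llama3.1:8b-instruct-q4_K_M --verbose "Explain RAII briefly." 2>&1 \
    | grep "eval rate"
done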

6. Forgetting to disable suspend

Default GNOME settings will suspend the machine after 20 minutes of inactivity, killing background Ollama. Disable suspend on all server-style mini PCs:

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
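
On a GNOME desktop session you can also disable the idle-suspend timer itself (this key covers AC power):

gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type 'nothing'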

Frequently Asked Questions {#faq}

Q: Can a mini PC really run Llama 3.1 8B at usable speeds?

A: Yes — the AMD 8845HS / 8945HS class hits 11-12 tokens/sec on Q4_K_M, which is faster than most people read. Larger models (13B, 70B) are not realistic on iGPU.

Q: AMD or Intel mini PC for Ollama?

A: AMD wins today. RDNA 3 (Radeon 780M) plus the Vulkan backend gives 25-30% more tokens/sec than Intel Core Ultra Arc on Ollama. That gap will close as Intel's SYCL backend matures, but as of Ollama 0.3.x, AMD is the practical pick.

Q: Do I need a discrete GPU if I have a mini PC?

A: Not for Phi-3 / Llama 3.1 8B class workloads. If you need 13B or larger, or want concurrent users, yes — at that point look at our eGPU benchmarks or a tower build.

Q: How much RAM do I need in a mini PC for Ollama?

A: 32GB is the sweet spot. 16GB barely fits 8B models with a long context. 64GB lets you run 13B or keep two 8B models warm.

Q: Can I run multiple users from a mini PC?

A: One concurrent user comfortably. Two users will see TTFT spike past 3 seconds. For real multi-user, you need a discrete GPU and vLLM — see our Ollama rate limiting guide.

Q: How loud are these under load?

A: 33-44 dBA at 30 cm. The ASUS NUC 14 Pro is desk-quiet. The GMKtec is the loudest. The Beelink SER8 is in the middle — audible but not distracting at typing volume.

Q: Does the NPU help with Ollama?

A: Not yet. Both Intel and AMD NPUs are excellent for vision and audio models, but Ollama and llama.cpp do not currently dispatch transformer inference to NPUs. Treat the NPU as a future-proofing bonus, not a feature you can use today.

Q: Can I add a GPU to these mini PCs?

A: The Minisforum UM890 Pro has OCuLink for a native eGPU connection. The Beelink SER8 has USB4 (40 Gbps) which works with most Thunderbolt eGPU enclosures at a 20-30% throughput tax. The N100 is too I/O-limited to bother.


Conclusion

The mini PC market in 2026 is the best it has ever been for local AI. A $639 Beelink SER8 runs Llama 3.1 8B faster than a 2023 desktop with discrete graphics, in a chassis the size of a paperback, drawing less power than a laptop charger. That is genuinely new.

Start with the SER8 unless you have a specific reason to deviate (silence — NUC 14 Pro; eGPU plans — UM890 Pro; budget — GMKtec). Pair it with our Ollama setup guide and you will be running local AI within an hour of the box arriving.


Want our updated mini PC benchmark sheet whenever a new chassis lands? Subscribe to the LocalAimaster newsletter — we publish full numbers within 72 hours of every relevant release.


Free Tools & Calculators