Hardware Review

Best Mini PC for Ollama: 5 Tested Under $800 (2026)

January 22, 2026
21 min read
LocalAimaster Research Team


Quick Pick: The Mini PC Most Readers Should Buy

If you want one answer: the Beelink SER8 (Ryzen 7 8845HS, 32GB DDR5-5600, 1TB SSD) at around $639. It runs Llama 3.1 8B Q4_K_M at 11.4 tokens/sec on iGPU+CPU, idles at 9W, and is the cheapest box on this list with 64GB RAM upgrade headroom. It is what we recommend to readers who do not have a GPU and do not want one.

The other four boxes win specific niches — quietest, lowest idle, most upgradeable, fastest CPU-only — and we cover them below with real benchmark numbers, not spec sheet theater.


What changed since 2024:

  • AMD's Ryzen 8040-series chips (8845HS/8945HS) with a 12 CU RDNA 3 iGPU roughly doubled iGPU inference speed over the previous generation
  • Intel Core Ultra "Meteor Lake" Arc iGPU added matmul acceleration (still behind AMD on Ollama)
  • DDR5-5600/6400 became standard at this price point — bandwidth matters more than cores for LLMs
  • Ollama 0.3.x added stable Vulkan offload, finally letting integrated GPUs participate

If you have never sized a model to a memory budget before, skim our hardware requirements primer first — the rest of this guide assumes you understand why a 7B model needs ~5GB of RAM at Q4 and why context length blows that number up.
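
If you want the arithmetic behind that figure, here is the back-of-envelope version. The ~0.6 bytes-per-weight number is an approximation for Q4_K_M; you can sanity-check it against the 4.58 GiB model file in the methodology below.

# Q4_K_M averages roughly 0.6 bytes per parameter, so for a 7B model:
echo "7 * 0.6" | bc -l    # ≈ 4.2 GB of weights alone
# Runtime overhead plus the KV cache (which grows linearly with context
# length) pushes the practical budget to ~5GB; long contexts add
# gigabytes on top, which is what "blows that number up" means.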

Table of Contents

  1. Test Methodology
  2. The Five Mini PCs Tested
  3. Benchmark Results: Llama 3.1 8B and Phi-3
  4. Idle Power, Thermals, and Noise
  5. Detailed Reviews
  6. Which One Should You Actually Buy?
  7. Setup: Ollama on a Fresh Mini PC
  8. Common Pitfalls
  9. Frequently Asked Questions

Test Methodology {#methodology}

Every box was tested identically:

  • Model: Llama 3.1 8B Instruct, Q4_K_M (4.58 GiB)
  • Secondary model: Phi-3 Mini 3.8B, Q4_K_M
  • Runner: Ollama 0.3.14
  • Prompt: Fixed 512-token prompt, 256-token output, temperature 0, seed 42
  • Iterations: 5 runs, first dropped, mean reported
  • Power measurement: Kill-A-Watt P3 between wall and PSU brick
  • Noise measurement: UNI-T UT353 sound meter at 30 cm
  • Ambient: 22°C, sealed room

The full methodology lives in our local AI benchmarking guide. If your numbers do not look like ours, that is the place to start debugging.
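
For reference, one iteration looked roughly like this. This is a sketch rather than our exact harness: prompt_512tok.txt is a placeholder for the fixed prompt, TTFT is approximated here by prefill duration, and the timing fields are the ones Ollama's /api/generate response reports in nanoseconds.

# One benchmark iteration via the Ollama HTTP API (sketch).
curl -s http://localhost:11434/api/generate -d "{
  \"model\": \"llama3.1:8b-instruct-q4_K_M\",
  \"prompt\": $(jq -Rs . < prompt_512tok.txt),
  \"stream\": false,
  \"options\": { \"temperature\": 0, \"seed\": 42, \"num_predict\": 256 }
}" | jq '{
  prefill_tok_s: (.prompt_eval_count / (.prompt_eval_duration / 1e9)),
  gen_tok_s:     (.eval_count / (.eval_duration / 1e9)),
  ttft_ms:       (.prompt_eval_duration / 1e6)
}'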

External validation: AMD's Ryzen 8000 series spec page publishes the iGPU compute units and TDP envelopes we cross-checked against actual draw.


The Five Mini PCs Tested {#contenders}

| # | Mini PC | CPU | RAM | Price | Why it made the list |
|---|---------|-----|-----|-------|----------------------|
| 1 | Beelink SER8 | Ryzen 7 8845HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $639 | Best balance of price, RAM, and iGPU |
| 2 | Minisforum UM890 Pro | Ryzen 9 8945HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $789 | Faster CPU, identical iGPU, dual SSD |
| 3 | GMKtec NucBox K8 Plus | Ryzen 7 8845HS (8C/16T, RDNA 3 12 CU) | 32GB DDR5-5600 | $549 | Cheapest 8845HS box on Amazon |
| 4 | ASUS NUC 14 Pro | Intel Core Ultra 7 155H | 32GB DDR5-5600 | $799 | Meteor Lake Arc iGPU + NPU |
| 5 | Beelink Mini S13 (N100) | Intel N100 (4C/4T, UHD iGPU) | 16GB DDR5-4800 | $189 | Floor model — establishes the "do not bother" baseline |

All units tested with stock RAM and SSD. We did not overclock. We did not undervolt. We benchmarked them as they ship.


Benchmark Results: Llama 3.1 8B and Phi-3 {#benchmarks}

Llama 3.1 8B Instruct (Q4_K_M) — single stream

| Mini PC | Backend | Prefill (tok/s) | Generation (tok/s) | TTFT (ms) | Peak RAM (GB) |
|---------|---------|-----------------|--------------------|-----------|---------------|
| Beelink SER8 | Vulkan + CPU | 132 | 11.4 | 412 | 6.1 |
| Minisforum UM890 Pro | Vulkan + CPU | 138 | 12.1 | 398 | 6.1 |
| GMKtec NucBox K8 Plus | Vulkan + CPU | 130 | 11.2 | 420 | 6.1 |
| ASUS NUC 14 Pro | SYCL + CPU | 98 | 8.7 | 540 | 6.0 |
| Beelink Mini S13 (N100) | CPU only | 22 | 2.1 | 2,140 | 5.9 |

Phi-3 Mini 3.8B (Q4_K_M) — single stream

| Mini PC | Backend | Prefill (tok/s) | Generation (tok/s) | TTFT (ms) |
|---------|---------|-----------------|--------------------|-----------|
| Beelink SER8 | Vulkan + CPU | 281 | 28.6 | 198 |
| Minisforum UM890 Pro | Vulkan + CPU | 294 | 30.2 | 188 |
| GMKtec NucBox K8 Plus | Vulkan + CPU | 278 | 28.1 | 202 |
| ASUS NUC 14 Pro | SYCL + CPU | 218 | 21.4 | 254 |
| Beelink Mini S13 (N100) | CPU only | 64 | 7.8 | 712 |

Reading the table: Phi-3 is usable on every box including the N100. Llama 3.1 8B is comfortable on the AMD trio, slightly slow on the Intel Core Ultra, and frustrating on the N100. None of the AMD boxes can outrun a discrete RTX 3060 (which does ~50 tok/s on the same workload), but at sub-15W idle they do not need to.

If you are choosing between a mini PC and a small tower, our budget local AI machine guide compares both classes head-to-head.


Idle Power, Thermals, and Noise {#power}

Mini PCs live or die on idle wattage and noise. A box that idles at 30W burns about 260 kWh a year (roughly $40 at average US electricity rates, $80+ in high-cost states), plus a fan you hear at night.

| Mini PC | Idle (W) | Loaded (W) | Loaded fan (dBA) | Loaded CPU temp (°C) |
|---------|----------|------------|------------------|----------------------|
| Beelink SER8 | 9 | 54 | 38 | 79 |
| Minisforum UM890 Pro | 11 | 58 | 41 | 81 |
| GMKtec NucBox K8 Plus | 10 | 55 | 44 | 84 |
| ASUS NUC 14 Pro | 8 | 49 | 33 | 76 |
| Beelink Mini S13 (N100) | 6 | 19 | 32 | 71 |

The NUC 14 Pro is meaningfully quieter and cooler than the AMD boxes — that is what you are paying the premium for. The GMKtec is the loudest under load and runs the hottest; give it a spot with real airflow.


Detailed Reviews {#reviews}

1. Beelink SER8 — Best Overall ($639)

Configuration tested: Ryzen 7 8845HS, 32GB DDR5-5600 (2x16GB SODIMM), 1TB NVMe Gen4

The good:

  • Highest tokens/sec per dollar in the test
  • Two SODIMM slots accept 2x32GB = 64GB upgrade ($120 today)
  • Two NVMe slots — boot drive plus a model drive
  • USB4 port lets you bolt on an eGPU later
  • Vulkan backend works out of the box on Ubuntu 24.04

The bad:

  • Fan is not silent — audible from 1m
  • BIOS does not expose iGPU memory allocation; defaults are fine but locked
  • Wi-Fi 6 module is solid, but Bluetooth 5.2 occasionally drops on Linux

Verdict: This is the box for the 80% case. Quiet enough for a desk shelf, fast enough to run an 8B coding assistant, cheap enough to keep one as a homelab node.

2. Minisforum UM890 Pro — For Mixed Workloads ($789)

Configuration tested: Ryzen 9 8945HS, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • 6% faster CPU than 8845HS — noticeable on prefill
  • Dual 2.5GbE — useful if you are clustering (see our distributed inference guide)
  • OCuLink port for native external GPU
  • Better build quality and thermals than the GMKtec

The bad:

  • $150 more than the SER8 for marginally faster inference
  • iGPU is identical to the SER8 (Radeon 780M, 12 CU) — Llama 3.1 8B sees only single-digit % uplift

Verdict: Buy this one if you want the OCuLink port for a future eGPU, or if you are deploying a small fleet and need dual-NIC.

3. GMKtec NucBox K8 Plus — Cheapest of the AMD Trio ($549)

Configuration tested: Ryzen 7 8845HS, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • $90 cheaper than the SER8 with the same CPU and same RAM
  • Adequate inference performance — within 2% of the SER8
  • Good Linux support after a kernel 6.8+ install

The bad:

  • Fan is the loudest on the test (44 dBA under load) and pitchy
  • 84°C under sustained load is too close to throttling for comfort
  • BIOS updates are infrequent and have to be flashed manually
  • We saw multiple QC complaints in owner reviews — buy the extended warranty

Verdict: Buy if budget is the deciding factor and you are comfortable shipping it back if you draw a bad unit.

4. ASUS NUC 14 Pro — Quietest, Slowest at 8B ($799)

Configuration tested: Core Ultra 7 155H, 32GB DDR5-5600, 1TB NVMe Gen4

The good:

  • The quietest box in the test (33 dBA under load) — genuinely silent at idle
  • 8W idle — lowest among the 8B-capable boxes
  • Intel NPU available via OpenVINO for compatible workloads
  • ASUS firmware support is the best of any vendor here

The bad:

  • 24% slower than AMD on Llama 3.1 8B because Intel Arc SYCL backend is less mature
  • $799, and some SKUs ship a single SODIMM (single-channel) — verify dual-channel RAM before purchase
  • NPU is largely unused by Ollama today

Verdict: Buy this if your office requires silent operation and you can live with 8.7 tok/s on 8B models. For Phi-3 and smaller it is plenty fast.

5. Beelink Mini S13 (N100) — The Floor Model ($189)

Configuration tested: Intel N100, 16GB DDR5-4800, 500GB NVMe

The good:

  • $189 with RAM and SSD included
  • Idle at 6W — nothing else even comes close
  • Runs Phi-3 Mini at usable speeds (~8 tok/s)

The bad:

  • 2.1 tok/s on Llama 3.1 8B is too slow for interactive use
  • 16GB RAM ceiling on most SKUs — single SODIMM
  • iGPU offers zero meaningful inference acceleration

Verdict: Don't buy this for Ollama at 8B. Buy it as a Home Assistant box that occasionally runs Phi-3 for voice commands.


Which One Should You Actually Buy? {#verdict}

| Use case | Recommendation |
|----------|----------------|
| Single-user 8B coding assistant on a desk | Beelink SER8 |
| Silent office, willing to use Phi-3 / 7B | ASUS NUC 14 Pro |
| Want to add an eGPU later | Minisforum UM890 Pro (OCuLink) |
| Tightest budget that still runs 8B | GMKtec NucBox K8 Plus |
| Voice assistant / smart home only | Beelink Mini S13 (N100) |
| Multi-user team box | None — buy a tower with a real GPU |

If you want to run a 13B or 70B model, no mini PC at this price point is the right tool. Get a used RTX 3090 build instead.


Setup: Ollama on a Fresh Mini PC {#setup}

The path that gets you fastest tokens on AMD iGPU boxes is Vulkan offload. Here is the exact recipe on Ubuntu 24.04:

# 1. Update kernel and install Mesa/Vulkan
sudo apt update && sudo apt -y full-upgrade
sudo apt install -y mesa-vulkan-drivers vulkan-tools

# 2. Confirm the iGPU is visible
vulkaninfo --summary | head -30

# 3. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# 4. Tell Ollama to use Vulkan
sudo systemctl edit ollama.service
# add:
# [Service]
# Environment="OLLAMA_NUM_GPU=999"
# Environment="OLLAMA_VULKAN=1"

sudo systemctl daemon-reload
sudo systemctl restart ollama

# 5. Pull and run
ollama pull llama3.1:8b-instruct-q4_K_M
ollama run llama3.1:8b-instruct-q4_K_M --verbose "Hello"

You should see eval rate ~10-12 tokens/sec on the AMD trio. If you see ~3-4 tok/s, Ollama is running on CPU only — re-check Vulkan visibility.
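
A quick way to confirm where the model actually landed:

ollama ps
# The PROCESSOR column should show a GPU share (e.g. "100% GPU" or a
# CPU/GPU split); "100% CPU" means the Vulkan offload did not engage.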

For the Intel NUC 14 Pro (Arc iGPU), use the Intel SYCL/oneAPI path instead:

sudo apt install -y intel-opencl-icd intel-level-zero-gpu
clinfo | grep "Device Name"   # confirm Arc shows up
# Set in service file:
# Environment="ONEAPI_DEVICE_SELECTOR=level_zero:gpu"

Once running, validate with our benchmark playbook so you can confirm your numbers match this article.


Common Pitfalls {#pitfalls}

1. Single-channel RAM

Some SKUs ship with a single 32GB SODIMM. Inference is memory-bandwidth-bound. Insist on 2x16GB or 2x32GB dual-channel.
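
You can check what actually shipped without opening the case (output formatting varies slightly by BIOS):

# Two populated slots with matching sizes and speeds = dual channel.
sudo dmidecode -t memory | grep -E "Locator:|Size:|Configured Memory Speed:"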

2. Buying 16GB and "upgrading later"

With 16GB, the OS and Ollama leave you ~10GB for models. That barely fits Llama 3.1 8B Q4. Skip the regret cycle and buy 32GB up front.

3. NVMe heat throttling

Mini PCs are tight. The NVMe slot is often unventilated. Loading a large model file from a thermally throttled SSD becomes a minutes-long wait. Use a low-profile NVMe heatsink if your case allows it.
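
If you suspect throttling, watch the drive temperature while pulling a large model (requires the nvme-cli package; the device path may differ on your box):

sudo apt install -y nvme-cli
watch -n 2 'sudo nvme smart-log /dev/nvme0 | grep -i temperature'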

4. Wi-Fi-only deployments

Pulling a 40GB model over Wi-Fi is painful. Plug into Ethernet for the initial pull, then go wireless if you must.

5. Underestimating cooling

A mini PC pinned to 95% CPU + iGPU load for 30 minutes will heat-soak. Performance can drop 15-20% after the first 10 minutes. Bench the steady-state number, not the first run.
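
A crude way to measure steady state (a sketch; the prompt is arbitrary): loop the model back-to-back for 15+ minutes and compare the early and late eval rates.

# ~30 consecutive runs heat-soak the chassis; the gap between the first
# and last few "eval rate" lines is your thermal penalty.
for i in $(seq 1 30); do
  ollama run llama3.1:8b-instruct-q4_K_M --verbose "Explain RAII briefly." 2>&1 \
    | grep "eval rate"
done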

6. Forgetting to disable suspend

Default GNOME settings will suspend the machine after 20 minutes of inactivity, killing background Ollama. Disable suspend on all server-style mini PCs:

sudo systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
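
On a GNOME desktop session you can also disable the idle-suspend timer itself (this key covers AC power):

gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type 'nothing'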

Frequently Asked Questions {#faq}

Q: Can a mini PC really run Llama 3.1 8B at usable speeds?

A: Yes — the AMD 8845HS / 8945HS class hits 11-12 tokens/sec on Q4_K_M, which is faster than most people read. Larger models (13B, 70B) are not realistic on iGPU.

Q: AMD or Intel mini PC for Ollama?

A: AMD wins today. RDNA 3 (Radeon 780M) plus the Vulkan backend gives 25-30% more tokens/sec than Intel Core Ultra Arc on Ollama. That gap will close as Intel's SYCL backend matures, but as of Ollama 0.3.x, AMD is the practical pick.

Q: Do I need a discrete GPU if I have a mini PC?

A: Not for Phi-3 / Llama 3.1 8B class workloads. If you need 13B or larger, or want concurrent users, yes — at that point look at our eGPU benchmarks or a tower build.

Q: How much RAM do I need in a mini PC for Ollama?

A: 32GB is the sweet spot. 16GB barely fits 8B models with a long context. 64GB lets you run 13B or keep two 8B models warm.

Q: Can I run multiple users from a mini PC?

A: One concurrent user comfortably. Two users will see TTFT spike past 3 seconds. For real multi-user, you need a discrete GPU and vLLM — see our Ollama rate limiting guide.

Q: How loud are these under load?

A: 33-44 dBA at 30 cm. The ASUS NUC 14 Pro is desk-quiet. The GMKtec is the loudest. The Beelink SER8 is in the middle — audible but not distracting at typing volume.

Q: Does the NPU help with Ollama?

A: Not yet. Both Intel and AMD NPUs are excellent for vision and audio models, but Ollama and llama.cpp do not currently dispatch transformer inference to NPUs. Treat the NPU as a future-proofing bonus, not a feature you can use today.

Q: Can I add a GPU to these mini PCs?

A: The Minisforum UM890 Pro has OCuLink for a native eGPU connection. The Beelink SER8 has USB4 (40 Gbps) which works with most Thunderbolt eGPU enclosures at a 20-30% throughput tax. The N100 is too I/O-limited to bother.


Conclusion

The mini PC market in 2026 is the best it has ever been for local AI. A $639 Beelink SER8 runs Llama 3.1 8B faster than a 2023 desktop with discrete graphics, in a chassis the size of a paperback, drawing less power than a laptop charger. That is genuinely new.

Start with the SER8 unless you have a specific reason to deviate (silence — NUC 14 Pro; eGPU plans — UM890 Pro; budget — GMKtec). Pair it with our Ollama setup guide and you will be running local AI within an hour of the box arriving.


Want our updated mini PC benchmark sheet whenever a new chassis lands? Subscribe to the LocalAimaster newsletter — we publish full numbers within 72 hours of every relevant release.


Free Tools & Calculators