
Ubuntu AI Workstation: Complete Setup Guide

April 10, 2026
24 min read
Local AI Master Research Team


I have configured Ubuntu AI workstations on everything from a spare office PC with a GTX 1080 to a dual-RTX 4090 rack server. The process is the same every time: install Ubuntu, get NVIDIA drivers working without breaking X11, set up Docker with GPU passthrough, and install the AI stack. It takes about 90 minutes if you know the right steps, and days if you stumble into the common pitfalls.

This guide is the 90-minute version. Every command was tested on Ubuntu 24.04 LTS with NVIDIA GPUs ranging from a GTX 1060 to an RTX 5090. No unnecessary detours, no desktop customization fluff. Just a working AI workstation.


What you will have after this guide:

  • Ubuntu 24.04 LTS with NVIDIA drivers and CUDA toolkit
  • Docker with NVIDIA Container Toolkit for GPU-accelerated containers
  • Ollama running as a system service with network access
  • Python environment (conda + venv) with PyTorch and Jupyter
  • System monitoring tools (nvitop, btop, nvidia-smi)
  • Auto-start services and security hardening
  • A clear understanding of why each component matters

If you already have Linux running and need model-specific configuration, check the Linux local AI setup guide which assumes an existing system. This guide starts from a fresh install.

Table of Contents

  1. Why Ubuntu for AI
  2. Fresh Install Optimization
  3. NVIDIA Driver Installation
  4. CUDA Toolkit Setup
  5. Docker and NVIDIA Container Toolkit
  6. Ollama Installation and Configuration
  7. Python Environment
  8. Jupyter Lab Setup
  9. Monitoring Tools
  10. Auto-Start Services
  11. Security Hardening
  12. Comparison: Ubuntu vs Windows vs Mac for AI

Why Ubuntu for AI {#why-ubuntu}

There are practical reasons Ubuntu dominates AI workstations, not just tradition:

NVIDIA CUDA support is first-class. NVIDIA develops and tests CUDA on Ubuntu before any other distribution. Driver packages land in Ubuntu repos within days of release. When something breaks (and GPU drivers do break), the fix appears for Ubuntu first. NVIDIA's official CUDA documentation uses Ubuntu for all examples.

Docker is native. No hypervisor layer like Docker Desktop on Windows or Mac. GPU passthrough works with a simple --gpus all flag. Container performance matches bare metal.

systemd manages everything. Ollama, Open WebUI, Jupyter, and monitoring cron jobs are all managed through a single, consistent init system with automatic restarts, logging, and dependency management.

Package ecosystem. apt + pip + conda cover every AI library. PyTorch, TensorFlow, JAX, vLLM, llama.cpp, Triton, DeepSpeed: all tested primarily on Ubuntu.

Resource efficiency. A headless Ubuntu Server install uses ~350MB RAM at idle. Ubuntu Desktop uses ~1.2GB. Windows 11 uses ~3-4GB. On a machine with 32GB RAM and a 24GB GPU, that 2-3GB difference means one more model layer in system memory during CPU offloading.

Which Ubuntu Version?

Use Ubuntu 24.04 LTS. The LTS (Long Term Support) release gets security updates until 2029 and is the target for NVIDIA driver testing. Do not use 24.10 or any interim release for a production workstation. Non-LTS releases go end-of-life in 9 months and driver compatibility is not guaranteed.


Fresh Install Optimization {#fresh-install}

Download and Install

Download Ubuntu 24.04 LTS from ubuntu.com. Choose "Ubuntu Desktop" if you want a GUI, or "Ubuntu Server" if this machine will be headless.

# Flash ISO to USB (from another Linux/Mac machine)
sudo dd if=ubuntu-24.04.2-desktop-amd64.iso of=/dev/sdX bs=4M status=progress oflag=sync

# Or use balenaEtcher on any OS

During installation:

  • Partitioning: Use "Erase disk and install Ubuntu" with LVM. LVM lets you resize partitions later without data loss.
  • Swap: The installer creates a swap file by default. We will adjust its size later.
  • SSH: If installing Server edition, enable OpenSSH during install.
  • Third-party drivers: Do NOT check "Install third-party drivers" during install. We will install NVIDIA drivers manually for better control.

Post-Install Essentials

# Update everything
sudo apt update && sudo apt upgrade -y

# Install essential build tools
sudo apt install -y \
  build-essential \
  git \
  curl \
  wget \
  htop \
  vim \
  tmux \
  net-tools \
  software-properties-common \
  apt-transport-https \
  ca-certificates \
  gnupg \
  lsb-release \
  unzip \
  jq

# Set timezone
sudo timedatectl set-timezone America/New_York  # adjust to yours

# Configure swap (important for large model loading)
# Check current swap
swapon --show

# Increase swap to 32GB (helps with model loading spikes)
sudo swapoff -a
sudo fallocate -l 32G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Reduce swappiness (prefer RAM over swap)
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

Why 32GB Swap?

When loading a large model (40GB+ for 70B Q4), the system temporarily needs memory for both the old and new model states. Without enough swap, the OOM killer terminates Ollama. With 32GB swap on top of your physical RAM, model loading succeeds even when memory is tight. Swap is only used during these brief loading spikes; actual inference runs entirely from RAM and VRAM.
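As a rough sanity check, the loading-peak arithmetic can be sketched in shell. All of the sizes below are illustrative example numbers, not measurements from any particular setup:

```shell
# Back-of-envelope check: during a model swap, the system-RAM share of the
# old model plus the incoming model must fit in RAM + swap, or the OOM
# killer fires. All numbers below are illustrative examples.
resident_gb=20      # system-RAM share of the currently loaded model (example)
incoming_gb=40      # full size of the model being loaded (example: 70B Q4)
ram_gb=32
swap_gb=32

peak_gb=$((resident_gb + incoming_gb))
budget_gb=$((ram_gb + swap_gb))
if [ "$budget_gb" -ge "$peak_gb" ]; then
  echo "OK: ${budget_gb}GB RAM+swap covers the ${peak_gb}GB loading peak"
else
  echo "RISK: ${peak_gb}GB peak exceeds ${budget_gb}GB RAM+swap"
fi
```

With these example numbers the 32GB swap turns a 60GB peak from an OOM kill into a brief slowdown.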


NVIDIA Driver Installation {#nvidia-drivers}

This is the single most error-prone step in any Linux AI setup. Do it wrong and you get a black screen on boot. Here is the reliable method.

Step 1: Blacklist Nouveau (Open-Source NVIDIA Driver)

Nouveau conflicts with the proprietary NVIDIA driver. Block it before installation.

# Create blacklist file
sudo bash -c 'cat > /etc/modprobe.d/blacklist-nouveau.conf << EOF
blacklist nouveau
options nouveau modeset=0
EOF'

# Regenerate initramfs
sudo update-initramfs -u

# Reboot
sudo reboot

Step 2: Install the NVIDIA Driver

# Add the graphics drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update

# Check which driver is recommended for your GPU
ubuntu-drivers devices

# You will see output like:
# nvidia-driver-560 - distro non-free recommended
# nvidia-driver-555 - distro non-free

# Install the recommended driver
sudo apt install -y nvidia-driver-560

# Reboot
sudo reboot

Step 3: Verify

# Check driver is loaded
nvidia-smi

# Expected output:
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 560.35.03    Driver Version: 560.35.03    CUDA Version: 12.6    |
# |-------------------------------+----------------------+----------------------+
# | GPU  Name        Persistence  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
# |===============================+======================+======================|
# |   0  NVIDIA GeForce RTX 4090  | 00000000:01:00.0 On  |                  N/A |
# +-------------------------------+----------------------+----------------------+

# Verify kernel module
lsmod | grep nvidia
# Should show nvidia, nvidia_modeset, nvidia_uvm, nvidia_drm

Common Failure: Black Screen After Reboot

If you get a black screen after installing NVIDIA drivers:

# Boot into recovery mode (hold Shift during boot → Advanced → Recovery)
# Or press Ctrl+Alt+F2 for a TTY console

# Remove broken driver
sudo apt purge nvidia-* -y
sudo apt autoremove -y

# Reinstall
sudo ubuntu-drivers autoinstall
sudo reboot

Prevention: Always use ubuntu-drivers autoinstall or the PPA method. Never download .run files from nvidia.com. The .run installer does not integrate with apt and causes conflicts on every kernel update.


CUDA Toolkit Setup {#cuda-toolkit}

The NVIDIA driver includes a CUDA runtime, but the full CUDA toolkit adds the compiler (nvcc), libraries, and headers needed for building GPU-accelerated applications from source.

# Download CUDA toolkit (network installer)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update

# Install CUDA toolkit (matches your driver's CUDA version)
sudo apt install -y cuda-toolkit-12-6

# Add to PATH
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# Verify
nvcc --version
# Should show: Cuda compilation tools, release 12.6

cuDNN Installation (For PyTorch/TensorFlow)

# Install cuDNN
sudo apt install -y libcudnn9-cuda-12 libcudnn9-dev-cuda-12

# Verify
dpkg -l | grep cudnn
# Should show libcudnn9 packages

Verify Full CUDA Stack

The script below uses PyTorch, which is installed in the Python Environment section later in this guide; come back and run this check after that step.

# Quick verification script
cat << 'EOF' > /tmp/cuda_test.py
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

    # Test computation
    x = torch.randn(1000, 1000, device='cuda')
    y = torch.randn(1000, 1000, device='cuda')
    z = torch.mm(x, y)
    print(f"GPU compute test: PASSED ({z.shape})")
EOF
python3 /tmp/cuda_test.py

Docker and NVIDIA Container Toolkit {#docker-setup}

Docker with GPU passthrough is essential for running AI applications in isolated, reproducible environments. Open WebUI, vLLM, text-generation-inference, and most production AI tools ship as Docker containers.

Install Docker Engine

# Remove old Docker versions
sudo apt remove docker docker-engine docker.io containerd runc 2>/dev/null

# Add Docker's official GPG key and repository
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Add your user to docker group (avoid sudo for every docker command)
sudo usermod -aG docker $USER

# Apply group change (or log out and back in)
newgrp docker

# Verify
docker run hello-world

Install NVIDIA Container Toolkit

# Add NVIDIA container toolkit repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update
sudo apt install -y nvidia-container-toolkit

# Configure Docker to use NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify GPU access in Docker
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu24.04 nvidia-smi
# Should show your GPU(s) with full VRAM

Docker Compose with GPU

For multi-container setups, here is how to pass GPUs in docker-compose.yml:

# docker-compose.yml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: always

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
    restart: always

volumes:
  ollama_data:
  webui_data:

# Start the stack
docker compose up -d

# Check status
docker compose ps

For a comprehensive walkthrough of the Ollama + Open WebUI Docker deployment, see the Docker setup guide.


Ollama Installation and Configuration {#ollama-setup}

Install Ollama

# One-line install (recommended)
curl -fsSL https://ollama.com/install.sh | sh

# This installs the binary and creates a systemd service
# Verify
ollama --version
systemctl status ollama

Configure for Workstation Use

# Edit the systemd service for custom configuration
sudo systemctl edit ollama

# Add between the comment blocks:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_KEEP_ALIVE=4h"
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=2"

# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Verify it is listening on all interfaces
ss -tlnp | grep 11434

Configuration explained:

  • OLLAMA_HOST=0.0.0.0 — Listen on all network interfaces (for Open WebUI and remote access)
  • OLLAMA_ORIGINS=* — Allow CORS from any origin (needed for web interfaces)
  • OLLAMA_KEEP_ALIVE=4h — Keep models loaded in VRAM for 4 hours after last request
  • OLLAMA_NUM_PARALLEL=2 — Handle 2 concurrent requests
  • OLLAMA_MAX_LOADED_MODELS=2 — Keep up to 2 models in memory simultaneously
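
The keep-alive can also be overridden per request through Ollama's HTTP API. Here is a sketch that builds such a request; the model name is just an example, and the server is assumed to be running on the default port:

```shell
# Per-request keep_alive override via Ollama's HTTP API (sketch).
# The model name is an example; any pulled model works.
payload='{"model":"llama3.1:8b","prompt":"Why is the sky blue?","stream":false,"keep_alive":"10m"}'
echo "$payload"

# Send it from the workstation (uncomment once the service is running):
# curl -s http://localhost:11434/api/generate -d "$payload"
```

Setting `"keep_alive": 0` in the same payload unloads the model immediately after the response, which is handy for freeing VRAM on demand.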

Pull Essential Models

# General purpose
ollama pull llama3.1:8b
ollama pull qwen2.5:14b

# Code generation
ollama pull qwen2.5-coder:14b
ollama pull codellama:7b

# Verify
ollama list

Test GPU Acceleration

# Run a model and check GPU usage
ollama run llama3.1:8b "What is Ubuntu?" &

# In another terminal, check GPU utilization
nvidia-smi
# GPU-Util should be >0% during inference
# Memory-Usage should show the model loaded

# Check Ollama's GPU detection
ollama ps
# Should show the model and "100% GPU" or similar

Python Environment {#python-environment}

A clean Python environment prevents dependency conflicts between different AI projects. Use conda for environment isolation and pip within each environment for packages.

Install Miniconda

# Download Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Install (non-interactive)
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3

# Initialize
~/miniconda3/bin/conda init bash
source ~/.bashrc

# Disable auto-activation of base environment
conda config --set auto_activate_base false

Create AI Development Environment

# Create environment with Python 3.11 (best PyTorch compatibility)
conda create -n ai python=3.11 -y
conda activate ai

# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

# Install common AI libraries
pip install \
  transformers \
  accelerate \
  datasets \
  sentencepiece \
  protobuf \
  bitsandbytes \
  peft \
  trl \
  wandb \
  einops \
  scipy \
  scikit-learn \
  pandas \
  matplotlib

# Verify PyTorch CUDA
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, GPU: {torch.cuda.get_device_name(0)}')"

Create Separate Environments for Conflicting Projects

# Environment for faster-whisper (speech-to-text)
conda create -n whisper python=3.11 -y
conda activate whisper
pip install faster-whisper sounddevice numpy

# Environment for fine-tuning
conda create -n finetune python=3.11 -y
conda activate finetune
pip install torch transformers accelerate peft trl bitsandbytes datasets

# Environment for vLLM (production inference)
conda create -n vllm python=3.11 -y
conda activate vllm
pip install vllm

Jupyter Lab Setup {#jupyter-setup}

Jupyter Lab is essential for interactive experimentation with models, data analysis, and prototyping AI pipelines.

Installation

# Activate your AI environment
conda activate ai

# Install Jupyter Lab
pip install jupyterlab ipywidgets

# Generate config
jupyter lab --generate-config

# Set password (more secure than tokens)
jupyter lab password

Configure for Remote Access

# Edit Jupyter config
vim ~/.jupyter/jupyter_lab_config.py

# Add these settings:
# c.ServerApp.ip = '0.0.0.0'
# c.ServerApp.port = 8888
# c.ServerApp.open_browser = False
# c.ServerApp.allow_remote_access = True

# Or set via command line:
jupyter lab --ip=0.0.0.0 --port=8888 --no-browser

Run Jupyter as a Service

# Create systemd service
sudo bash -c 'cat > /etc/systemd/system/jupyter.service << EOF
[Unit]
Description=Jupyter Lab
After=network.target

[Service]
Type=simple
User='$USER'
WorkingDirectory=/home/'$USER'
ExecStart=/home/'$USER'/miniconda3/envs/ai/bin/jupyter lab --ip=0.0.0.0 --port=8888 --no-browser
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF'

sudo systemctl daemon-reload
sudo systemctl enable jupyter
sudo systemctl start jupyter

# Access at http://YOUR_IP:8888

Monitoring Tools {#monitoring-tools}

You cannot optimize what you cannot measure. These tools show exactly what your workstation is doing.

nvitop (Best GPU Monitor)

pip install nvitop

# Run it
nvitop

# Shows per-GPU:
# - Utilization percentage
# - Memory used/total
# - Temperature
# - Power draw
# - Per-process GPU memory usage
# - Fan speed

nvitop is better than nvidia-smi for monitoring because it updates in real-time, shows per-process breakdowns, and has a clean TUI layout. It is the first thing I install on any AI workstation.

btop (System Monitor)

sudo apt install -y btop

# Run it
btop

# Shows CPU, RAM, disk I/O, network, and process tree
# in a beautiful, interactive terminal interface

nvidia-smi Watch Mode

# Update every second (lightweight, good for scripts)
watch -n 1 nvidia-smi

# Or use nvidia-smi's built-in loop
nvidia-smi -l 1

# Query specific metrics (useful for logging)
nvidia-smi --query-gpu=timestamp,name,temperature.gpu,utilization.gpu,utilization.memory,memory.used,memory.total,power.draw \
  --format=csv -l 5 | tee gpu_log.csv

Automated Health Check Script

cat << 'HEALTH' > ~/ai_health.sh
#!/bin/bash
echo "======================================"
echo "AI Workstation Health Check"
echo "$(date)"
echo "======================================"
echo ""
echo "--- GPU Status ---"
nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu,memory.used,memory.total,power.draw \
  --format=csv,noheader
echo ""
echo "--- Ollama Status ---"
echo "Service: $(systemctl is-active ollama)"
echo "Models loaded:"
curl -s http://localhost:11434/api/ps | python3 -m json.tool 2>/dev/null || echo "  (none or unreachable)"
echo ""
echo "--- Docker Containers ---"
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" 2>/dev/null || echo "  Docker not running"
echo ""
echo "--- System Resources ---"
echo "RAM: $(free -h | awk '/Mem/{print $3"/"$2}')"
echo "Swap: $(free -h | awk '/Swap/{print $3"/"$2}')"
echo "Disk: $(df -h / | awk 'NR==2{print $3"/"$2" ("$5" used)"}')"
echo "Uptime: $(uptime -p)"
echo "Load: $(cat /proc/loadavg | awk '{print $1, $2, $3}')"
HEALTH
chmod +x ~/ai_health.sh

# Run it
~/ai_health.sh

# Add to cron for daily logging
(crontab -l 2>/dev/null; echo "0 9 * * * ~/ai_health.sh >> ~/ai_health_log.txt 2>&1") | crontab -

Auto-Start Services {#auto-start}

After a reboot, everything should come up automatically without manual intervention.

# Ollama (already enabled from install script)
sudo systemctl enable ollama

# Docker (already enabled from install)
sudo systemctl enable docker

# Ensure Docker containers restart
# (use --restart always when creating containers)
docker update --restart always open-webui 2>/dev/null

# NVIDIA persistence mode (keeps GPU initialized)
sudo nvidia-smi -pm 1

# Make persistence mode survive reboot
sudo bash -c 'cat > /etc/systemd/system/nvidia-persistence.service << EOF
[Unit]
Description=NVIDIA Persistence Daemon
After=nvidia.service

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi -pm 1
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF'
sudo systemctl daemon-reload
sudo systemctl enable nvidia-persistence

# GPU power limit (if you set one for noise/power savings)
sudo bash -c 'cat > /etc/systemd/system/nvidia-powerlimit.service << EOF
[Unit]
Description=NVIDIA GPU Power Limit
After=nvidia-persistence.service

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi -pl 300
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF'
sudo systemctl daemon-reload
sudo systemctl enable nvidia-powerlimit

Verify Everything Survives Reboot

# Reboot
sudo reboot

# After reboot, check everything:
nvidia-smi                    # GPU detected, persistence mode on
systemctl status ollama       # Active (running)
docker ps                     # Containers running
curl http://localhost:11434   # Ollama responding
~/ai_health.sh                # Full health check

Security Hardening {#security}

An AI workstation with network-accessible services needs basic security. This is not enterprise hardening, but it stops the obvious attack vectors.

UFW Firewall

# Configure UFW (preinstalled on Ubuntu)
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Allow SSH (essential!)
sudo ufw allow ssh

# Allow services on LAN only
sudo ufw allow from 192.168.0.0/16 to any port 11434  # Ollama
sudo ufw allow from 192.168.0.0/16 to any port 3000   # Open WebUI
sudo ufw allow from 192.168.0.0/16 to any port 8888   # Jupyter

# Enable firewall
sudo ufw enable

# Verify
sudo ufw status verbose

Fail2Ban

# Install
sudo apt install -y fail2ban

# Create local config
sudo bash -c 'cat > /etc/fail2ban/jail.local << EOF
[sshd]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 3
bantime = 3600
findtime = 600
EOF'

sudo systemctl enable fail2ban
sudo systemctl restart fail2ban

# Check banned IPs
sudo fail2ban-client status sshd

SSH Key-Only Authentication

# On your local machine, generate a key (if you haven't already)
ssh-keygen -t ed25519

# Copy public key to workstation
ssh-copy-id user@workstation-ip

# On the workstation, disable password authentication
sudo sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo systemctl restart ssh   # the service is named "ssh" on Ubuntu

Automatic Security Updates

# Install unattended upgrades
sudo apt install -y unattended-upgrades

# Enable automatic security updates
sudo dpkg-reconfigure -plow unattended-upgrades
# Select "Yes" when prompted

# Verify
cat /etc/apt/apt.conf.d/20auto-upgrades
# Should show:
# APT::Periodic::Update-Package-Lists "1";
# APT::Periodic::Unattended-Upgrade "1";

Comparison: Ubuntu vs Windows vs Mac for AI {#comparison}

| Factor | Ubuntu | Windows | macOS |
|---|---|---|---|
| NVIDIA driver stability | Excellent | Good | N/A (no NVIDIA) |
| CUDA support | Full, first-class | Full | N/A |
| Docker GPU passthrough | Native, no overhead | WSL2 layer, ~5% overhead | N/A for NVIDIA |
| RAM overhead at idle | 350MB (Server) / 1.2GB (Desktop) | 3-4GB | 2-3GB |
| PyTorch GPU support | Full CUDA | Full CUDA | MPS (Metal, limited) |
| Multi-GPU support | Full | Full | Single GPU only |
| Remote access (SSH) | Built-in | Requires setup | Built-in |
| AI framework testing | Primary target | Secondary | Tertiary |
| Ease of setup | Medium (this guide helps) | Easy (drivers auto-install) | Easy (no GPU config) |
| Apple Silicon support | N/A | N/A | Excellent (unified memory) |

Bottom line: If you have an NVIDIA GPU, Ubuntu is the best OS for AI. Period. The combination of first-class CUDA support, native Docker, low overhead, and the entire AI ecosystem targeting Ubuntu as the primary platform makes it the clear winner for dedicated AI workstations.

Windows works fine for casual AI use, but the WSL2 layer adds complexity and a small performance penalty. The RAM overhead alone costs you model capacity.

macOS is excellent specifically for Apple Silicon Macs with unified memory. It beats Ubuntu for models that exceed NVIDIA VRAM capacity. See the Apple Silicon AI buying guide for Mac-specific recommendations.

For the AI hardware requirements guide, the OS choice matters less than the GPU and RAM. But on identical hardware, Ubuntu extracts the most performance.


Next Steps

Your Ubuntu AI workstation is configured and secured. Here is what to do next:

  1. Pull models and start experimenting. The best Ollama models guide ranks models by use case and size.

  2. Set up a web interface. Follow the Ollama + Open WebUI Docker guide for a ChatGPT-like experience accessible from any browser on your network.

  3. Understand your VRAM limits. The VRAM requirements guide tells you exactly which models fit on your GPU.


Frequently Asked Questions

Do I need Ubuntu Desktop or Ubuntu Server for an AI workstation?

If you will sit at the machine with a monitor, use Desktop. If you will access it remotely via SSH, use Server. Server saves ~900MB RAM by skipping the GUI, which translates to slightly more room for model loading. You can always install a desktop environment on Server later with sudo apt install ubuntu-desktop.
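
You can measure your own machine's idle overhead directly from /proc/meminfo rather than relying on the rough figures above; a small sketch:

```shell
# Report how much RAM is in use at idle (MemTotal minus MemAvailable).
# Run right after boot, before loading any models, for a fair baseline.
awk '/^MemTotal:|^MemAvailable:/ {v[$1]=$2}
     END {printf "In use at idle: %.1f GB\n",
          (v["MemTotal:"] - v["MemAvailable:"]) / 1048576}' /proc/meminfo
```

Compare the reading on a Desktop install against a Server install on the same hardware to see exactly what the GUI costs you.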

Can I use Fedora or Arch Linux instead of Ubuntu?

You can, but NVIDIA driver installation is more fragile. Fedora uses a different kernel update cycle that sometimes breaks NVIDIA drivers. Arch rolling releases require manual driver management. Ubuntu LTS with the graphics-drivers PPA is the path of least resistance for AI workstations.

Why not use the NVIDIA .run file installer?

The .run file installs the driver outside of apt's package management. When Ubuntu updates the kernel, the driver breaks and you get a black screen. The PPA method integrates with apt so drivers update automatically alongside kernel updates. This matters on a workstation that needs to stay running.

How do I switch between multiple CUDA versions?

Install multiple CUDA toolkits side by side and use update-alternatives or simply change the PATH. The driver supports all CUDA versions up to its maximum (e.g., driver 560 supports CUDA 12.6 and all earlier versions). Most AI frameworks only need the runtime, not the full toolkit.
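
A minimal sketch of the PATH approach, assuming the default apt install prefixes (adjust the paths to the toolkits you actually have installed):

```shell
# Select a CUDA toolkit per shell session by pointing at its install prefix.
# /usr/local/cuda-12.6 is the default apt location; swap in cuda-12.4 etc.
export CUDA_HOME=/usr/local/cuda-12.6
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"
echo "Using CUDA at: $CUDA_HOME"
```

Put these lines in a per-project activation script (or a conda env's `activate.d`) so each project picks up the toolkit it was built against.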

Should I use a headless (no GUI) setup for maximum performance?

Headless saves 900MB RAM and a small amount of GPU memory (the display server uses ~100-200MB VRAM). For a dedicated inference server, headless is clearly better. For a development workstation where you also write code, the GUI convenience outweighs the resource savings.

How do I access my workstation remotely?

SSH for terminal access (ssh user@ip), Jupyter Lab at port 8888 for notebooks, and Open WebUI at port 3000 for chat. For access outside your LAN, install Tailscale for a zero-config VPN mesh. Avoid exposing ports directly to the internet.
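
If you prefer not to open the Jupyter and Open WebUI ports at all, SSH port forwarding covers both services over the one port you already trust; a sketch with placeholder host details:

```shell
# Forward Jupyter (8888) and Open WebUI (3000) over SSH so only port 22
# is exposed. "user" and "workstation-ip" are placeholders for your host.
tunnel="ssh -N -L 8888:localhost:8888 -L 3000:localhost:3000 user@workstation-ip"
echo "$tunnel"
# Run that command in a terminal, then browse http://localhost:8888 and
# http://localhost:3000 on the local machine.
```

The `-N` flag opens the tunnels without starting a remote shell, so the command simply blocks until you Ctrl+C it.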


Conclusion

An Ubuntu AI workstation is the most capable local AI platform you can build. NVIDIA's CUDA ecosystem, Docker's native GPU support, and Ubuntu's stability as the primary target for AI framework testing make it the default choice for serious AI development.

The setup takes about 90 minutes following this guide. The NVIDIA driver installation is the only step where things can go wrong, and the PPA method described here is the safest approach. Once past that hurdle, everything from Docker to Ollama to Jupyter installs cleanly with standard commands.

Build it once, maintain it minimally, and use it for years. Ubuntu 24.04 LTS receives security updates until 2029. Your AI workstation will outlast multiple generations of models.


For hardware selection guidance, see the AI hardware requirements guide. If you are building a dedicated headless AI server instead of a workstation, the homelab server build guide covers the hardware and Ubuntu Server configuration in detail.

Written by Pattanaik Ramswarup, AI Engineer & Dataset Architect.