Air-Gapped AI Deployment: Complete Offline Setup Guide (2026)
Air-Gapped AI Deployment: Reference Architecture for Offline Networks
Published on April 23, 2026 - 22 min read
Quick Decision: Is Your Network Actually Air-Gapped?
"Air-gapped" gets used loosely. For this guide, the definition is: no routable network path between the AI environment and any internet-connected system at any time. Not "we use a firewall." Not "we use a VPN." Physically isolated, with sneakernet (USB or one-way data diode) as the only data transfer mechanism.
If your environment matches that definition - SCIF, classified network, regulated medical infrastructure, OT/SCADA isolation, financial trading vault, or a customer's internal "no egress" enclave - this guide is for you. If you have a private network with controlled internet access, see the local AI privacy guide instead.
For those who do need true air-gap: the entire reference architecture, transfer procedures, model verification chain, and audit-trail tooling are below.
What this guide covers:
- Reference architecture for air-gapped AI environments
- Building an offline package mirror that does not require internet during install
- Verifying model integrity through SHA-256 chains and signed manifests
- Sneakernet workflow for moving models, container images, and updates
- Certificate management without ACME or external CAs
- Audit trail requirements for compliance frameworks (FedRAMP High, ISO 27001, HIPAA, SOC 2)
- Update cadence and the cost of staleness on an air-gap
True air-gapped AI deployments come from three places: defense and intelligence environments where any network egress is a security violation; regulated industries (finance, healthcare, energy) where data residency rules forbid cloud or internet exposure; and customer enclaves where the AI must run inside a network that does not trust your supply chain. All three have similar architecture but different audit and compliance overlays.
The good news: Ollama, Open WebUI, llama.cpp, and most of the local-AI ecosystem run fine in air-gap. The hard parts are the operational discipline - keeping the install reproducible, the models verifiable, the updates timely, and the audit trail intact.
For broader context on why local AI matters for security, see the local AI privacy guide and AI hardware requirements guide. For deployment patterns that do allow some egress, the Synology NAS setup is a softer privacy posture.
Table of Contents
- Threat Model and Definition
- Reference Architecture
- Building the Bootstrap Bundle
- Air-Gap Transfer Procedure
- Installing Ollama Offline
- Model Verification Chain
- Open WebUI in Air-Gap
- Certificate Management Without ACME
- Audit Trail Requirements
- Update Cadence and Staleness
- Pitfalls and Fixes
Threat Model and Definition {#threat-model}
A real air-gapped environment assumes that any data leaving the network represents a security breach. The threat model includes:
- Egress to model providers (HuggingFace, Ollama Registry, GitHub releases) is forbidden
- DNS lookups to internet hosts are forbidden
- NTP from internet pool is forbidden (use a local stratum-1 source)
- Even ICMP responses to traceroute are forbidden in some classifications
- Operating system telemetry must be disabled at the kernel level
- USB transfers are logged, signed, and audited
- Clipboard, screenshot, and file copy actions may be monitored
What you control:
- Hardware acquisition through an audited supply chain
- A "low side" workstation with internet that can download artifacts
- A "high side" environment that is the actual air-gapped network
- A controlled transfer mechanism (USB, optical disc, one-way data diode)
The "low side / high side" terminology comes from defense networks. Same pattern applies to financial vault networks, OT/SCADA isolation, and regulated healthcare environments.
Reference Architecture {#architecture}
A canonical air-gapped AI deployment has four logical zones:
Zone 1 - Low side staging. Internet-connected workstation used to download artifacts. Never touches the air-gap network. Hosts the artifact mirror you assemble.
Zone 2 - Transfer media. Write-once or write-protected media (DVD-R, signed USB stick, one-way diode). Transports vetted artifacts from Zone 1 to Zone 3.
Zone 3 - High side ingest. Air-gapped workstation that receives artifacts from Zone 2, validates signatures, computes hashes, registers them in an internal artifact repository.
Zone 4 - Production AI nodes. Inference servers running Ollama. Pull from the internal artifact repo only. Never see Zone 1 or Zone 2 directly.
[Internet] -> [Zone 1: Low Side] -> [Zone 2: USB/DVD] -> [Zone 3: Ingest] -> [Zone 4: Production]
                                            |
                                            +-- one-way only, audited
This architecture is what FedRAMP High and ICD 503 environments require. Less-strict regulated environments may collapse Zones 3 and 4, but the separation of low side and high side is non-negotiable.
Building the Bootstrap Bundle {#bootstrap-bundle}
Everything the air-gap network needs to install and run AI must be assembled into a single bundle on the low side. A typical bundle contains:
| Artifact | Source | Verify With |
|---|---|---|
| Ollama Linux binary | github.com/ollama/ollama/releases | SHA-256 from release page |
| Open WebUI Docker image | ghcr.io/open-webui/open-webui | Cosign signature |
| Llama 3.1 8B Q4 model | ollama.com/library/llama3.1 | SHA-256 from registry |
| Phi-3 Mini Q4 model | ollama.com/library/phi3 | SHA-256 from registry |
| Embedding model (nomic-embed-text) | ollama.com/library/nomic-embed-text | SHA-256 |
| Linux kernel updates | distro mirror | GPG signature |
| nginx reverse proxy | distro package | GPG signature |
| Internal CA certificate | your PKI | Out-of-band trust |
Build the bundle on a freshly imaged Linux workstation. Commands below assume Ubuntu 22.04 LTS on the low side.
# Set up working directory
mkdir -p ~/airgap-bundle/{binaries,images,models,packages,signatures,manifest}
cd ~/airgap-bundle
# Pull Ollama
curl -L https://github.com/ollama/ollama/releases/latest/download/ollama-linux-amd64 \
-o binaries/ollama-linux-amd64
sha256sum binaries/ollama-linux-amd64 > signatures/ollama.sha256
# Pull Open WebUI Docker image and save as tarball
docker pull ghcr.io/open-webui/open-webui:main
docker save ghcr.io/open-webui/open-webui:main \
-o images/open-webui.tar
sha256sum images/open-webui.tar > signatures/open-webui.sha256
# Pull models via Ollama (running locally on low side)
ollama pull llama3.1:8b
ollama pull phi3:mini
ollama pull nomic-embed-text
# Copy model blobs and manifests from ~/.ollama/models into the bundle
cp -r ~/.ollama/models/. models/
find models -type f -exec sha256sum -b {} + > signatures/models.sha256
# Pull distro packages for offline apt
apt-get download nginx-full ca-certificates apparmor apparmor-utils
mv *.deb packages/
# Generate the bundle manifest
cat > manifest/bundle.json <<EOF
{
"bundle_version": "$(date +%Y%m%d)",
"creator": "$(whoami)",
"creation_date": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
"low_side_host": "$(hostname)",
"artifacts": {
"ollama_binary": "$(sha256sum binaries/ollama-linux-amd64 | cut -d' ' -f1)",
"open_webui_image": "$(sha256sum images/open-webui.tar | cut -d' ' -f1)",
"models_directory_hash": "$(find models -type f -exec sha256sum {} \; | sort | sha256sum | cut -d' ' -f1)"
}
}
EOF
# Sign the manifest with your operator key
gpg --armor --detach-sign manifest/bundle.json
# Final tarball
tar czf airgap-bundle-$(date +%Y%m%d).tar.gz \
binaries/ images/ models/ packages/ signatures/ manifest/
sha256sum airgap-bundle-*.tar.gz > airgap-bundle-final.sha256
The result is a single signed tarball plus a SHA-256 hash. Both go on transfer media.
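The manifest signature above only means something if the high side already trusts the operator key. A minimal sketch of creating that key on the low side (GnuPG 2.2+; the identity is a placeholder) - the exported public key travels to the high side out-of-band, on separate media from the bundle, and is imported there once:
# One-time: generate the operator signing key on the low side
gpg --quick-generate-key "Air-Gap Bundle Operator <bundles@airgap.internal>" ed25519 sign 2y
# Export the public key for out-of-band delivery to the high side
gpg --armor --export bundles@airgap.internal > airgap-operator-pubkey.asc
sha256sum airgap-operator-pubkey.asc   # read this hash to the high side over a separate channel
# On the high side (one-time): gpg --import airgap-operator-pubkey.asc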
Air-Gap Transfer Procedure {#transfer}
The transfer step is the most-audited part of any air-gap workflow. Three mechanisms in declining order of security:
One-way data diode. Hardware that physically allows data flow in only one direction. Commercial diodes from Owl Cyber Defense or Forcepoint cost $20K-150K. Required for some classified environments. Out of scope for most readers.
Signed write-once media (DVD-R, BD-R). Burned on the low side, hash-verified on the high side. Cannot be modified after burn. Cheap, auditable, slow (40 minutes for a 25GB BD-R). Best practical option for most organizations.
Vetted USB sticks. USB stick with hardware write-protection switch (Kanguru FlashTrust or Apricorn Aegis) flipped to read-only after writing on the low side. Stick is logged in/out of the secure facility. Faster than optical, cheaper than diode.
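For the write-once option, a low-side burn-and-reread sketch - assuming genisoimage and growisofs (cdrkit and dvd+rw-tools packages) with /dev/sr0 as the burner:
# Wrap the bundle and its hash file in an ISO image
genisoimage -r -J -V AIRGAP_BUNDLE -o /tmp/bundle.iso \
  airgap-bundle-$(date +%Y%m%d).tar.gz airgap-bundle-final.sha256
# Burn in closed, DVD-compatible mode so the disc cannot be appended to afterwards
growisofs -dvd-compat -Z /dev/sr0=/tmp/bundle.iso
# Eject, reinsert, and re-read before the disc leaves the low side
sudo mkdir -p /mnt/transfer && sudo mount -o ro /dev/sr0 /mnt/transfer
cd /mnt/transfer && sha256sum -c airgap-bundle-final.sha256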
The procedure on the high side:
# Mount the transfer media read-only
sudo mount -o ro /dev/sr0 /mnt/transfer # for optical
# or
sudo mount -o ro /dev/sdc1 /mnt/transfer # for USB
# Verify outer hash matches what was provided out-of-band
sha256sum /mnt/transfer/airgap-bundle-*.tar.gz
# Verify GPG signature on bundle manifest
cd /tmp
mkdir verify && cd verify
tar xzf /mnt/transfer/airgap-bundle-*.tar.gz
gpg --verify manifest/bundle.json.asc manifest/bundle.json
# Confirm individual artifact hashes against manifest
sha256sum -c signatures/*.sha256
# If all checks pass, copy to internal artifact repo and update the "latest" pointer
sudo mkdir -p /srv/airgap-repo/$(date +%Y%m%d)
sudo cp -r * /srv/airgap-repo/$(date +%Y%m%d)/
sudo ln -sfn /srv/airgap-repo/$(date +%Y%m%d) /srv/airgap-repo/latest
# Log the transfer
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) BUNDLE_INGEST $(whoami) bundle_$(date +%Y%m%d) sha256=$(sha256sum /mnt/transfer/airgap-bundle-*.tar.gz | cut -d' ' -f1)" \
| sudo tee -a /var/log/airgap-transfers.log
Every action gets logged with timestamp, operator, and artifact hash. This log is what auditors will ask for.
Installing Ollama Offline {#install-ollama}
The Ollama install script (curl https://ollama.com/install.sh | sh) cannot be used in air-gap because it tries to contact ollama.com. Manual install:
# Create system user
sudo useradd --system --shell /bin/false --home /var/lib/ollama --create-home ollama
# Install binary
sudo cp /srv/airgap-repo/latest/binaries/ollama-linux-amd64 /usr/local/bin/ollama
sudo chmod 755 /usr/local/bin/ollama
# Create data directory
sudo mkdir -p /var/lib/ollama
sudo chown ollama:ollama /var/lib/ollama
# Copy verified models into Ollama's data dir and hand ownership to the service user
sudo mkdir -p /var/lib/ollama/models
sudo cp -r /srv/airgap-repo/latest/models/. /var/lib/ollama/models/
sudo chown -R ollama:ollama /var/lib/ollama/models
# Create systemd unit
sudo tee /etc/systemd/system/ollama.service <<'EOF'
[Unit]
Description=Ollama AI Service
After=network-online.target
[Service]
Type=simple
User=ollama
Group=ollama
ExecStart=/usr/local/bin/ollama serve
Environment="OLLAMA_MODELS=/var/lib/ollama/models"
Environment="OLLAMA_HOST=127.0.0.1:11434"
Environment="OLLAMA_KEEP_ALIVE=30m"
# Hardening
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/ollama
Restart=on-failure
RestartSec=3
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now ollama.service
# Verify
sudo systemctl status ollama
ollama list
Setting OLLAMA_HOST=127.0.0.1:11434 ensures Ollama only listens on loopback. A reverse proxy (see the certificate management section) handles external access.
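As a final check, a loopback smoke test - assuming llama3.1:8b came over in the bundle - confirms the service answers and the copied model actually loads:
# List installed models via the API
curl -s http://127.0.0.1:11434/api/tags
# Run a one-off generation against the bundled model
curl -s http://127.0.0.1:11434/api/generate \
  -d '{"model": "llama3.1:8b", "prompt": "Say hello.", "stream": false}'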
Model Verification Chain {#model-verification}
In a regulated environment, you need to prove every model file came from a verified source and was not modified in transit. The chain:
- At the Ollama registry (low side): each model blob has a SHA-256 hash visible in ~/.ollama/models/manifests/.
- In the bundle: signatures/models.sha256 lists every model blob hash.
- On transfer media: the outer tarball hash matches what was provided out-of-band.
- On high side ingest: sha256sum -c validates every blob against the manifest.
- On the production node: when pulling models from the internal repo, hashes are re-validated.
- On Ollama load: Ollama itself validates blob integrity via its internal manifest before serving.
For the strictest environments, add a digital signature step where your organization's PKI signs the model manifest after high-side validation. Production nodes verify both the original Ollama hash and your organization's signature before loading.
A practical signing flow:
# On high side, after validating bundle
cd /srv/airgap-repo/latest/models
# Sign the manifest with your operator/CISO key
gpg --output models.sig --detach-sign --armor manifests/registry.ollama.ai/library/llama3.1/8b
# Embed signature in production deployment
sudo cp models.sig /var/lib/ollama/models/manifests/registry.ollama.ai/library/llama3.1/8b.sig
This gives you a chain that any auditor can replay.
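Replaying that chain on a production node comes down to two checks - assuming the operator public key has been imported into the node's keyring and the signatures directory was copied into the internal repo alongside the models:
# Verify the organization's detached signature on the model manifest
gpg --verify /var/lib/ollama/models/manifests/registry.ollama.ai/library/llama3.1/8b.sig \
    /var/lib/ollama/models/manifests/registry.ollama.ai/library/llama3.1/8b
# Re-check every blob against the hashes recorded when the bundle was built
cd /var/lib/ollama && sha256sum -c /srv/airgap-repo/latest/signatures/models.sha256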
Open WebUI in Air-Gap {#open-webui}
Open WebUI ships as a Docker image. In an air-gap, you load it from the saved tarball:
# Load image
sudo docker load -i /srv/airgap-repo/latest/images/open-webui.tar
# Verify it loaded
sudo docker images | grep open-webui
# Run with auth enabled and offline configuration
sudo docker run -d \
--name open-webui \
--restart always \
-p 127.0.0.1:3000:8080 \
-v open-webui-data:/app/backend/data \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
-e WEBUI_AUTH=true \
-e ENABLE_SIGNUP=false \
-e ENABLE_OPENAI_API=false \
-e DEFAULT_USER_ROLE=user \
--add-host=host.docker.internal:host-gateway \
ghcr.io/open-webui/open-webui:main
Critical environment variables for air-gap:
- ENABLE_OPENAI_API=false prevents any code path that might try to reach OpenAI
- ENABLE_SIGNUP=false means only admin-created accounts work
- OLLAMA_BASE_URL points Open WebUI at the Ollama instance on the host
One caveat on the Ollama URL: with Ollama bound to loopback only (OLLAMA_HOST=127.0.0.1:11434 in the systemd unit above), host.docker.internal resolves to the Docker bridge gateway and the connection will be refused. Either run the container with --network host (adjusting the port publishing accordingly) and set OLLAMA_BASE_URL=http://127.0.0.1:11434, or have Ollama listen on the bridge address as well.
Open WebUI does have telemetry features in some versions. Audit docker logs open-webui after first start to confirm no outbound network attempts.
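Belt and braces: even on a network with no route out, a default-deny egress policy on the AI host turns any unexpected outbound attempt into a logged event rather than a silent failure. A minimal nftables sketch for host-originated traffic - the 10.0.0.0/8 internal range and the Docker bridge range are assumptions, and container traffic passes the forward hook so it needs its own rule if you want the same guarantee:
# /etc/nftables.conf fragment: log and drop egress not destined for internal ranges
table inet egress_guard {
  chain output {
    type filter hook output priority 0; policy accept;
    oif "lo" accept                     # loopback (Ollama, Open WebUI port publish)
    ip daddr 10.0.0.0/8 accept          # internal air-gap network
    ip daddr 172.16.0.0/12 accept       # Docker bridge / container subnets
    counter log prefix "unexpected-egress: " drop
  }
}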
For deployment hardening details, see the Open WebUI setup guide which covers SSO, RBAC, and audit log integration.
Certificate Management Without ACME {#certificates}
Let's Encrypt and ACME require internet access. In air-gap, you run an internal CA. Three workable approaches:
1. Step-CA (smallstep). Modern internal CA with ACME-compatible interface. Runs inside the air-gap network. Issues certs to internal hosts via ACME exactly like Let's Encrypt does on the public internet, just with your internal CA as the trust anchor.
2. cfssl or Easy-RSA. Simpler, script-based PKI. Better for static infrastructure where certs change rarely.
3. Active Directory Certificate Services. If your air-gap network already has AD, ADCS issues certs through Group Policy. Adds Windows dependencies but integrates cleanly with existing AD-joined hosts.
Step-CA quick setup:
# Install Step-CA from packages in your bundle
sudo dpkg -i /srv/airgap-repo/latest/packages/step-ca*.deb step-cli*.deb
# Initialize a new CA (one-time, on the CA host)
step ca init \
--name "AcmeCorp Air-Gap CA" \
--dns ca.airgap.internal \
--address ":443" \
--provisioner admin@airgap.internal
# Run the CA
sudo systemctl enable --now step-ca
# On AI server, request a cert
step ca certificate ai.airgap.internal \
/etc/nginx/ssl/ai.crt /etc/nginx/ssl/ai.key \
--provisioner admin@airgap.internal
Distribute the CA's root certificate via your normal endpoint management (GPO, MDM, manual install). Once endpoints trust the internal root, internal HTTPS works exactly like public HTTPS.
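To put the issued certificate to work, a sketch of the nginx server block fronting Open WebUI - the hostname and certificate paths follow the step command above, the upstream is the container published on 127.0.0.1:3000, and the renewal line assumes step-cli's --daemon mode:
# /etc/nginx/conf.d/ai.conf - TLS termination in front of Open WebUI
server {
    listen 443 ssl;
    server_name ai.airgap.internal;
    ssl_certificate     /etc/nginx/ssl/ai.crt;
    ssl_certificate_key /etc/nginx/ssl/ai.key;
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;    # WebSocket support for streamed responses
        proxy_set_header Connection "upgrade";
    }
}
# Renew automatically when two-thirds of the cert lifetime has passed, then reload nginx
sudo step ca renew --daemon --exec "systemctl reload nginx" \
  /etc/nginx/ssl/ai.crt /etc/nginx/ssl/ai.key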
Audit Trail Requirements {#audit}
Compliance frameworks (FedRAMP High, ISO 27001, HIPAA, SOC 2) all require audit trails for AI systems. The minimum events to log:
| Event Type | Source | Retention |
|---|---|---|
| Model load | Ollama API logs | 7 years |
| User authentication | Open WebUI / SSO | 1 year |
| Prompt and response (full content) | Open WebUI DB | varies (HIPAA: 6 years) |
| Admin configuration changes | sudo + auditd | 7 years |
| Bundle ingest events | /var/log/airgap-transfers.log | indefinite |
| Failed login attempts | sshd, Open WebUI | 1 year |
| Certificate issuance | Step-CA | indefinite |
For HIPAA-class data, every prompt and response must be retrievable for the patient or auditor. Open WebUI stores conversations in SQLite by default. For compliance, switch to PostgreSQL with WAL archiving and disable the user-delete option.
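Recent Open WebUI versions read the database location from the DATABASE_URL environment variable, so pointing the container at an internal PostgreSQL instance is a one-line addition to the docker run command from earlier (hostname and credentials are placeholders; confirm the variable against your version's documentation):
# Extra flag for the docker run command in the Open WebUI section
-e DATABASE_URL=postgresql://openwebui:CHANGE_ME@postgres.airgap.internal:5432/openwebui
# postgresql.conf on the database host: WAL archiving for point-in-time recovery
wal_level = replica
archive_mode = on
archive_command = 'cp %p /srv/pg-wal-archive/%f'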
For a deeper audit-trail walkthrough, the dedicated guide covers schema design and retention strategy.
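For the admin-configuration-changes row in the table above, auditd watch rules on the AI host make those events queryable with ausearch. A small sketch - the rule file path and key names are arbitrary:
# /etc/audit/rules.d/ollama.rules
-w /etc/systemd/system/ollama.service -p wa -k ollama_config
-w /usr/local/bin/ollama -p wa -k ollama_binary
-w /var/lib/ollama/models -p wa -k ollama_models
-w /var/log/airgap-transfers.log -p wa -k airgap_transfers
# Load the rules and confirm they are active
sudo augenrules --load
sudo auditctl -l
# Pull events for an auditor later
sudo ausearch -k ollama_config --start this-month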
Update Cadence and Staleness {#updates}
The hardest operational reality: an air-gap network runs old software. Your install was current the day the bundle was built; everything degrades from there.
Recommended cadence:
| Component | Update Cadence | Notes |
|---|---|---|
| Operating system kernel | Monthly | Security patches |
| Ollama binary | Quarterly | Stability over bleeding edge |
| Models | Quarterly to annually | New model releases are security-relevant: newer versions often harden against prompt injection and jailbreaks |
| Open WebUI | Quarterly | Frequent feature changes |
| TLS certificates | 90 days (internal CA) | Automate renewal via Step-CA ACME |
| CA root | 5-10 years | Plan in advance |
Each update follows the same flow: build a new bundle on the low side, transfer through Zone 2, validate in Zone 3, deploy to Zone 4. The lag from public release to air-gap deployment is typically 2-6 weeks for routine updates and 5-7 days for emergency security patches.
This staleness is unavoidable. Production AI in an air-gap will lag public LLM capabilities by 6-12 months. Plan for it when setting user expectations.
Pitfalls and Fixes {#pitfalls}
Bundle hash mismatches on high side. Bit rot during transfer is real on optical media. Burn two copies and verify both.
Ollama tries to reach ollama.com despite manual install. Some Ollama builds attempt an update check or registry lookup at startup; the attempts fail harmlessly without internet, but in monitored environments they can trigger alerts. Set OLLAMA_NOPRUNE=1, block outbound at the host firewall, and check the release notes for whatever opt-out environment variable your version provides.
Model not found after copy. Ollama expects models in /var/lib/ollama/models/manifests/ and /var/lib/ollama/models/blobs/. Direct file copy preserves structure; ollama pull from a registry does not work in air-gap.
Open WebUI shows "checking for updates" messages. Open WebUI has an update-check feature that hits ghcr.io. Block at the host firewall and disable in settings UI after first login.
TLS certs expire and break LAN access overnight. Use Step-CA ACME for automated renewal. Manual cert management always fails on the long timeline of an air-gap deployment.
Audit logs filling disk. Configure logrotate from day one. Models and conversations grow fast; audit logs grow slowly but never get pruned.
Unable to load Docker image because of missing CA bundle. When Open WebUI or any container tries to verify a TLS connection internally, it needs your CA root in its trust store. Mount your CA cert into the container at /etc/ssl/certs/ca-certificates.crt or rebuild the image with the cert baked in.
New model architecture not supported by old Ollama. Llama 3.3 may not load on a 6-month-old Ollama binary. Plan Ollama updates ahead of model updates so you do not get stuck mid-bundle.
Sneakernet gets too slow at scale. Once you are pushing 50GB+ per quarterly bundle, optical media becomes painful. Consider a vetted USB-C SSD with hardware write protection (Apricorn Aegis NVX or similar) instead of multiple BD-R discs.
Frequently Asked Questions
Q: Can I run Ollama with no internet at all, after install?
A: Yes. Once Ollama is installed and models are loaded, it does not require internet for inference. Telemetry can be disabled. The only network-dependent operations are pulling new models, which you handle via the sneakernet bundle.
Q: How big is a typical air-gap bundle?
A: A bare-bones bundle with Ollama, Phi-3 Mini, Llama 3.1 8B, and an embedding model is roughly 6-8GB. Add Open WebUI image (1.5GB) and OS package updates (500MB-2GB). Quarterly bundles tend toward 10-15GB.
Q: What about FedRAMP and DoD STIG compliance?
A: Standard Linux STIG hardening applies (disable root SSH, enforce password complexity, AIDE for file integrity). Open WebUI and Ollama do not have FedRAMP-validated builds; you provide compensating controls (audit logging, network segmentation, endpoint hardening).
Q: Can air-gapped Ollama receive models without sneakernet?
A: Only via a one-way data diode. There are no other workable mechanisms that maintain the air-gap property. Some organizations build private RAG using LAN-internal data sources without ever needing new models.
Q: How do I update an air-gapped GPU driver safely?
A: Same bundle process. Download the NVIDIA .run installer or distro driver package on the low side, hash and sign, sneakernet to the high side. Test on a non-production node first; GPU driver updates are higher-risk than CPU-only updates.
Q: Is HuggingFace usable in air-gap?
A: HuggingFace's web UI is not, but you can download model files via the low side and transport them. Many HuggingFace models work directly in Ollama via custom Modelfile entries. Plan your model strategy around models that have been quantized to GGUF format.
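For the GGUF route, importing a low-side download into Ollama is a two-line Modelfile plus ollama create, run on the high side against a file that came over in the bundle (the file and model names are placeholders):
# Modelfile
FROM ./mistral-7b-instruct-q4_k_m.gguf
PARAMETER temperature 0.7
# Register and smoke-test the imported model
ollama create mistral-7b-local -f Modelfile
ollama run mistral-7b-local "hello"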
Q: What about telemetry on RHEL or Ubuntu in air-gap?
A: Ubuntu Pro and RHEL both have phone-home features that fail loudly when blocked. Configure offline subscription manager (RHEL) or disable Ubuntu Pro features. Consider Rocky Linux or Debian for true air-gap deployments to avoid the telemetry surface.
Q: Can I run Whisper or Stable Diffusion air-gapped?
A: Yes, using the same bundle pattern. Whisper.cpp and ComfyUI are popular choices. Stable Diffusion models are large (5-10GB per checkpoint) so plan transfer media size accordingly.
Conclusion
Air-gapped AI deployment is an operational discipline as much as a technical setup. The technology works fine - Ollama, Open WebUI, and the rest of the local AI stack run cleanly without internet. The work is in the bundle process, the verification chain, the audit trail, and the operational rhythm of quarterly transfers.
For most organizations contemplating "should we air-gap our AI?", the honest answer is: only if your threat model genuinely requires it. The cost in update lag, operational overhead, and user-experience degradation is real. If your environment merely needs strong privacy without true isolation, the local AI privacy guide covers a lighter posture that achieves most of the same protections.
For organizations that do need true air-gap - SCIF environments, classified networks, regulated medical infrastructure, financial vault networks - this architecture has been deployed successfully in environments where any data egress is a security incident.
External references: NIST SP 800-53 controls AC-4 (Information Flow Enforcement) and SC-7 (Boundary Protection) in NIST Special Publication 800-53 Rev. 5, and Owl Cyber Defense's data diode reference architecture.
Want more security and compliance-focused AI guides? Join the LocalAIMaster newsletter for weekly deployment patterns and audit walkthroughs.