HIPAA-Compliant Local AI: Healthcare Setup Guide
Published on April 11, 2026 -- 18 min read
The core problem: your clinical staff wants to use AI for note summarization, discharge letter drafting, and literature queries. Every cloud AI service requires sending Protected Health Information (PHI) over the internet to servers you do not control. That is a HIPAA violation waiting to happen.
The fix is straightforward. Run the AI on your own hardware. PHI never leaves your network. No Business Associate Agreement needed. No third-party risk assessment. No hoping that OpenAI's data handling meets your compliance requirements.
This guide walks through the full stack: hardware, encryption, access control, audit logging, model selection, and the boundaries of what AI should and should not do in a clinical setting.
What HIPAA Actually Requires for AI Systems {#what-hipaa-requires}
HIPAA does not mention artificial intelligence anywhere in the regulation. What it does mandate are safeguards around Protected Health Information. Any system that touches PHI — including an AI that processes clinical text — must satisfy these requirements:
The Security Rule (Technical Safeguards)
| Safeguard | Requirement | How Local AI Satisfies It |
|---|---|---|
| Access Control | Unique user IDs, emergency access, automatic logoff, encryption | Open WebUI with LDAP + session timeouts |
| Audit Controls | Record and examine access activity | Prompt/response logging with user identity |
| Integrity | Protect PHI from improper alteration | Append-only logs, checksums |
| Transmission Security | Encrypt PHI in transit | TLS between client and Ollama |
| Authentication | Verify user identity | LDAP/Active Directory integration |
The Privacy Rule (Minimum Necessary Standard)
You cannot dump an entire patient record into an AI prompt when you only need a medication list. The minimum necessary standard requires that PHI use be limited to what is needed for the task.
In practice, this means:
- Prompt templates that extract only relevant sections from records
- Role-based access so nurses query nursing-relevant data, not billing codes
- No bulk processing of patient records without a documented purpose
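The first of these points can be enforced in code before a prompt is ever built. Here is a minimal sketch of extracting a single section from a structured note; the uppercase section-header pattern and the note layout are assumptions, so adapt the pattern to your EHR's export format:

```python
import re

def extract_section(note: str, section: str) -> str:
    """Return only the named section of a structured clinical note.

    Assumes sections are introduced by headers such as 'MEDICATIONS:'
    on their own line (an assumption; adjust for your EHR export).
    """
    pattern = rf"^{re.escape(section)}:\s*\n(.*?)(?=^\S[^\n]*:\s*$|\Z)"
    match = re.search(pattern, note, re.MULTILINE | re.DOTALL)
    return match.group(1).strip() if match else ""

note = """CHIEF COMPLAINT:
Shortness of breath.

MEDICATIONS:
Lisinopril 10 mg daily
Metformin 500 mg twice daily

ASSESSMENT:
Stable.
"""

# Build the prompt from the medication list only, not the whole record
prompt = "Summarize this medication list:\n" + extract_section(note, "MEDICATIONS")
```

The rest of the record (chief complaint, assessment) never reaches the model, which is exactly what the minimum necessary standard asks for.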
Physical Safeguards
The server running AI must be in a controlled area. A locked server room with badge access and visitor logs. Not a closet. Not under someone's desk.
Why Cloud AI Fails HIPAA by Default {#why-cloud-ai-fails}
When you type a patient's symptoms into ChatGPT, here is what happens:
- PHI leaves your network over HTTPS to OpenAI's servers
- OpenAI processes it on shared infrastructure
- The prompt may be logged, cached, or used for model training (depending on tier)
- You have no control over data retention or geographic location
Even with an enterprise BAA, you are trusting a third party with your most sensitive data. If they have a breach, you share liability. If they change their terms, you scramble to reassess.
Local AI eliminates the entire chain. PHI goes from your workstation to your server over your network. Nothing leaves the building.
For a deeper look at data sovereignty with local models, see the local AI privacy guide.
The Compliant Local AI Stack {#the-compliant-stack}
Here is the architecture that satisfies HIPAA technical, administrative, and physical safeguards:
+---------------------------+
| Clinician Workstation |
| (Browser -> Open WebUI) |
+----------+----------------+
| TLS 1.3 (internal network)
v
+----------+----------------+
| Nginx Reverse Proxy |
| (TLS termination, auth) |
+----------+----------------+
|
v
+----------+----------------+
| Open WebUI |
| (User mgmt, LDAP, RBAC) |
+----------+----------------+
|
v
+----------+----------------+
| Ollama Server |
| (Model inference, no |
| external network access)|
+----------+----------------+
|
v
+----------+----------------+
| LUKS-encrypted volume |
| (Models, logs, all data) |
+---------------------------+
Hardware Requirements
For a small clinic (5-15 concurrent users):
| Component | Specification | Why |
|---|---|---|
| CPU | AMD EPYC 7313 or Xeon Gold 5315Y | Handles request queuing |
| GPU | NVIDIA RTX 4090 (24GB VRAM) | Runs 70B Q4 models with partial CPU offload at ~25 tok/s |
| RAM | 64GB ECC DDR5 | OS + Open WebUI + buffer |
| Storage | 2TB NVMe (LUKS encrypted) | Models (~40GB) + logs + headroom |
| Network | Dual NIC (management + data) | Separate admin access from user traffic |
For a larger deployment, see the homelab AI server build guide and scale with multiple GPUs.
Encryption at Rest: LUKS Full-Disk Setup {#encryption-at-rest}
Every byte on the AI server's data volume must be encrypted. LUKS (Linux Unified Key Setup) handles this at the block device level, meaning Ollama, Open WebUI, and the logging system do not need to know about encryption — it is transparent.
Creating an Encrypted Volume
# Install cryptsetup if not present
sudo apt install cryptsetup -y
# Create encrypted partition (this destroys existing data)
sudo cryptsetup luksFormat /dev/sdb1 --type luks2 --cipher aes-xts-plain64 --key-size 512
# Open the encrypted volume
sudo cryptsetup luksOpen /dev/sdb1 ai_data
# Create filesystem
sudo mkfs.ext4 /dev/mapper/ai_data
# Mount
sudo mkdir -p /mnt/ai_data
sudo mount /dev/mapper/ai_data /mnt/ai_data
# Set Ollama to use this volume
echo 'OLLAMA_MODELS=/mnt/ai_data/ollama/models' | sudo tee -a /etc/environment
Auto-Unlock with TPM (Optional)
For servers with TPM 2.0, you can bind the LUKS key to the TPM so the volume unlocks automatically on boot — but only on that specific hardware:
# Enroll TPM2 as a LUKS key slot
sudo systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=0+7 /dev/sdb1
If someone pulls the drive and puts it in another machine, it stays encrypted.
Encryption in Transit: TLS Configuration {#encryption-in-transit}
Even on an internal network, HIPAA requires transmission security. Configure Nginx as a TLS-terminating reverse proxy in front of Open WebUI:
server {
listen 443 ssl http2;
server_name ai.clinic.internal;
ssl_certificate /etc/ssl/certs/ai-clinic.pem;
ssl_certificate_key /etc/ssl/private/ai-clinic.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers on;
# HSTS header
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
location / {
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket support for streaming responses
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
}
# Redirect HTTP to HTTPS
server {
listen 80;
server_name ai.clinic.internal;
return 301 https://$host$request_uri;
}
For internal CAs, generate a certificate with your organization's PKI. Do not use self-signed certificates — browsers will throw warnings and staff will learn to ignore security prompts.
User Authentication: Open WebUI + LDAP {#user-authentication}
Every person who interacts with the AI system needs a unique, authenticated identity. No shared accounts. No "everyone uses admin."
Open WebUI with LDAP/Active Directory
# docker-compose.yml excerpt for Open WebUI with LDAP
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "127.0.0.1:3000:8080"
environment:
- WEBUI_AUTH=true
- ENABLE_LDAP=true
- LDAP_SERVER_HOST=ldap://dc.clinic.internal
- LDAP_SERVER_PORT=389
- LDAP_SEARCH_BASE=ou=staff,dc=clinic,dc=internal
- LDAP_SEARCH_FILTERS=(&(objectClass=person)(memberOf=cn=ai-users,ou=groups,dc=clinic,dc=internal))
- LDAP_BIND_DN=cn=ai-svc,ou=service-accounts,dc=clinic,dc=internal
- LDAP_BIND_PASSWORD_FILE=/run/secrets/ldap_password
- OLLAMA_BASE_URL=http://ollama:11434
- DEFAULT_USER_ROLE=user
- ENABLE_SIGNUP=false
volumes:
- /mnt/ai_data/open-webui:/app/backend/data
secrets:
- ldap_password
restart: unless-stopped
ollama:
image: ollama/ollama:latest
volumes:
- /mnt/ai_data/ollama:/root/.ollama
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
# No port binding — only accessible via Open WebUI
restart: unless-stopped
secrets:
ldap_password:
file: /mnt/ai_data/secrets/ldap_password.txt
Key security decisions in this configuration:
- Ollama has no exposed ports. It is only reachable from the Open WebUI container via Docker networking. No one can bypass the UI to hit the API directly.
- Signup is disabled. Users must exist in Active Directory.
- Group-based access. Only members of the `ai-users` group can authenticate.
- Secrets are file-mounted, not passed as environment variables (which show up in `docker inspect`).
For the full Docker + Open WebUI setup walkthrough, see the Ollama + Open WebUI Docker guide.
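You can spot-check the "no exposed ports" claim from any workstation on the clinic network. A small TCP probe sketch (the `port_open` helper is illustrative, not part of any tool mentioned in this guide):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from a clinic workstation against the AI server:
# port_open("ai.clinic.internal", 443)    # expected: True  (Nginx)
# port_open("ai.clinic.internal", 11434)  # expected: False (Ollama hidden)
```

If 11434 ever answers from outside the Docker network, a port binding has leaked into the compose file and should be removed.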
Session Timeout
Configure Open WebUI to automatically log users out after 15 minutes of inactivity. A clinician who walks away from a terminal should not leave an active AI session with PHI on screen.
Audit Trail: Logging Every Interaction {#audit-trail}
HIPAA requires you to "implement hardware, software, and procedural mechanisms that record and examine activity in information systems that contain or use electronic protected health information." For an AI system, that means every prompt and every response gets logged with user identity.
Structured Logging Setup
Create a logging wrapper that captures all interactions:
#!/bin/bash
# /mnt/ai_data/scripts/audit-logger.sh
# Runs as a sidecar, tailing Open WebUI's chat logs
LOG_DIR="/mnt/ai_data/audit-logs"
CURRENT_LOG="${LOG_DIR}/ai-audit-$(date +%Y-%m).jsonl"
# Ensure log directory exists with restrictive permissions
mkdir -p "${LOG_DIR}"
chmod 700 "${LOG_DIR}"
# Generate daily integrity checksum of the current monthly log
generate_checksum() {
    local logfile="${LOG_DIR}/ai-audit-$(date +%Y-%m).jsonl"
    if [ -f "${logfile}" ]; then
        sha256sum "${logfile}" >> "${LOG_DIR}/checksums.txt"
    fi
}
# Seal older logs: set the append-only attribute on files untouched for a day
seal_previous_logs() {
    find "${LOG_DIR}" -name "ai-audit-*.jsonl" -mtime +1 -exec chattr +a {} \;
}
What Gets Logged
Each log entry should contain:
{
"timestamp": "2026-04-11T14:32:17Z",
"user_id": "jsmith",
"user_role": "nurse_practitioner",
"department": "internal_medicine",
"action": "chat_message",
"model": "llama3.3:70b-q4_K_M",
"prompt_hash": "sha256:a1b2c3...",
"prompt_length_chars": 847,
"response_length_chars": 1203,
"session_id": "sess_8f3k2m",
"client_ip": "10.0.50.23"
}
Notice that the log stores a hash of the prompt, not the full text. This creates an audit trail proving who used the system and when, without creating a second copy of PHI in the log files. If an investigation requires the full prompt, it can be retrieved from Open WebUI's database with the session ID.
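How such an entry might be assembled is worth seeing concretely. This is an illustrative sketch, not Open WebUI's internal logging; the field names mirror the example entry above, and the `audit_entry` function is an assumption:

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_entry(user_id: str, role: str, model: str, prompt: str,
                response: str, session_id: str, client_ip: str) -> str:
    """Build one JSONL audit record storing a hash of the prompt, never the text."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "user_id": user_id,
        "user_role": role,
        "action": "chat_message",
        "model": model,
        # Hash instead of raw text: no second copy of PHI in the log files
        "prompt_hash": "sha256:" + hashlib.sha256(prompt.encode()).hexdigest(),
        "prompt_length_chars": len(prompt),
        "response_length_chars": len(response),
        "session_id": session_id,
        "client_ip": client_ip,
    }
    return json.dumps(record)
```

Because only the hash is stored, the log itself can be reviewed by auditors without a PHI access workflow, while the hash still proves exactly which prompt was sent.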
Retention
HIPAA requires documentation retention for 6 years. Configure log rotation to archive but never delete within that window:
# /etc/logrotate.d/ai-audit
/mnt/ai_data/audit-logs/ai-audit-*.jsonl {
monthly
rotate 72
compress
delaycompress
notifempty
missingok
create 0600 root root
}
Network Isolation {#network-isolation}
The AI server should not have internet access. Period. This eliminates the possibility of PHI exfiltration via a compromised model or software vulnerability.
Firewall Rules
# Allow inbound from clinic network only
sudo ufw default deny incoming
sudo ufw default deny outgoing
sudo ufw allow from 10.0.50.0/24 to any port 443 proto tcp
# Allow SSH from admin VLAN only
sudo ufw allow from 10.0.99.0/24 to any port 22 proto tcp
# Block all outbound internet access
# (models are loaded via USB or air-gapped transfer)
sudo ufw enable
Model Loading Without Internet
Since the server cannot reach the internet, load models via an air-gapped transfer:
# On an internet-connected workstation:
ollama pull llama3.3:70b-instruct-q4_K_M
# Copy the model blob to a USB drive
cp -r ~/.ollama/models /media/usb_drive/ollama_models/
# On the air-gapped AI server:
sudo cp -r /media/usb_drive/ollama_models/* /mnt/ai_data/ollama/models/
sudo chown -R root:root /mnt/ai_data/ollama/models/
Model Selection for Healthcare {#model-selection}
Not every model handles medical text well. Here is what we have tested with de-identified clinical notes:
| Model | Size | VRAM Needed | Clinical Text Quality | Speed (RTX 4090) |
|---|---|---|---|---|
| Llama 3.3 70B Q4_K_M | 40GB | 24GB + CPU offload | Excellent — understands medical terminology, ICD codes, medication names | ~25 tok/s |
| Llama 3.1 8B Q6_K | 6.6GB | 8GB | Good for summaries, misses nuance in complex cases | ~85 tok/s |
| Qwen 2.5 72B Q4_K_M | 42GB | 24GB + CPU offload | Strong on structured medical data, lab interpretation | ~22 tok/s |
| Mistral Large 2 Q4_K_M | 68GB | 48GB (dual GPU) | Excellent reasoning, good differential diagnosis discussion | ~18 tok/s |
| Phi-3 Medium 14B | 8.2GB | 10GB | Acceptable for simple tasks, struggles with complex clinical language | ~60 tok/s |
Clinical Prompt Template
Structured prompts produce far better results than free-form questions:
You are a clinical documentation assistant. You help healthcare professionals
summarize and organize medical information. You do NOT diagnose, recommend
treatments, or make clinical decisions.
TASK: Summarize the following clinical note into a structured format.
REQUIRED SECTIONS:
- Chief Complaint (1 sentence)
- History of Present Illness (3-5 sentences)
- Relevant Past Medical History
- Current Medications
- Assessment Summary
CLINICAL NOTE:
[paste de-identified note here]
What AI Cannot Do in Healthcare {#what-ai-cannot-do}
This is the section that protects you legally and clinically.
AI as assistant, never as decision-maker. Under current FDA guidance, any AI system that diagnoses disease, recommends treatment, or triages patients is classified as Software as a Medical Device (SaMD). That requires:
- FDA 510(k) clearance or De Novo classification
- Clinical validation studies
- Quality management system (QMS)
- Post-market surveillance
Your local Ollama setup is none of these. It is a text processing tool. Use it for:
- Summarizing long clinical notes
- Drafting discharge instructions (reviewed by clinician before delivery)
- Searching medical literature
- Generating structured reports from unstructured text
- Training and education scenarios with synthetic data
Do not use it for:
- Diagnostic suggestions ("Based on these symptoms, the patient likely has...")
- Treatment recommendations ("The recommended medication is...")
- Triage decisions ("This patient should be seen within 4 hours")
- Billing code selection without human verification
The EU AI Act compliance guide covers additional regulatory frameworks that may apply if your organization operates internationally.
BAA Considerations: Why Self-Hosted Means No BAA {#baa-considerations}
A Business Associate Agreement is required when a third party handles PHI on behalf of a covered entity. The key word is "third party."
When you run Ollama on your own hardware:
- Your IT staff maintains the server (they are workforce members, not business associates)
- No external company processes, stores, or transmits PHI
- No cloud provider touches the data
- No model provider receives your prompts
Result: no BAA is needed for the AI component.
You still need BAAs for other vendors (your EHR, your ISP if they can access PHI, your backup provider if they handle PHI). But the AI layer is entirely within your covered entity's control.
This is a significant advantage over cloud AI solutions, where the BAA negotiation alone can take months, legal review costs thousands of dollars, and the resulting agreement often limits your liability recourse.
Implementation Checklist {#implementation-checklist}
Before going live, verify every item:
Technical Controls
- LUKS encryption on all data volumes (verify with `sudo cryptsetup status ai_data`)
- TLS 1.2+ on all network connections (test with `openssl s_client -connect ai.clinic.internal:443`)
- Ollama port not exposed externally (verify with `nmap -p 11434 <server-ip>`)
- LDAP authentication working with individual accounts
- Session timeout configured (15 minutes recommended)
- Automatic logoff on terminal lock
- Audit logging capturing all interactions
- Log integrity checksums generating daily
- Firewall blocking all outbound internet traffic
- USB ports disabled or controlled (prevent unauthorized data extraction)
Administrative Controls
- AI acceptable use policy written and signed by all users
- Training completed: what PHI can and cannot be entered
- Incident response plan updated to include AI system
- Risk assessment documented and filed
- Minimum necessary guidelines published for AI prompts
- Regular access review schedule established (quarterly recommended)
Physical Controls
- Server in locked room with badge access
- Visitor log maintained for server room
- Environmental controls (temperature, humidity, fire suppression)
- UPS/backup power for graceful shutdown
Ongoing Compliance: What to Monitor {#ongoing-compliance}
Setting up the system is half the work. Maintaining compliance requires ongoing vigilance:
Weekly: Review audit logs for anomalous access patterns. Look for off-hours usage, unusually large prompts (someone pasting entire records), or access from unexpected departments.
Monthly: Verify encryption status, check for failed login attempts, review user access lists against current staff roster, apply security patches.
Quarterly: Conduct risk assessment review, test backup restoration, verify log retention compliance, update AI acceptable use policy if needed.
Annually: Full HIPAA security audit, penetration test of the AI system, staff retraining, review of model performance and appropriateness.
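Parts of the weekly review can be scripted. Here is an illustrative anomaly scan over the JSONL audit log; the field names follow the example log entry earlier in this guide, and the thresholds and working hours are assumptions to tune for your clinic:

```python
import json
from datetime import datetime

def flag_anomalies(jsonl_lines, max_prompt_chars=5000, work_hours=(7, 19)):
    """Flag audit entries made outside working hours or with oversized prompts.

    Oversized prompts often mean someone pasted an entire patient record,
    which violates the minimum necessary standard.
    """
    flagged = []
    for line in jsonl_lines:
        entry = json.loads(line)
        hour = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%SZ").hour
        reasons = []
        if not (work_hours[0] <= hour < work_hours[1]):
            reasons.append("off_hours")
        if entry.get("prompt_length_chars", 0) > max_prompt_chars:
            reasons.append("oversized_prompt")
        if reasons:
            flagged.append((entry["user_id"], reasons))
    return flagged
```

Flagged entries are a starting point for human review, not automatic violations; night-shift staff will legitimately trip the off-hours rule, which is why the schedule is a parameter.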
For monitoring infrastructure, the Ubuntu AI workstation setup guide covers Prometheus and Grafana configuration that adapts well to healthcare monitoring requirements.
Conclusion
Running AI locally is not just a nice-to-have for healthcare — it is the only architecture that gives you full control over PHI handling. Cloud AI creates a permanent dependency on third-party compliance that you cannot fully verify. Local AI puts compliance back in your hands.
The setup is not trivial. Encryption, access control, audit logging, network isolation, and ongoing monitoring all require real engineering work. But the result is an AI assistant that your clinical staff can use without creating regulatory exposure — and without sending patient data to anyone else's servers.
Start with a single use case (clinical note summarization is the easiest win), prove the value, then expand. And remember: the AI assists. The clinician decides.
Want to build the infrastructure behind this guide? Start with the Ollama + Open WebUI Docker setup, then layer on the security controls described here.