Local AI + Home Assistant: Build a Private Smart Home
Published on April 11, 2026 — 22 min read
My smart home setup records when I wake up, when I leave, when I come back, which rooms I use, what temperature I prefer, and when I go to sleep. That is an intimate behavioral profile. Sending it to Amazon, Google, or Apple's cloud for AI processing is a non-starter.
Home Assistant already runs locally. Adding Ollama gives it natural language understanding and intelligent automation — without leaking your household patterns to cloud servers. I can say "movie night" and the house dims the living room lights, closes the blinds, sets the TV input, and adjusts the thermostat. All processed on a Raspberry Pi 5 sitting next to my router.
This guide covers the complete integration: installing Ollama alongside Home Assistant, configuring the built-in OpenAI-compatible integration, building natural language automations, and creating AI-powered routines for energy and presence detection.
Why Smart Home Data Needs to Stay Local {#why-local}
Smart home data is uniquely sensitive. Unlike browsing history or email (which you consciously create), smart home sensors passively record your physical life:
- Occupancy patterns: Motion sensors reveal when your home is empty — useful for burglars.
- Sleep schedule: Bedroom lights and motion expose when you wake and sleep.
- Health indicators: Bathroom sensor frequency, medication cabinet openings, activity levels.
- Financial patterns: Energy usage correlates with income and lifestyle.
- Relationship data: Who visits, how often, when they leave.
Every cloud-connected smart home assistant (Alexa, Google Home, Apple HomePod) uploads this data for processing. Home Assistant keeps it local by default. Adding Ollama keeps the AI processing local too. For a broader perspective on AI privacy, see our local AI privacy guide.
The Architecture {#architecture}
The stack is straightforward:
- Home Assistant — Runs your smart home (devices, automations, dashboards)
- Ollama — Runs AI models for natural language understanding
- Ollama integration — Home Assistant's built-in conversation integration that connects directly to your local Ollama server
Ollama listens on http://localhost:11434 and also exposes an OpenAI-compatible API at http://localhost:11434/v1, so Home Assistant — and any tool that speaks OpenAI's protocol — can talk to it, while the traffic never leaves your LAN.
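To make the protocol concrete, here is a minimal Python sketch of the same chat-completions call (the URL assumes a default local install; the model name is whatever you pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"  # default local endpoint

def build_chat_request(model: str, text: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "stream": False,
    }

def ask(model: str, text: str) -> str:
    """POST the request to the local Ollama endpoint and return the reply text."""
    payload = json.dumps(build_chat_request(model, text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Any OpenAI client library works the same way: point its base URL at http://localhost:11434/v1 and pass any placeholder API key (Ollama ignores it).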
Hardware Options
| Setup | AI Performance | Power Draw | Cost |
|---|---|---|---|
| Raspberry Pi 5 (8GB) running both HA + Ollama | 3B models at 8 tok/s | 12W | $80 |
| Pi 5 (HA) + Mini PC (Ollama) | 7B models at 15 tok/s | 25W total | $200 |
| Pi 5 (HA) + Desktop GPU server | 14B+ models at 35 tok/s | 80W | $500+ |
| Single mini PC running both | 7B models at 12 tok/s | 20W | $300 |
For most homes, a Raspberry Pi 5 with 8GB RAM handles both Home Assistant and a 3B model (Llama 3.2 3B or Phi-3 Mini). Responses take 2-4 seconds, which is fine for voice commands and automations. If you want faster responses or larger models, run Ollama on a separate machine on the same network.
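The 2-4 second figure falls straight out of the token rate. A back-of-envelope sketch, assuming a short voice-command reply of roughly 20 tokens:

```python
def response_time(reply_tokens: int, tokens_per_second: float) -> float:
    """Rough generation time for a reply, ignoring prompt-processing overhead."""
    return reply_tokens / tokens_per_second

# ~20-token reply on a Pi 5 running a 3B model at 8 tok/s:
print(round(response_time(20, 8.0), 1))   # 2.5 seconds
# Same reply on a mini PC running a 7B model at 15 tok/s:
print(round(response_time(20, 15.0), 1))  # 1.3 seconds
```

Real latency adds prompt-processing time on top, which grows with prompt length — one more reason to keep the list of exposed entities short.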
For detailed hardware requirements, check our Ollama system requirements guide. If you are considering a Raspberry Pi setup, our LLM on Raspberry Pi 5 guide covers installation and optimization.
Step 1: Install Ollama {#install-ollama}
On Raspberry Pi 5 (Same Machine as HA)
```shell
# SSH into the machine running Home Assistant
# (on Home Assistant OS, use the Terminal & SSH add-on)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model that fits in 8GB RAM
ollama pull llama3.2:3b

# For a Pi 5 with only 4GB, use a smaller model
ollama pull phi3:3.8b

# Verify it's running
ollama list
```
On a Separate Machine (Recommended for Better Performance)
If you run Home Assistant on a Pi and have a mini PC or desktop with more RAM:
```shell
# On the separate machine, install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a larger model for better natural language understanding
ollama pull llama3.2      # 8B model, needs ~6GB RAM
ollama pull qwen2.5:7b    # good for complex automations

# Allow network access from other machines:
# edit Ollama's systemd service
sudo systemctl edit ollama.service
```

Add this to the override file:

```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
```

Then reload and restart:

```shell
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Verify it's accessible from your network
curl http://YOUR_OLLAMA_IP:11434/api/tags
```
Step 2: Configure Home Assistant Integration {#configure-ha}
Home Assistant includes a built-in integration called "Ollama" that connects directly to your local Ollama instance. Since Home Assistant 2024.7 it has shipped as a first-class integration.
Setup via UI
- Go to Settings → Devices & Services → Add Integration
- Search for "Ollama"
- Enter the Ollama URL:
  - Same machine: `http://localhost:11434`
  - Separate machine: `http://192.168.1.X:11434` (use your Ollama machine's IP)
- Select your model (e.g., `llama3.2:3b`)
- Click Submit
Setup via configuration.yaml
The Ollama integration itself is set up through the UI and does not accept options in configuration.yaml. If you keep your config in YAML, the only related key is the conversation component it plugs into, which is already enabled by default:

```yaml
# configuration.yaml
# The Ollama integration is configured via the UI
# (Settings → Devices & Services), not here.
conversation:
```
Verify the Connection
After adding the integration, go to Developer Tools → Services and call conversation.process:

```yaml
service: conversation.process
data:
  text: "What time is it?"
```
If Ollama responds, the integration is working.
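You can also fire the same service from outside Home Assistant through its REST API — handy for scripted smoke tests. A sketch (the host and token are placeholders for your instance; `/api/services/conversation/process` is the standard service-call endpoint):

```python
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"  # adjust to your instance
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"      # created under your HA user profile

def build_request(text: str) -> urllib.request.Request:
    """Build the authenticated service-call request for conversation.process."""
    return urllib.request.Request(
        f"{HA_URL}/api/services/conversation/process",
        data=json.dumps({"text": text}).encode(),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )

def call_conversation(text: str) -> dict:
    """Send the request and return Home Assistant's JSON response."""
    with urllib.request.urlopen(build_request(text), timeout=30) as resp:
        return json.load(resp)
```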
Step 3: Natural Language Device Control {#voice-control}
With the Ollama conversation agent active, you can control devices using natural language from the Home Assistant Assist dialog.
Basic Commands
Open Assist (the microphone icon in the HA header) and try:
- "Turn on the living room lights"
- "Set the thermostat to 72 degrees"
- "Lock the front door"
- "What is the temperature in the bedroom?"
- "Is the garage door open?"
- "Turn off all lights"
The AI understands context and synonyms. "Kill the lights" works the same as "turn off the lights." "Make it warmer" increases the thermostat by a few degrees.
Exposing Entities to the AI
By default, Home Assistant does not expose every entity to the conversation agent. You need to explicitly choose which devices the AI can see and control:
- Go to Settings → Voice assistants
- Click on your Ollama agent
- Under Exposed entities, select which entities the AI can access
Be selective. Exposing 200 entities makes the AI slower and less accurate. Start with the entities you actually want to control by voice — lights, thermostats, locks, and media players. Add more as needed.
Custom Sentences and Intents
For multi-phrase scene commands, the simplest route is a conversation trigger, which matches spoken sentences directly (custom_sentences intent files are an alternative when you want reusable intents shared across automations):

```yaml
# automations.yaml
- alias: "Movie Night Scene"
  trigger:
    - platform: conversation
      command:
        - "movie night"
        - "start movie mode"
        - "cinema mode"
        - "time for a movie"
  action:
    - service: light.turn_on
      target:
        entity_id: light.living_room
      data:
        brightness_pct: 30
        color_temp_kelvin: 2700
    - service: cover.close_cover
      target:
        entity_id: cover.living_room_blinds
    - service: media_player.turn_on
      target:
        entity_id: media_player.tv
    - service: media_player.select_source
      target:
        entity_id: media_player.tv
      data:
        source: "HDMI 1"
    - service: climate.set_temperature
      target:
        entity_id: climate.living_room
      data:
        temperature: 72
```
Now saying "movie night" triggers a complete scene: lights dim to 30% warm white, blinds close, TV turns on to the right input, and the thermostat adjusts.
Step 4: AI-Powered Automations {#ai-automations}
Beyond simple voice commands, Ollama enables automations that require reasoning — something traditional rule-based automations cannot do.
Energy Optimization
Use AI to analyze your energy usage and make recommendations:
```yaml
# automations.yaml
- alias: "AI Energy Analysis"
  trigger:
    - platform: time
      at: "06:00:00"  # run every morning
  action:
    - service: conversation.process
      data:
        agent_id: conversation.ollama
        text: >
          Analyze today's energy plan. Current weather forecast shows
          {{ states('weather.home') }} with high of
          {{ state_attr('weather.home', 'forecast')[0].temperature }}F.
          Current electricity rate is {{ states('sensor.electricity_rate') }}/kWh.
          Yesterday's total consumption was {{ states('sensor.daily_energy') }} kWh.
          Suggest optimal thermostat schedule to minimize cost while
          maintaining comfort. Should I pre-cool or pre-heat?
      response_variable: energy_advice
    - service: notify.mobile_app
      data:
        title: "Daily Energy Plan"
        message: "{{ energy_advice.response.speech.plain.speech }}"
```
Presence-Based Scene Management
- alias: "AI Welcome Home"
trigger:
- platform: state
entity_id: person.john
from: "not_home"
to: "home"
condition:
- condition: time
after: "17:00:00"
before: "23:00:00"
action:
- service: conversation.process
data:
agent_id: conversation.ollama
text: >
John just arrived home. It's {{ now().strftime('%I:%M %p') }}.
Outside temperature is {{ states('sensor.outdoor_temperature') }}F.
Inside temperature is {{ states('sensor.indoor_temperature') }}F.
The following lights are on: {{ states.light | selectattr('state','eq','on') | map(attribute='entity_id') | list }}.
Based on the time and conditions, what lights should I turn on
and what should the thermostat be set to?
Reply with ONLY a JSON object like:
{"lights": ["light.living_room", "light.kitchen"], "brightness": 80, "thermostat": 71}
response_variable: ai_response
# Parse and execute the AI's suggestion
- service: script.execute_welcome_scene
data:
config: "{{ ai_response.response.speech.plain.speech }}"
Anomaly Detection
- alias: "AI Security Check"
trigger:
- platform: time_pattern
hours: "/1" # Every hour
condition:
- condition: state
entity_id: group.family
state: "not_home"
action:
- service: conversation.process
data:
agent_id: conversation.ollama
text: >
Nobody is home. Check these sensor states for anomalies:
Front door: {{ states('binary_sensor.front_door') }}
Back door: {{ states('binary_sensor.back_door') }}
Garage: {{ states('binary_sensor.garage_door') }}
Living room motion: {{ states('binary_sensor.living_room_motion') }}
Kitchen motion: {{ states('binary_sensor.kitchen_motion') }}
Basement motion: {{ states('binary_sensor.basement_motion') }}
Doorbell last ring: {{ states('sensor.doorbell_last_ring') }}
Is anything unusual? If motion is detected while nobody is home,
respond with "ALERT" followed by the concern. Otherwise respond "OK".
response_variable: security_check
- choose:
- conditions:
- condition: template
value_template: "{{ 'ALERT' in security_check.response.speech.plain.speech }}"
sequence:
- service: notify.mobile_app
data:
title: "Security Alert"
message: "{{ security_check.response.speech.plain.speech }}"
data:
push:
sound: "alarm.caf"
Model Selection for Home Assistant {#model-selection}
Different tasks need different models. Here is what I tested:
| Task | Best Model | Why | Response Time (Pi 5) |
|---|---|---|---|
| Simple voice commands | Llama 3.2 3B | Fast, understands basic intents | 1.5s |
| Complex automations | Qwen 2.5 7B | Better reasoning about conditions | 4.2s |
| JSON output generation | Llama 3.2 3B | Reliable structured output | 1.8s |
| Energy analysis | DeepSeek R1 7B | Strong reasoning about optimization | 5.1s |
| General Q&A about home | Llama 3.2 3B | Good general knowledge | 2.0s |
For a Raspberry Pi 5 with 8 GB, Llama 3.2 3B is the sweet spot — fast enough for real-time voice commands and smart enough for most automation logic. If you run Ollama on a separate machine with 16+ GB RAM, Qwen 2.5 7B handles complex reasoning scenarios noticeably better.
On a Pi 5 with only 4 GB, use Phi-3 Mini 3.8B in Q4 quantization. It fits in memory and responds in about 3 seconds for simple commands.
Switching Models Per Automation
You can configure multiple Ollama integrations in Home Assistant with different models:
- Add a second Ollama integration (Settings → Devices & Services → Add Integration → Ollama)
- Point it to the same Ollama URL but select a different model
- Use the specific agent_id in each automation
```yaml
# Fast model for voice commands
- service: conversation.process
  data:
    agent_id: conversation.ollama_fast   # llama3.2:3b
    text: "{{ trigger.sentence }}"

# Smart model for energy analysis
- service: conversation.process
  data:
    agent_id: conversation.ollama_smart  # qwen2.5:7b
    text: "Analyze energy usage..."
```
Practical Automations You Can Use Today {#practical-automations}
Morning Briefing
- alias: "AI Morning Briefing"
trigger:
- platform: state
entity_id: binary_sensor.bedroom_motion
to: "on"
condition:
- condition: time
after: "05:30:00"
before: "09:00:00"
- condition: state
entity_id: input_boolean.morning_briefing_done
state: "off"
action:
- service: input_boolean.turn_on
entity_id: input_boolean.morning_briefing_done
- service: conversation.process
data:
agent_id: conversation.ollama
text: >
Create a brief morning summary:
- Weather: {{ states('weather.home') }}, {{ state_attr('weather.home', 'temperature') }}F
- Calendar: {{ states('sensor.next_calendar_event') }}
- Commute: {{ states('sensor.commute_time') }} minutes
- Energy yesterday: {{ states('sensor.daily_energy') }} kWh
Keep it under 4 sentences.
response_variable: briefing
- service: tts.speak
target:
entity_id: media_player.kitchen_speaker
data:
message: "{{ briefing.response.speech.plain.speech }}"
- service: input_boolean.turn_off
entity_id: input_boolean.morning_briefing_done
# Reset at midnight via another automation
Smart Thermostat Logic
Instead of static schedules, let AI adapt to conditions:
- alias: "AI Thermostat Adjustment"
trigger:
- platform: time_pattern
minutes: "/30" # Every 30 minutes
action:
- service: conversation.process
data:
agent_id: conversation.ollama
text: >
Current conditions:
- Indoor temp: {{ states('sensor.indoor_temperature') }}F
- Outdoor temp: {{ states('sensor.outdoor_temperature') }}F
- Humidity: {{ states('sensor.indoor_humidity') }}%
- People home: {{ states('sensor.people_count') }}
- Time: {{ now().strftime('%I:%M %p') }}
- Current setpoint: {{ state_attr('climate.thermostat', 'temperature') }}F
- Electricity rate: {{ states('sensor.electricity_rate') }}/kWh
Based on these conditions, recommend a thermostat setpoint.
Consider: energy cost, comfort, and time of day.
Respond with ONLY a number (the temperature in F).
response_variable: temp_suggestion
- service: climate.set_temperature
target:
entity_id: climate.thermostat
data:
temperature: "{{ temp_suggestion.response.speech.plain.speech | float }}"
Goodnight Routine
- alias: "AI Goodnight Check"
trigger:
- platform: conversation
command:
- "goodnight"
- "going to bed"
- "bedtime"
action:
- service: conversation.process
data:
agent_id: conversation.ollama
text: >
Running bedtime checklist:
- Front door locked: {{ states('lock.front_door') }}
- Back door locked: {{ states('lock.back_door') }}
- Garage closed: {{ states('cover.garage_door') }}
- Stove off: {{ states('switch.stove_monitor') }}
- Windows: {{ states('binary_sensor.windows') }}
- Lights still on: {{ states.light | selectattr('state','eq','on') | map(attribute='name') | list }}
Report any issues. If everything looks good, confirm all clear.
response_variable: checklist
# Turn off all lights except bedroom
- service: light.turn_off
target:
entity_id: all
- service: light.turn_on
target:
entity_id: light.bedroom
data:
brightness: 10
color_temp_kelvin: 2200
# Lock any unlocked doors
- service: lock.lock
target:
entity_id:
- lock.front_door
- lock.back_door
# Report status
- service: tts.speak
target:
entity_id: media_player.bedroom_speaker
data:
message: "{{ checklist.response.speech.plain.speech }}"
Running on a Raspberry Pi 5 {#raspberry-pi}
The Pi 5 is the most common Home Assistant hardware. Running Ollama alongside it requires some tuning.
Memory Management
With 8 GB total, Home Assistant uses about 1.5 GB, leaving 6.5 GB for Ollama. Llama 3.2 3B in Q4 quantization needs about 2.1 GB of RAM during inference. That leaves 4.4 GB of headroom for other services.
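The headroom figure is simple subtraction, worth keeping in a scratch script when you size models:

```python
TOTAL_RAM_GB = 8.0
HA_USAGE_GB = 1.5      # typical Home Assistant footprint
MODEL_USAGE_GB = 2.1   # Llama 3.2 3B at Q4 during inference

headroom = TOTAL_RAM_GB - HA_USAGE_GB - MODEL_USAGE_GB
print(round(headroom, 1))  # 4.4 GB left for the OS and add-ons
```

A 7B model at Q4 needs roughly 4-5 GB, which is why this guide pushes larger models onto a separate machine.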
```shell
# Check available memory
free -h

# Monitor Ollama memory usage
watch -n 5 'ollama ps'

# Unload models after 5 minutes of inactivity to free RAM when AI is idle.
# Note: if Ollama runs as a systemd service, set this via
# `systemctl edit ollama.service` rather than a shell export.
export OLLAMA_KEEP_ALIVE=5m
```
Performance Optimization
```shell
# Ensure the Pi 5 runs at maximum clock speed during AI inference
echo "performance" | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Increase swap to handle occasional spikes
sudo dphys-swapfile swapoff
sudo sed -i 's/CONF_SWAPSIZE=.*/CONF_SWAPSIZE=2048/' /etc/dphys-swapfile
sudo dphys-swapfile setup
sudo dphys-swapfile swapon
```
Benchmark: Pi 5 Response Times
I measured end-to-end response time (from Assist command to action execution):
| Command Type | Model | Response Time |
|---|---|---|
| "Turn on lights" | Llama 3.2 3B Q4 | 1.5s |
| "Set thermostat to 72" | Llama 3.2 3B Q4 | 1.8s |
| "Movie night" (scene) | Llama 3.2 3B Q4 | 2.1s |
| "Is anyone home?" (sensor query) | Llama 3.2 3B Q4 | 2.4s |
| Energy analysis | Llama 3.2 3B Q4 | 4.8s |
| Security check (multi-sensor) | Llama 3.2 3B Q4 | 3.2s |
All under 5 seconds. For voice commands, 1.5-2.5 seconds feels responsive. The energy analysis is slower but runs as a background automation, so latency does not matter.
For a complete guide to running models on Pi hardware, see our Raspberry Pi 5 LLM guide.
Voice Control Setup {#voice-setup}
Home Assistant has built-in voice pipeline support. Combined with Ollama, you get a fully local voice assistant.
Option 1: Phone App (Easiest)
The Home Assistant mobile app includes a microphone button for Assist. Tap it, speak your command, and the AI processes it through Ollama. No additional hardware needed.
Option 2: Satellite Speakers
For room-by-room voice control, set up voice satellites:
```yaml
# ESPHome voice satellite configuration (illustrative — adjust the
# board and GPIO pins to your hardware)
# Flash this to an ESP32-S3 with an I2S/PDM microphone
esphome:
  name: kitchen-voice

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

microphone:
  - platform: i2s_audio
    id: mic
    i2s_din_pin: GPIO16
    channel: left
    pdm: true

voice_assistant:
  microphone: mic
  use_wake_word: true
  on_wake_word_detected:
    - light.turn_on:
        id: led
  on_stt_end:
    - light.turn_off:
        id: led
```
Option 3: Wyoming + Whisper (Complete Local Pipeline)
For a fully private voice pipeline (wake word + speech-to-text + AI + text-to-speech):
```shell
# Install Wyoming Whisper (speech-to-text)
docker run -d \
  --name wyoming-whisper \
  -p 10300:10300 \
  -v whisper-data:/data \
  rhasspy/wyoming-whisper \
  --model small-int8 \
  --language en

# Install Wyoming Piper (text-to-speech)
docker run -d \
  --name wyoming-piper \
  -p 10200:10200 \
  -v piper-data:/data \
  rhasspy/wyoming-piper \
  --voice en_US-lessac-medium

# Install openWakeWord (wake word detection)
docker run -d \
  --name wyoming-openwakeword \
  -p 10400:10400 \
  rhasspy/wyoming-openwakeword
```
Configure in HA: Settings → Voice assistants → Add assistant → Select Wyoming services as STT, TTS, and wake word providers. Set Ollama as the conversation agent.
Now you have a complete local voice pipeline: wake word detection → Whisper speech-to-text → Ollama reasoning → Piper text-to-speech. No cloud involved at any stage.
Security Considerations {#security}
Network Isolation
If Ollama runs on a separate machine, restrict access:
```shell
# Only allow Home Assistant's IP (here 192.168.1.100) to access Ollama
sudo iptables -A INPUT -p tcp --dport 11434 -s 192.168.1.100 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 11434 -j DROP

# OLLAMA_ORIGINS restricts browser CORS origins — useful, but it does not
# block server-to-server requests; rely on the firewall rule for isolation
export OLLAMA_ORIGINS="http://192.168.1.100:8123"
```
Prompt Injection Prevention
When passing sensor data to the AI, sanitize the values:
```yaml
# BAD — a free-text sensor value could carry injected instructions
text: "The doorbell name is {{ states('sensor.doorbell_name') }}"

# BETTER — coerce the value to the expected format
text: >
  The temperature is {{ states('sensor.temperature') | float(0) }}F.
  Only respond with a temperature number.
```
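For sensors that are inherently free text (device names, calendar titles, doorbell transcripts), an allow-list sanitizer is safer still. A Python sketch — the character set here is an illustrative assumption, tune it to your sensors:

```python
import re

def sanitize(value: str, max_len: int = 64) -> str:
    """Strip characters that could smuggle instructions into a prompt."""
    value = re.sub(r"[\r\n]+", " ", value)       # no multi-line payloads
    value = re.sub(r"[^\w\s.,:%°-]", "", value)  # drop quotes, braces, etc.
    return value[:max_len].strip()

print(sanitize("72.5"))                                   # 72.5
print(sanitize('Ignore previous instructions\n{"x":1}'))  # braces/quotes removed
```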
For automations that control locks, doors, or security systems, always add a confirmation step rather than blindly executing AI suggestions.
Best Models for Security Tasks
Run security-critical automations with the sampling temperature set to 0 (via the integration's model options or a custom Modelfile) so responses are deterministic. Never use high-temperature, creative settings for automations that control physical access.
Monitoring and Debugging {#monitoring}
Check AI Response Quality
```yaml
# Log all AI interactions for review
- alias: "Log AI Conversations"
  trigger:
    - platform: event
      event_type: conversation_agent_response
  action:
    - service: logbook.log
      data:
        name: "AI Response"
        message: >
          Input: {{ trigger.event.data.user_input }}
          Response: {{ trigger.event.data.response }}
          Agent: {{ trigger.event.data.agent_id }}
```
Ollama Health Check
```yaml
# Alert if Ollama goes down
- alias: "Ollama Health Monitor"
  trigger:
    - platform: time_pattern
      minutes: "/5"
  action:
    - service: rest_command.check_ollama

# rest_command in configuration.yaml:
# check_ollama:
#   url: "http://localhost:11434/api/tags"
#   method: GET
```
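If you prefer to monitor from outside Home Assistant (cron, a monitoring box), the same /api/tags endpoint makes a one-function health check. A sketch in Python:

```python
import json
import urllib.error
import urllib.request

def ollama_is_up(base_url: str = "http://localhost:11434",
                 timeout: float = 5.0) -> bool:
    """Return True if the Ollama API answers /api/tags with a model list."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return "models" in json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Wire its exit status into whatever alerting you already use — the point is that the check, like everything else here, stays on your LAN.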
Frequently Asked Questions
Q: Can Home Assistant talk to Ollama without any cloud service?
A: Yes. Home Assistant's built-in Ollama integration connects directly to your local Ollama instance over your LAN. No internet connection is required. Voice processing can also be fully local using Wyoming + Whisper + Piper.
Q: Will a Raspberry Pi 5 handle both Home Assistant and Ollama?
A: Yes, with the 8GB model. Llama 3.2 3B in Q4 quantization uses about 2.1 GB of RAM. Home Assistant uses about 1.5 GB. That leaves 4+ GB of headroom. Response times for simple commands are 1.5-2 seconds — very usable for daily voice control.
Q: Is the AI smart enough to control my home reliably?
A: For simple commands (lights, thermostat, locks), Llama 3.2 3B is very reliable — over 95% accuracy in my testing. For complex reasoning (energy optimization, anomaly detection), a 7B model on separate hardware performs significantly better. Start simple and expand.
Q: Can I use this for security automations?
A: Yes, but with caution. Use AI for monitoring and alerts, not direct control of locks or alarms. Always add confirmation steps for security-critical actions. Set temperature to 0 for deterministic responses. Log all AI-triggered security actions for audit.
Q: What about latency for voice commands?
A: On a Pi 5 with Llama 3.2 3B, voice commands complete in 1.5-2.5 seconds. On a separate machine with a 7B model, 2-4 seconds. On a GPU-equipped machine, under 1 second. All are faster than waiting for a cloud round-trip during an internet outage.
Q: Does this work with all Home Assistant devices?
A: The AI can control any device exposed through Home Assistant, including Zigbee, Z-Wave, WiFi, Matter, and Thread devices. You control which entities the AI can access through the exposed entities setting.