Local AI Home Security: Analyze Cameras Privately Without the Cloud
Published on April 23, 2026 • 19 min read
I disconnected my Ring doorbell on a Tuesday evening in 2024 after the third "subscription required to view event" pop-up that month. By Saturday I had built a replacement that does more than Ring ever did, runs entirely on my own hardware, and costs about $4 per month in electricity. It tells me, in plain English, "the Amazon driver dropped a package and left at 3:42 PM" instead of just "motion detected." That difference between cloud security cameras and a properly built local stack is the whole point of this guide.
What follows is the exact build I run at home and have deployed at three friends' houses. Two RTSP cameras minimum, one Coral USB accelerator, an old desktop, and a vision LLM doing the actual reasoning. Total parts cost under $400 if you buy used.
Quick Start: Detect a Person in 10 Minutes {#quick-start}
# 1. Spin up Frigate (the NVR that handles motion + object detection)
docker run -d \
--name frigate \
--gpus all \
-v /opt/frigate/config:/config \
-v /opt/frigate/storage:/media/frigate \
-p 5000:5000 -p 8554:8554 \
--device /dev/bus/usb \
ghcr.io/blakeblackshear/frigate:stable
# 2. Pull a vision model into Ollama for descriptive alerts
ollama pull llava:7b
# 3. Test it on a snapshot
curl http://localhost:11434/api/generate -d '{
"model": "llava:7b",
"prompt": "Describe what is happening in this image in one sentence.",
"images": ["'$(base64 -w0 ./driveway.jpg)'"],
"stream": false
}'
That returns something like "A delivery driver in a brown uniform is placing a package on the front steps." Wire it into Home Assistant and you have alerts your aunt can read.
Table of Contents
- Why Cloud Security Cameras Are Broken
- The Local Stack Architecture
- Hardware: What to Buy
- Frigate: Real-Time Object Detection
- Ollama Vision: Descriptive AI Alerts
- Home Assistant Integration
- Privacy and Compliance
- Storage and Retention
- Pitfalls and Fixes
- Cloud vs Local Cost Comparison
- FAQs
Why Cloud Security Cameras Are Broken {#why-broken}
The pitch for Ring, Nest, and Arlo is "smart cameras." The reality is:
- Subscription paywalls for basic features. Ring's package detection requires Ring Protect ($4-$20/month). Nest's intelligent alerts require Nest Aware ($8-$15/month).
- Your video lives on someone else's hard drive. Amazon owns Ring. Google owns Nest. Both have shared footage with law enforcement without warrants in documented cases.
- AI happens in the cloud. Every snapshot is uploaded for analysis. The "AI" is a remote API the camera depends on staying online.
- Vendor death. Wyze moved previously free features behind an $8/month subscription in 2025. Insteon went bankrupt and bricked thousands of devices.
A local AI security stack flips every one of these. Hardware you own. Footage that never leaves your network. AI that runs whether your ISP is up or not. Zero recurring cost.
For the broader case for self-hosting, see our GDPR-compliant local AI post — the same arguments apply to camera footage.
The Local Stack Architecture {#architecture}
+------------+      RTSP      +------------+
|  Cameras   | -------------> |  Frigate   |
+------------+                | (NVR + CV) |
                              +------------+
                                 |      ^
                            MQTT |      | Coral / GPU
                                 v      |
+------------+                +------------+
|     HA     | <------------> |   Ollama   |
| (alerts &  |      HTTP      |  (vision   |
|  control)  |                |    LLM)    |
+------------+                +------------+
      |
      v
 push, SMS, email
Five components:
- Cameras speaking RTSP (any modern PoE camera).
- Frigate for fast object detection (1-50 ms per frame).
- Coral USB Accelerator to run Frigate's detector at full FPS on a low-power CPU.
- Ollama running a vision model (LLaVA, Qwen2.5-VL, MiniCPM-V) for natural-language scene descriptions.
- Home Assistant as the brain wiring it all together and routing notifications.
Frigate is the realtime layer. It alerts in milliseconds when a "person" or "car" enters a zone. Ollama is the slower, smarter layer. When Frigate fires an event, Home Assistant grabs the snapshot, sends it to Ollama with a prompt, and the AI returns a description that goes in the push notification.
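The handoff between the two layers is easy to prototype outside Home Assistant. The sketch below is a plain-Python illustration (no MQTT library) of the filtering decision: Frigate publishes JSON events with a `type` field (`new`, `update`, `end`), and only new events on watched camera/label pairs should trigger the slower vision pass. The camera and label names here are examples, not a required configuration.

```python
import json

# Frigate publishes JSON events to the `frigate/events` MQTT topic.
# This helper decides whether an event should trigger the slower
# Ollama description pass. Camera/label pairs are illustrative.

WATCHED = {("front_door", "person"), ("front_door", "package"),
           ("driveway", "person"), ("driveway", "car")}

def should_describe(payload: str) -> bool:
    """Return True only for new events on camera/label pairs we care about."""
    event = json.loads(payload)
    if event.get("type") != "new":   # ignore 'update' and 'end' messages
        return False
    after = event.get("after", {})
    return (after.get("camera"), after.get("label")) in WATCHED

sample = json.dumps({"type": "new",
                     "after": {"camera": "front_door", "label": "person"}})
print(should_describe(sample))  # True
```

Wiring this into a real MQTT subscriber is a few more lines with any client library; the point is that the cheap filter runs on every event and the expensive vision call runs on almost none.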
Hardware: What to Buy {#hardware}
Core compute
| Component | Spec | Cost |
|---|---|---|
| Server | Used Dell OptiPlex 7060 SFF (i5-8500, 16GB RAM) | $120 |
| GPU | NVIDIA RTX 3060 12GB (used) | $200 |
| Coral USB Accelerator | Google's Edge TPU | $60 |
| 6 TB Surveillance HDD | WD Purple or Seagate Skyhawk | $130 |
| Subtotal | | $510 |
The GPU is for Ollama vision inference; the Coral USB handles Frigate's object detection. Together they free your CPU to handle the rest of the home automation load. If you do not need descriptive AI alerts, skip the GPU and just run Frigate + Coral for $310 total.
Cameras
| Model | Spec | Cost |
|---|---|---|
| Reolink RLC-810A | 4K, PoE, RTSP | $90 each |
| Amcrest IP4M-1051EW | 4MP, PoE, RTSP | $80 each |
| Hikvision DS-2CD2143G2-I | 4MP, PoE, RTSP, low-light king | $130 each |
Two principles when buying cameras: must speak RTSP (no cloud-only cameras) and PoE preferred over WiFi (more reliable, single cable for power and data).
For the broader hardware case, our budget local AI machine post covers the same OptiPlex/refurbished-business-PC pattern.
Frigate: Real-Time Object Detection {#frigate}
Frigate is the open-source NVR with built-in object detection. It is the fastest local CV pipeline I have used for security cameras, and it runs about 8-15 ms per frame with a Coral USB Accelerator.
Minimal config
Save as /opt/frigate/config/config.yml:
mqtt:
  host: 192.168.1.10
  user: frigate
  password: !secret mqtt_password

detectors:
  coral:
    type: edgetpu
    device: usb

cameras:
  driveway:
    ffmpeg:
      inputs:
        - path: rtsp://admin:pass@192.168.1.50:554/h264Preview_01_main
          roles:
            - record
            - detect
    detect:
      width: 1280
      height: 720
      fps: 5
    objects:
      track:
        - person
        - car
        - dog
    record:
      enabled: true
      retain:
        days: 14
        mode: motion
  front_door:
    ffmpeg:
      inputs:
        - path: rtsp://admin:pass@192.168.1.51:554/h264Preview_01_main
          roles:
            - record
            - detect
    detect:
      width: 1280
      height: 720
      fps: 5
    objects:
      track:
        - person
        - package
Custom zones
Drawing zones on the camera image is the difference between getting alerts every time a leaf blows by and getting alerts only when someone actually steps onto the porch.
cameras:
  driveway:
    zones:
      driveway_main:
        coordinates: 0,720,1280,720,1280,400,0,400
        objects:
          - person
          - car
The numbers are pixel coordinates. Use Frigate's built-in zone editor to draw them visually.
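If you want to sanity-check coordinates outside Frigate, say while tuning a zone in a notebook, a standard ray-casting point-in-polygon test reproduces the membership check (Frigate itself evaluates the bottom center of the object's bounding box against the zone polygon). This helper is illustrative, not Frigate's code:

```python
def point_in_zone(x: float, y: float, coords: str) -> bool:
    """Ray-casting test: is (x, y) inside the polygon given as a
    flat Frigate-style coordinate string 'x1,y1,x2,y2,...'?"""
    vals = [float(v) for v in coords.split(",")]
    poly = list(zip(vals[::2], vals[1::2]))
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        # Count crossings of a horizontal ray from (x, y) with edge (j, i)
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

# The driveway_main zone from the config above
zone = "0,720,1280,720,1280,400,0,400"
print(point_in_zone(640, 600, zone))  # True: inside the lower band
print(point_in_zone(640, 100, zone))  # False: above the zone
```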
Coral vs GPU
A Coral USB Accelerator at $60 outperforms a $400 GPU for Frigate's detector workload. The Coral is purpose-built for INT8 inference on small models like SSD-MobileNet. Use the GPU only for Ollama; let Coral handle Frigate.
Ollama Vision: Descriptive AI Alerts {#ollama-vision}
This is the layer that turns "person detected" into "an Amazon driver placed a package on the porch and walked back to the truck."
Models worth running
| Model | Params | VRAM (Q4) | Latency on RTX 3060 | Best For |
|---|---|---|---|---|
| llava:7b | 7B | 5 GB | 1.2 s | General descriptions |
| llava:13b | 13B | 9 GB | 2.4 s | Sharper detail extraction |
| qwen2.5-vl:7b | 7B | 5.5 GB | 1.3 s | Best instruction following |
| minicpm-v:8b | 8B | 6 GB | 1.5 s | Strong on small objects (license plates) |
| moondream2 | 1.8B | 2 GB | 0.3 s | Sub-second alerts on cheap hardware |
For most homes, qwen2.5-vl:7b is the right balance. moondream2 is the speed king if you want sub-300 ms alerts and can accept slightly less detailed descriptions.
Prompt that works
The naive prompt ("describe this image") gives florid, useless paragraphs. The prompt I run in production:
You are a security camera analyst. Describe this image in ONE sentence.
Focus on: who is present, what they are doing, and any unusual behavior.
Do not describe the camera angle, time of day, or scenery.
If nothing notable is happening, reply only "Routine activity."
That gives outputs like:
- "A man in a dark hoodie is walking up the driveway carrying a backpack."
- "A USPS driver delivered a small package to the front door and returned to the truck."
- "Routine activity."
The "Routine activity" output is critical. Without it, the LLM hallucinates importance into every harmless event.
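Downstream automation should treat that sentinel as a drop signal before a notification ever fires. A tiny, self-contained filter does it; the normalization guards against casing and trailing-punctuation drift in model output:

```python
def alert_worthy(description: str) -> bool:
    """Drop the model's 'Routine activity.' sentinel (any casing,
    with or without trailing punctuation) before notifying."""
    normalized = description.strip().strip(".").lower()
    return normalized != "routine activity"

print(alert_worthy("Routine activity."))                       # False
print(alert_worthy("A man in a dark hoodie is at the door."))  # True
```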
Running the inference
import base64
import requests

SECURITY_PROMPT = (
    "You are a security camera analyst. Describe this image in ONE sentence. "
    "Focus on: who is present, what they are doing, and any unusual behavior. "
    "Do not describe the camera angle, time of day, or scenery. "
    'If nothing notable is happening, reply only "Routine activity."'
)

def describe(image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    r = requests.post("http://localhost:11434/api/generate", json={
        "model": "qwen2.5-vl:7b",
        "prompt": SECURITY_PROMPT,
        "images": [b64],
        "stream": False,
        "options": {"num_predict": 60, "temperature": 0.2}
    }, timeout=10)
    return r.json()["response"].strip()

print(describe("./snapshot.jpg"))
The 0.2 temperature keeps descriptions deterministic. num_predict: 60 caps the response at one sentence and saves another 200-400 ms.
For a deeper integration pattern, our Ollama Python API guide covers production usage including streaming and concurrent calls.
Home Assistant Integration {#home-assistant}
Home Assistant is the glue. Frigate's MQTT events trigger automations that call Ollama and route notifications.
Automation: AI-described front door alert
automation:
  - alias: "AI front door alert"
    trigger:
      - platform: mqtt
        topic: frigate/events
        value_template: "{{ value_json.type }}"
        payload: "new"
    condition:
      - "{{ trigger.payload_json['after']['camera'] == 'front_door' }}"
      - "{{ trigger.payload_json['after']['label'] in ['person', 'package'] }}"
    action:
      - variables:
          event_id: "{{ trigger.payload_json['after']['id'] }}"
          snapshot_url: "http://frigate:5000/api/events/{{ event_id }}/snapshot.jpg"
      - service: rest_command.describe_image
        data:
          url: "{{ snapshot_url }}"
        response_variable: ai_response
      - service: notify.mobile_app_phone
        data:
          title: "Front Door"
          message: "{{ ai_response['content'] }}"
          data:
            image: "{{ snapshot_url }}"
rest_command.describe_image is a custom command that POSTs the snapshot URL to a small Python service that calls Ollama and returns the description. Configure it in configuration.yaml:
rest_command:
  describe_image:
    url: "http://localhost:8765/describe"
    method: post
    payload: '{"url": "{{ url }}"}'
    content_type: "application/json"
    timeout: 8
The 8-second timeout is the latency budget. If Ollama takes longer, the notification fires with the default "person detected" message instead. You never want a hung AI to block a security alert.
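The "small Python service" can be a single stdlib file. The sketch below is a minimal outline under stated assumptions: the port (8765), route (/describe), and `{"content": ...}` response shape match the rest_command config above, the model name is the one used elsewhere in this guide, and error handling is omitted. It fetches the snapshot from Frigate, forwards it to Ollama's /api/generate endpoint, and returns the one-sentence description.

```python
import base64
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OLLAMA_URL = "http://localhost:11434/api/generate"   # assumes a local Ollama
PROMPT = ("You are a security camera analyst. "
          "Describe this image in ONE sentence.")

def build_ollama_request(snapshot: bytes) -> dict:
    """Package a JPEG snapshot as an Ollama /api/generate payload."""
    return {"model": "qwen2.5-vl:7b", "prompt": PROMPT,
            "images": [base64.b64encode(snapshot).decode()], "stream": False}

def http_json(url, payload=None, timeout=5.0):
    """POST JSON (or plain GET when payload is None); return the raw body."""
    data = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()

class DescribeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/describe":
            self.send_error(404)
            return
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        snapshot = http_json(body["url"], timeout=2.0)       # snapshot from Frigate
        answer = json.loads(http_json(OLLAMA_URL, build_ollama_request(snapshot)))
        reply = json.dumps({"content": answer["response"].strip()}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

def main():
    # Blocks forever; run under systemd alongside Ollama.
    HTTPServer(("0.0.0.0", 8765), DescribeHandler).serve_forever()
```

The timeouts are illustrative; keep their sum below the rest_command's 8-second budget so a slow model trips the fallback rather than a hung request.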
For a working Home Assistant install with privacy controls baked in, our local AI Home Assistant integration post is the companion guide.
Privacy and Compliance {#privacy}
What stays local
- Every camera frame
- Every detection event
- Every AI-generated description
- Every video file
What leaves the network
- Push notifications (Apple/Google's notification servers see the title and message text — not the image)
- Optional: a Tailscale or WireGuard tunnel for remote viewing
If you want zero data on third-party servers, self-host ntfy for push notifications instead of the mobile_app service, and use Tailscale (which keeps traffic end-to-end encrypted between your devices) for remote access.
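For the self-hosted ntfy route, the API is plain HTTP: POST the message body to `<server>/<topic>`, with metadata such as the title carried in headers. A stdlib-only sketch, where the server hostname and topic are placeholders for your own:

```python
import urllib.request

NTFY_BASE = "http://ntfy.lan"      # placeholder: your self-hosted ntfy server
TOPIC = "security-alerts"          # placeholder topic name

def build_ntfy_request(message: str, title: str = "Front Door"):
    """ntfy's HTTP API: the message is the POST body; title, priority,
    and tags travel as request headers."""
    return urllib.request.Request(
        f"{NTFY_BASE}/{TOPIC}",
        data=message.encode(),
        headers={"Title": title, "Priority": "high", "Tags": "camera"})

def push(message: str) -> None:
    urllib.request.urlopen(build_ntfy_request(message), timeout=5)

# push("A USPS driver delivered a package to the front door.")
```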
Legal considerations
Cameras pointed at your own property are legal in nearly every jurisdiction. Cameras that capture audio fall under wiretapping laws — disable audio or check your state's two-party consent rules first. Cameras that capture neighbors' property may violate local privacy ordinances; mask out neighboring areas using Frigate's motion.mask configuration.
The Electronic Frontier Foundation's surveillance self-defense is the clearest plain-English reference for camera privacy law.
Storage and Retention {#storage}
Storage math
| Camera Resolution | Bitrate | 1 day continuous | 14 days continuous |
|---|---|---|---|
| 1080p H.264 | 4 Mbps | 43 GB | 600 GB |
| 4MP H.264 | 6 Mbps | 65 GB | 910 GB |
| 4K H.265 | 10 Mbps | 108 GB | 1.5 TB |
For 4 cameras at 4MP recording motion-only (typical 30% of the day), expect 80-120 GB per day. A 6 TB drive holds 50-70 days. A 12 TB drive holds 100+ days.
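The table above follows from simple arithmetic: bitrate in Mbps divided by 8 gives MB/s, multiplied by recorded seconds and an optional duty-cycle factor for motion-only recording. A small helper for checking your own camera mix (decimal GB, matching the table):

```python
def storage_gb(bitrate_mbps: float, days: float, duty_cycle: float = 1.0) -> float:
    """Recording size in GB: Mbps -> MB/s is /8, times seconds recorded."""
    seconds = days * 86_400 * duty_cycle
    return bitrate_mbps / 8 * seconds / 1000   # MB -> GB (decimal units)

print(round(storage_gb(4, 1)))           # ~43 GB: one 1080p camera, one day
print(round(storage_gb(6, 1, 0.3) * 4))  # ~78 GB: four 4MP cameras, motion-only
```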
Retention strategy
Frigate's retention config:
record:
  enabled: true
  retain:
    days: 7
    mode: motion  # keep only segments that contain motion
  events:
    retain:
      default: 30
      mode: motion
This keeps 7 days of motion-triggered footage and 30 days of the object-detection event clips. Past that, footage is automatically pruned.
Pitfalls and Fixes {#pitfalls}
Pitfall 1: "Person" alerts on swaying trees
Cause: small motion in zones the detector flags as person.
Fix: raise min_area and threshold in the detect config. min_area: 5000 filters out anything smaller than ~70x70 pixels at the 1280×720 detect resolution.
Pitfall 2: Ollama times out at 2 AM
Cause: model unloaded after OLLAMA_KEEP_ALIVE (default 5 min).
Fix: set OLLAMA_KEEP_ALIVE=-1 in the Ollama service environment to keep the model resident, or set it to 24h and accept the rare cold start.
Pitfall 3: 4K cameras saturate the network
Cause: 4 simultaneous 10 Mbps streams = 40 Mbps sustained.
Fix: put cameras on a separate VLAN with a managed switch, or run Frigate on a machine with a 2.5 GbE NIC. Wireless cameras at 4K are unreliable; PoE cameras solve this.
Pitfall 4: AI hallucinates intruders
Cause: prompts that demand drama get drama.
Fix: the "Routine activity" instruction in the prompt above. Also lower temperature to 0.1-0.2.
Pitfall 5: Storage fills up faster than expected
Cause: continuous record + motion record both running.
Fix: set mode: motion for the retain block. Most homes do not need 24/7 footage past the first week.
Cloud vs Local Cost Comparison {#cost-comparison}
For a typical 4-camera setup with package and person AI alerts:
Cloud (Ring or Nest equivalent)
| Item | Year 1 | Year 5 |
|---|---|---|
| 4 cameras at $200 each | $800 | $800 |
| Ring Protect Plus ($10/mo) | $120 | $600 |
| Cloud storage upgrade | $60 | $300 |
| Total | $980 | $1,700 |
Local stack
| Item | Year 1 | Year 5 |
|---|---|---|
| 4 cameras at $90 (RTSP) | $360 | $360 |
| Server + GPU + Coral | $510 | $510 |
| 6 TB HDD | $130 | $130 |
| Electricity (90W avg, 24/7) | $95 | $475 |
| Total | $1,095 | $1,475 |
The local build crosses cost parity at month 14 and saves $225 by year 5 — and that ignores the value of feature differences. Ring still cannot tell you "the kid next door is climbing the fence." Local AI can.
What Cloud Security AI Cannot Do (And Local Can)
- Custom prompts. Ask the AI specifically for what you care about ("Is the gate open?", "Is anyone wearing the uniform of a delivery driver?").
- Multi-camera reasoning. Send two snapshots to the model and ask "Is the same person who just walked past the driveway now at the front door?"
- License plate recall. With minicpm-v:8b you can extract plates and check them against a homeowner-defined allowlist.
- Audio integration. Whisper transcribes a doorbell intercom and the LLM decides whether to wake you.
- Long-term pattern recognition. Embed every event description and search them: "show me every time the mail truck stopped between 1 and 3 PM last month."
None of this is on the cloud roadmap because it would explode their inference costs. On a $400 home server it is just another automation.
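To illustrate that last item: store an embedding alongside each event description (Ollama's /api/embeddings endpoint with an embedding model such as nomic-embed-text is one option; treat that as an assumption), then rank past events by cosine similarity against an embedded query. The ranking half is self-contained below; the toy 3-d vectors stand in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, events, top_k=3):
    """events: list of (description, embedding) pairs, e.g. built by
    embedding each stored event description once at write time."""
    ranked = sorted(events, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    return [desc for desc, _ in ranked[:top_k]]

# Toy 3-d vectors standing in for real embeddings:
events = [("mail truck stopped at 2 PM", [0.9, 0.1, 0.0]),
          ("dog walked across the lawn", [0.0, 0.2, 0.9]),
          ("mail truck stopped at 1:30 PM", [0.8, 0.2, 0.1])]
print(search([1.0, 0.0, 0.0], events, top_k=2))
# ['mail truck stopped at 2 PM', 'mail truck stopped at 1:30 PM']
```

At a few thousand events, brute-force cosine search like this is instant; a vector database only becomes worth it at a much larger scale.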
Conclusion
The combination of Frigate, a Coral USB Accelerator, and a vision model in Ollama produces a security setup that is materially smarter than any cloud product on the market — and it is yours. The footage never leaves your network, the AI runs whether the ISP is up or not, and the only ongoing cost is electricity.
Start small. Two cameras, a Coral, and an existing PC running Frigate without the AI layer. That alone replaces Ring's basic functionality. Add Ollama with qwen2.5-vl:7b once the NVR is stable. The whole transition can be done over two weekends, after which the only remaining question is what to do with the $200 a year you used to spend on a Ring subscription.
Want the full home-automation picture? Our local AI + Home Assistant guide and private AI knowledge base extend this stack into voice control and document Q&A.