Local AI Home Security: Analyze Cameras Privately Without the Cloud
Published on April 23, 2026 • 19 min read
I disconnected my Ring doorbell on a Tuesday evening in 2024 after the third "subscription required to view event" pop-up that month. By Saturday I had built a replacement that does more than Ring ever did, runs entirely on my own hardware, and costs about $4 per month in electricity. It tells me, in plain English, "the Amazon driver dropped a package and left at 3:42 PM" instead of just "motion detected." That difference between cloud security cameras and a properly built local stack is the whole point of this guide.
What follows is the exact build I run at home and have deployed at three friends' houses. Two RTSP cameras minimum, one Coral USB accelerator, an old desktop, and a vision LLM doing the actual reasoning. Total parts cost under $400 if you buy used.
Quick Start: Detect a Person in 10 Minutes {#quick-start}
# 1. Spin up Frigate (the NVR that handles motion + object detection)
docker run -d \
--name frigate \
--gpus all \
-v /opt/frigate/config:/config \
-v /opt/frigate/storage:/media/frigate \
-p 5000:5000 -p 8554:8554 \
--device /dev/bus/usb \
ghcr.io/blakeblackshear/frigate:stable
# 2. Pull a vision model into Ollama for descriptive alerts
ollama pull llava:7b
# 3. Test it on a snapshot
curl http://localhost:11434/api/generate -d '{
"model": "llava:7b",
"prompt": "Describe what is happening in this image in one sentence.",
"images": ["'$(base64 -w0 ./driveway.jpg)'"],
"stream": false
}'
That returns something like "A delivery driver in a brown uniform is placing a package on the front steps." Wire it into Home Assistant and you have alerts your aunt can read.
Table of Contents
- Why Cloud Security Cameras Are Broken
- The Local Stack Architecture
- Hardware: What to Buy
- Frigate: Real-Time Object Detection
- Ollama Vision: Descriptive AI Alerts
- Home Assistant Integration
- Privacy and Compliance
- Storage and Retention
- Pitfalls and Fixes
- Cloud vs Local Cost Comparison
- FAQs
Why Cloud Security Cameras Are Broken {#why-broken}
The pitch for Ring, Nest, and Arlo is "smart cameras." The reality is:
- Subscription paywalls for basic features. Ring's package detection requires Ring Protect ($4-$20/month). Nest's intelligent alerts require Nest Aware ($8-$15/month).
- Your video lives on someone else's hard drive. Amazon owns Ring. Google owns Nest. Both have shared footage with law enforcement without warrants in documented cases.
- AI happens in the cloud. Every snapshot is uploaded for analysis. The "AI" is a remote API the camera depends on staying online.
- Vendor death. Wyze moved previously free features behind an $8/month subscription in 2025. Insteon went bankrupt and bricked thousands of devices.
A local AI security stack flips every one of these. Hardware you own. Footage that never leaves your network. AI that runs whether your ISP is up or not. Zero recurring cost.
For the broader case for self-hosting, see our GDPR-compliant local AI post — the same arguments apply to camera footage.
The Local Stack Architecture {#architecture}
+------------+      RTSP      +------------+
|  Cameras   | -------------> |  Frigate   |
+------------+                | (NVR + CV) |
                              +------------+
                                 |      ^
                            MQTT |      | Coral / GPU
                                 v      |
+------------+                +------------+
|     HA     | <------------> |   Ollama   |
| (alerts &  |      HTTP      |  (vision   |
|  control)  |                |    LLM)    |
+------------+                +------------+
      |
      v
 push, SMS, email
Five components:
- Cameras speaking RTSP (any modern PoE camera).
- Frigate for fast object detection (1-50 ms per frame).
- Coral USB Accelerator to run Frigate's detector at full FPS on a low-power CPU.
- Ollama running a vision model (LLaVA, Qwen2.5-VL, MiniCPM-V) for natural-language scene descriptions.
- Home Assistant as the brain wiring it all together and routing notifications.
Frigate is the realtime layer. It alerts in milliseconds when a "person" or "car" enters a zone. Ollama is the slower, smarter layer. When Frigate fires an event, Home Assistant grabs the snapshot, sends it to Ollama with a prompt, and the AI returns a description that goes in the push notification.
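The handoff between the two layers is easy to prototype outside Home Assistant. The sketch below is a plain-Python illustration (no MQTT library) of the filtering decision: Frigate publishes JSON events with a `type` field (`new`, `update`, `end`), and only new events on watched camera/label pairs should trigger the slower vision pass. The camera and label names here are examples, not a required configuration.

```python
import json

# Frigate publishes JSON events to the `frigate/events` MQTT topic.
# This helper decides whether an event should trigger the slower
# Ollama description pass. Camera/label pairs are illustrative.

WATCHED = {("front_door", "person"), ("front_door", "package"),
           ("driveway", "person"), ("driveway", "car")}

def should_describe(payload: str) -> bool:
    """Return True only for new events on camera/label pairs we care about."""
    event = json.loads(payload)
    if event.get("type") != "new":   # ignore 'update' and 'end' messages
        return False
    after = event.get("after", {})
    return (after.get("camera"), after.get("label")) in WATCHED

sample = json.dumps({"type": "new",
                     "after": {"camera": "front_door", "label": "person"}})
print(should_describe(sample))  # True
```

Wiring this into a real MQTT subscriber is a few more lines with any client library; the point is that the cheap filter runs on every event and the expensive vision call runs on almost none.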
Hardware: What to Buy {#hardware}
Core compute
| Component | Spec | Cost |
|---|---|---|
| Server | Used Dell OptiPlex 7060 SFF (i5-8500, 16GB RAM) | $120 |
| GPU | NVIDIA RTX 3060 12GB (used) | $200 |
| Coral USB Accelerator | Google's Edge TPU | $60 |
| 6 TB Surveillance HDD | WD Purple or Seagate Skyhawk | $130 |
| Subtotal | | $510 |
The GPU is for Ollama vision inference; the Coral USB handles Frigate's object detection. Together they free your CPU to handle the rest of the home automation load. If you do not need descriptive AI alerts, skip the GPU and just run Frigate + Coral for $310 total.
Cameras
| Model | Spec | Cost |
|---|---|---|
| Reolink RLC-810A | 4K, PoE, RTSP | $90 each |
| Amcrest IP4M-1051EW | 4MP, PoE, RTSP | $80 each |
| Hikvision DS-2CD2143G2-I | 4MP, PoE, RTSP, low-light king | $130 each |
Two principles when buying cameras: must speak RTSP (no cloud-only cameras) and PoE preferred over WiFi (more reliable, single cable for power and data).
For the broader hardware case, our budget local AI machine post covers the same OptiPlex/refurbished-business-PC pattern.
Frigate: Real-Time Object Detection {#frigate}
Frigate is the open-source NVR with built-in object detection. It is the fastest local CV pipeline I have used for security cameras, and it runs about 8-15 ms per frame with a Coral USB Accelerator.
Minimal config
Save as /opt/frigate/config/config.yml:
mqtt:
  host: 192.168.1.10
  user: frigate
  password: !secret mqtt_password

detectors:
  coral:
    type: edgetpu
    device: usb

cameras:
  driveway:
    ffmpeg:
      inputs:
        - path: rtsp://admin:pass@192.168.1.50:554/h264Preview_01_main
          roles:
            - record
            - detect
    detect:
      width: 1280
      height: 720
      fps: 5
    objects:
      track:
        - person
        - car
        - dog
    record:
      enabled: true
      retain:
        days: 14
        mode: motion
  front_door:
    ffmpeg:
      inputs:
        - path: rtsp://admin:pass@192.168.1.51:554/h264Preview_01_main
          roles:
            - record
            - detect
    detect:
      width: 1280
      height: 720
      fps: 5
    objects:
      track:
        - person
        - package
Custom zones
Drawing zones on the camera image is the difference between getting alerts every time a leaf blows by and getting alerts only when someone actually steps onto the porch.
cameras:
  driveway:
    zones:
      driveway_main:
        coordinates: 0,720,1280,720,1280,400,0,400
        objects:
          - person
          - car
The numbers are pixel coordinates. Use Frigate's built-in zone editor to draw them visually.
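If you want to sanity-check coordinates outside Frigate, say while tuning a zone in a notebook, a standard ray-casting point-in-polygon test reproduces the membership check (Frigate itself evaluates the bottom center of the object's bounding box against the zone polygon). This helper is illustrative, not Frigate's code:

```python
def point_in_zone(x: float, y: float, coords: str) -> bool:
    """Ray-casting test: is (x, y) inside the polygon given as a
    flat Frigate-style coordinate string 'x1,y1,x2,y2,...'?"""
    vals = [float(v) for v in coords.split(",")]
    poly = list(zip(vals[::2], vals[1::2]))
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        # Count crossings of a horizontal ray from (x, y) with edge (j, i)
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

# The driveway_main zone from the config above
zone = "0,720,1280,720,1280,400,0,400"
print(point_in_zone(640, 600, zone))  # True: inside the lower band
print(point_in_zone(640, 100, zone))  # False: above the zone
```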
Coral vs GPU
A Coral USB Accelerator at $60 outperforms a $400 GPU for Frigate's detector workload. The Coral is purpose-built for INT8 inference on small models like SSD-MobileNet. Use the GPU only for Ollama; let Coral handle Frigate.
Ollama Vision: Descriptive AI Alerts {#ollama-vision}
This is the layer that turns "person detected" into "an Amazon driver placed a package on the porch and walked back to the truck."
Models worth running
| Model | Params | VRAM (Q4) | Latency on RTX 3060 | Best For |
|---|---|---|---|---|
| llava:7b | 7B | 5 GB | 1.2 s | General descriptions |
| llava:13b | 13B | 9 GB | 2.4 s | Sharper detail extraction |
| qwen2.5-vl:7b | 7B | 5.5 GB | 1.3 s | Best instruction following |
| minicpm-v:8b | 8B | 6 GB | 1.5 s | Strong on small objects (license plates) |
| moondream2 | 1.8B | 2 GB | 0.3 s | Sub-second alerts on cheap hardware |
For most homes, qwen2.5-vl:7b is the right balance. moondream2 is the speed king if you want sub-300 ms alerts and can accept slightly less detailed descriptions.
Prompt that works
The naive prompt ("describe this image") gives florid, useless paragraphs. The prompt I run in production:
You are a security camera analyst. Describe this image in ONE sentence.
Focus on: who is present, what they are doing, and any unusual behavior.
Do not describe the camera angle, time of day, or scenery.
If nothing notable is happening, reply only "Routine activity."
That gives outputs like:
- "A man in a dark hoodie is walking up the driveway carrying a backpack."
- "A USPS driver delivered a small package to the front door and returned to the truck."
- "Routine activity."
The "Routine activity" output is critical. Without it, the LLM hallucinates importance into every harmless event.
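Downstream automation should treat that sentinel as a drop signal before a notification ever fires. A tiny, self-contained filter does it; the normalization guards against casing and trailing-punctuation drift in model output:

```python
def alert_worthy(description: str) -> bool:
    """Drop the model's 'Routine activity.' sentinel (any casing,
    with or without trailing punctuation) before notifying."""
    normalized = description.strip().strip(".").lower()
    return normalized != "routine activity"

print(alert_worthy("Routine activity."))                       # False
print(alert_worthy("A man in a dark hoodie is at the door."))  # True
```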
Running the inference
import base64
import requests

SECURITY_PROMPT = (
    "You are a security camera analyst. Describe this image in ONE sentence. "
    "Focus on: who is present, what they are doing, and any unusual behavior. "
    "Do not describe the camera angle, time of day, or scenery. "
    'If nothing notable is happening, reply only "Routine activity."'
)

def describe(image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    r = requests.post("http://localhost:11434/api/generate", json={
        "model": "qwen2.5-vl:7b",
        "prompt": SECURITY_PROMPT,
        "images": [b64],
        "stream": False,
        "options": {"num_predict": 60, "temperature": 0.2}
    }, timeout=10)
    return r.json()["response"].strip()

print(describe("./snapshot.jpg"))
The 0.2 temperature keeps descriptions deterministic. num_predict: 60 caps the response at one sentence and saves another 200-400 ms.
For a deeper integration pattern, our Ollama Python API guide covers production usage including streaming and concurrent calls.
Home Assistant Integration {#home-assistant}
Home Assistant is the glue. Frigate's MQTT events trigger automations that call Ollama and route notifications.
Automation: AI-described front door alert
automation:
  - alias: "AI front door alert"
    trigger:
      - platform: mqtt
        topic: frigate/events
        value_template: "{{ value_json.type }}"
        payload: "new"
    condition:
      - "{{ trigger.payload_json['after']['camera'] == 'front_door' }}"
      - "{{ trigger.payload_json['after']['label'] in ['person', 'package'] }}"
    action:
      - variables:
          event_id: "{{ trigger.payload_json['after']['id'] }}"
          snapshot_url: "http://frigate:5000/api/events/{{ event_id }}/snapshot.jpg"
      - service: rest_command.describe_image
        data:
          url: "{{ snapshot_url }}"
        response_variable: ai_response
      - service: notify.mobile_app_phone
        data:
          title: "Front Door"
          message: "{{ ai_response['content'] }}"
          data:
            image: "{{ snapshot_url }}"
rest_command.describe_image is a custom command that POSTs the snapshot URL to a small Python service that calls Ollama and returns the description. Configure it in configuration.yaml:
rest_command:
  describe_image:
    url: "http://localhost:8765/describe"
    method: post
    payload: '{"url": "{{ url }}"}'
    content_type: "application/json"
    timeout: 8
The 8-second timeout is the latency budget. If Ollama takes longer, the notification fires with the default "person detected" message instead. You never want a hung AI to block a security alert.
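The "small Python service" can be a single stdlib file. The sketch below is a minimal outline under stated assumptions: the port (8765), route (/describe), and `{"content": ...}` response shape match the rest_command config above, the model name is the one used elsewhere in this guide, and error handling is omitted. It fetches the snapshot from Frigate, forwards it to Ollama's /api/generate endpoint, and returns the one-sentence description.

```python
import base64
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OLLAMA_URL = "http://localhost:11434/api/generate"   # assumes a local Ollama
PROMPT = ("You are a security camera analyst. "
          "Describe this image in ONE sentence.")

def build_ollama_request(snapshot: bytes) -> dict:
    """Package a JPEG snapshot as an Ollama /api/generate payload."""
    return {"model": "qwen2.5-vl:7b", "prompt": PROMPT,
            "images": [base64.b64encode(snapshot).decode()], "stream": False}

def http_json(url, payload=None, timeout=5.0):
    """POST JSON (or plain GET when payload is None); return the raw body."""
    data = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    req = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()

class DescribeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/describe":
            self.send_error(404)
            return
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        snapshot = http_json(body["url"], timeout=2.0)       # snapshot from Frigate
        answer = json.loads(http_json(OLLAMA_URL, build_ollama_request(snapshot)))
        reply = json.dumps({"content": answer["response"].strip()}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

def main():
    # Blocks forever; run under systemd alongside Ollama.
    HTTPServer(("0.0.0.0", 8765), DescribeHandler).serve_forever()
```

The timeouts are illustrative; keep their sum below the rest_command's 8-second budget so a slow model trips the fallback rather than a hung request.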
For a working Home Assistant install with privacy controls baked in, our local AI Home Assistant integration post is the companion guide.
Privacy and Compliance {#privacy}
What stays local
- Every camera frame
- Every detection event
- Every AI-generated description
- Every video file
What leaves the network
- Push notifications (Apple/Google's notification servers see the title and message text — not the image)
- Optional: a Tailscale or WireGuard tunnel for remote viewing
If you want zero data on third-party servers, self-host ntfy for push notifications instead of the mobile_app service, and use Tailscale (which keeps traffic end-to-end encrypted between your devices) for remote access.
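For the self-hosted ntfy route, the API is plain HTTP: POST the message body to `<server>/<topic>`, with metadata such as the title carried in headers. A stdlib-only sketch, where the server hostname and topic are placeholders for your own:

```python
import urllib.request

NTFY_BASE = "http://ntfy.lan"      # placeholder: your self-hosted ntfy server
TOPIC = "security-alerts"          # placeholder topic name

def build_ntfy_request(message: str, title: str = "Front Door"):
    """ntfy's HTTP API: the message is the POST body; title, priority,
    and tags travel as request headers."""
    return urllib.request.Request(
        f"{NTFY_BASE}/{TOPIC}",
        data=message.encode(),
        headers={"Title": title, "Priority": "high", "Tags": "camera"})

def push(message: str) -> None:
    urllib.request.urlopen(build_ntfy_request(message), timeout=5)

# push("A USPS driver delivered a package to the front door.")
```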
Legal considerations
Cameras pointed at your own property are legal in nearly every jurisdiction. Cameras that capture audio fall under wiretapping laws — disable audio or check your state's two-party consent rules first. Cameras that capture neighbors' property may violate local privacy ordinances; mask out neighboring areas using Frigate's motion.mask configuration.
The Electronic Frontier Foundation's surveillance self-defense is the clearest plain-English reference for camera privacy law.
Storage and Retention {#storage}
Storage math
| Camera Resolution | Bitrate | 1 day continuous | 14 days continuous |
|---|---|---|---|
| 1080p H.264 | 4 Mbps | 43 GB | 600 GB |
| 4MP H.264 | 6 Mbps | 65 GB | 910 GB |
| 4K H.265 | 10 Mbps | 108 GB | 1.5 TB |
For 4 cameras at 4MP recording motion-only (typical 30% of the day), expect 80-120 GB per day. A 6 TB drive holds 50-70 days. A 12 TB drive holds 100+ days.
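The table above follows from simple arithmetic: bitrate in Mbps divided by 8 gives MB/s, multiplied by recorded seconds and an optional duty-cycle factor for motion-only recording. A small helper for checking your own camera mix (decimal GB, matching the table):

```python
def storage_gb(bitrate_mbps: float, days: float, duty_cycle: float = 1.0) -> float:
    """Recording size in GB: Mbps -> MB/s is /8, times seconds recorded."""
    seconds = days * 86_400 * duty_cycle
    return bitrate_mbps / 8 * seconds / 1000   # MB -> GB (decimal units)

print(round(storage_gb(4, 1)))           # ~43 GB: one 1080p camera, one day
print(round(storage_gb(6, 1, 0.3) * 4))  # ~78 GB: four 4MP cameras, motion-only
```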
Retention strategy
Frigate's retention config:
record:
  enabled: true
  retain:
    days: 7
    mode: motion  # keep only segments that contain motion
  events:
    retain:
      default: 30
      mode: motion
This keeps 7 days of motion-triggered footage and 30 days of the object-detection event clips. Past that, footage is automatically pruned.
Pitfalls and Fixes {#pitfalls}
Pitfall 1: "Person" alerts on swaying trees
Cause: small motion in zones the detector flags as person.
Fix: raise min_area and threshold in the detect config. min_area: 5000 filters out anything smaller than ~70x70 pixels at the 1280×720 detect resolution.
Pitfall 2: Ollama times out at 2 AM
Cause: model unloaded after OLLAMA_KEEP_ALIVE (default 5 min).
Fix: set OLLAMA_KEEP_ALIVE=-1 in the Ollama service environment to keep the model resident, or set it to 24h and accept the rare cold start.
Pitfall 3: 4K cameras saturate the network
Cause: 4 simultaneous 10 Mbps streams = 40 Mbps sustained.
Fix: put cameras on a separate VLAN with a managed switch, or run Frigate on a machine with a 2.5 GbE NIC. Wireless cameras at 4K are unreliable; PoE cameras solve this.
Pitfall 4: AI hallucinates intruders
Cause: prompts that demand drama get drama.
Fix: the "Routine activity" instruction in the prompt above. Also lower temperature to 0.1-0.2.
Pitfall 5: Storage fills up faster than expected
Cause: continuous record + motion record both running.
Fix: set mode: motion for the retain block. Most homes do not need 24/7 footage past the first week.
Cloud vs Local Cost Comparison {#cost-comparison}
For a typical 4-camera setup with package and person AI alerts:
Cloud (Ring or Nest equivalent)
| Item | Year 1 | Year 5 |
|---|---|---|
| 4 cameras at $200 each | $800 | $800 |
| Ring Protect Plus ($10/mo) | $120 | $600 |
| Cloud storage upgrade | $60 | $300 |
| Total | $980 | $1,700 |
Local stack
| Item | Year 1 | Year 5 |
|---|---|---|
| 4 cameras at $90 (RTSP) | $360 | $360 |
| Server + GPU + Coral | $510 | $510 |
| 6 TB HDD | $130 | $130 |
| Electricity (90W avg, 24/7) | $95 | $475 |
| Total | $1,095 | $1,475 |
The local build crosses cost parity at month 14 and saves $225 by year 5 — and that ignores the value of feature differences. Ring still cannot tell you "the kid next door is climbing the fence." Local AI can.
What Cloud Security AI Cannot Do (And Local Can)
- Custom prompts. Ask the AI specifically for what you care about ("Is the gate open?", "Is anyone wearing the uniform of a delivery driver?").
- Multi-camera reasoning. Send two snapshots to the model and ask "Is the same person who just walked past the driveway now at the front door?"
- License plate recall. With minicpm-v:8b you can extract plates and check them against a homeowner-defined allowlist.
- Audio integration. Whisper transcribes a doorbell intercom and the LLM decides whether to wake you.
- Long-term pattern recognition. Embed every event description and search them: "show me every time the mail truck stopped between 1 and 3 PM last month."
None of this is on the cloud roadmap because it would explode their inference costs. On a $400 home server it is just another automation.
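To illustrate that last item: store an embedding alongside each event description (Ollama's /api/embeddings endpoint with an embedding model such as nomic-embed-text is one option; treat that as an assumption), then rank past events by cosine similarity against an embedded query. The ranking half is self-contained below; the toy 3-d vectors stand in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, events, top_k=3):
    """events: list of (description, embedding) pairs, e.g. built by
    embedding each stored event description once at write time."""
    ranked = sorted(events, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    return [desc for desc, _ in ranked[:top_k]]

# Toy 3-d vectors standing in for real embeddings:
events = [("mail truck stopped at 2 PM", [0.9, 0.1, 0.0]),
          ("dog walked across the lawn", [0.0, 0.2, 0.9]),
          ("mail truck stopped at 1:30 PM", [0.8, 0.2, 0.1])]
print(search([1.0, 0.0, 0.0], events, top_k=2))
# ['mail truck stopped at 2 PM', 'mail truck stopped at 1:30 PM']
```

At a few thousand events, brute-force cosine search like this is instant; a vector database only becomes worth it at a much larger scale.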
Conclusion
The combination of Frigate, a Coral USB Accelerator, and a vision model in Ollama produces a security setup that is materially smarter than any cloud product on the market — and it is yours. The footage never leaves your network, the AI runs whether the ISP is up or not, and the only ongoing cost is electricity.
Start small. Two cameras, a Coral, and an existing PC running Frigate without the AI layer. That alone replaces Ring's basic functionality. Add Ollama with qwen2.5-vl:7b once the NVR is stable. The whole transition can be done over two weekends, after which the only remaining question is what to do with the $200 a year you used to spend on a Ring subscription.
Want the full home-automation picture? Our local AI + Home Assistant guide and private AI knowledge base extend this stack into voice control and document Q&A.