Piper TTS Setup 2026: Fast Offline Voices on Any Hardware

June 20, 2026
10 min read
Local AI Master Research Team

Piper is one of the fastest fully offline neural text-to-speech engines you can run in 2026. It installs with one command (pip install piper-tts), speaks in real time on a Raspberry Pi 5 with no GPU, ships 100+ downloadable voices across 30+ languages, and is the default TTS in Home Assistant. Active development now lives in the Open Home Foundation's OHF-Voice/piper1-gpl repo (latest release v1.4.2, April 2026). Voices are built on the VITS architecture and exported to ONNX, with embedded espeak-ng for phonemization. Piper runs from a CLI, a Python API, a tiny web server, or C/C++, all locally, with nothing leaving your machine.

If you want a private voice for a smart-home assistant, a screen reader, a robotics project, or just text-to-speech that doesn't phone home, Piper is the default answer. It is small (voices are tens of megabytes), CPU-only, and quick enough to feel instant even on a single-board computer. Below is how to install it on every OS, pick a voice, drive it from Python, and wire it into Home Assistant.

What is Piper TTS, and why is it the go-to offline voice?

Piper is a neural TTS system originally built for the Rhasspy voice-assistant project and now maintained by the Open Home Foundation. Each voice is a VITS model exported to the ONNX format and run with ONNX Runtime, paired with a small JSON config; espeak-ng turns text into phonemes before the neural network synthesizes audio. That combination is what makes it both natural-sounding and fast on modest hardware.

Three things make Piper the default choice for local voice:

  1. It is genuinely fast on CPU. Piper reaches real-time synthesis on a Raspberry Pi 5 with no GPU and runs roughly an order of magnitude faster than real time on a modern desktop CPU. There is no CUDA dependency to fight.
  2. It is tiny. A medium voice is a single ONNX file in the tens of megabytes, and inference uses a small RAM footprint — small enough for embedded boards and containers.
  3. It is everywhere downstream. Piper is the built-in TTS in Home Assistant (via the Wyoming protocol), and it is also used by the NVDA screen reader and LocalAI, so the voices you learn here transfer directly.

A quick licensing note, because it changed and matters for projects: the original rhasspy/piper repository (MIT-licensed) was archived read-only in October 2025, and active development moved to OHF-Voice/piper1-gpl, which is GPL-3.0. If you need a permissive MIT license specifically, you are pinning the old archived code; the current, maintained Piper is GPL-3.0. Check the license terms for your use case before shipping.

How do I install Piper TTS? (Linux, macOS, Windows)

The simplest route on any desktop OS is the Python package, which installs the engine and the CLI in one step:

pip install piper-tts

That gives you the piper command plus the Python module. The package bundles the ONNX runtime and espeak-ng phonemization, so on most systems there are no extra system dependencies to chase. (On some minimal Linux images you may still need espeak-ng data files from your package manager.)
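As a quick sanity check after installing, a short script can confirm that both pieces are visible. This is a minimal sketch using only the standard library; the helper names are illustrative, and it assumes the package exposes an executable named piper and a Python module named piper, as described above.

```python
import importlib.util
import shutil


def has_module(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def has_command(name: str) -> bool:
    """Return True if an executable with this name is on PATH."""
    return shutil.which(name) is not None


def check_piper_install() -> dict:
    """Report whether the Piper CLI and Python module are visible."""
    return {"cli": has_command("piper"), "module": has_module("piper")}


if __name__ == "__main__":
    print(check_piper_install())
```

If either value comes back False inside a virtual environment, the usual cause is that the environment is not activated in the current shell.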

Once installed, download a voice and synthesize a WAV. Voices are model files; here we grab the default English voice and speak a line:

# Download a voice (model + config) into the current folder
python3 -m piper.download_voices en_US-lessac-medium

# Synthesize speech to a WAV file
echo 'Hello from Piper, running entirely on my own machine.' \
  | piper -m en_US-lessac-medium -f hello.wav

On macOS and Windows the exact same pip install piper-tts and CLI calls work; the only difference is how you play the resulting WAV (any media player will do). If you prefer not to install Python at all, the project also offers prebuilt binaries and language bindings — but for almost everyone, the pip package is the shortest path.
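To confirm the WAV was written correctly without relying on a particular media player, you can read its header from Python. The sketch below uses only the standard-library wave module; the function name is illustrative.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic header fields from a WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": frames / rate,
        }
```

Calling describe_wav("hello.wav") on the file produced above should report a sample rate of 22050 for a medium voice, since medium voices are 22.05 kHz.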

How do I run Piper on a Raspberry Pi (real-time, no GPU)?

This is Piper's home turf. The install is identical to desktop — pip install piper-tts — and a Raspberry Pi 5 synthesizes a medium-quality voice in real time on CPU alone, with no accelerator. A Pi 4 also works but is noticeably slower on medium voices; if you are on a Pi 4 and latency matters, drop to a low or x_low voice.

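# On Raspberry Pi OS (64-bit recommended)
# Recent releases block system-wide pip installs, so install into a virtual environment
python3 -m venv ~/piper-venv
source ~/piper-venv/bin/activate
pip install piper-tts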
python3 -m piper.download_voices en_US-lessac-low

echo 'The smart home is listening, locally.' \
  | piper -m en_US-lessac-low -f /tmp/say.wav
aplay /tmp/say.wav

In our own informal testing on Pi-class ARM CPUs, a medium voice synthesizes a short sentence in well under the time it takes to say it: comfortably real-time on a Pi 5, with the real-time factor improving further on the low and x_low tiers. Treat these as ballpark single-board figures rather than a controlled benchmark; exact numbers depend on your board, voice quality, and sentence length. The headline holds: no GPU is needed, and memory use stays small enough to run alongside other services.
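If you want to measure this on your own board, the real-time factor (RTF) is simply synthesis time divided by the duration of the audio produced; anything below 1.0 is faster than real time. The sketch below shows the arithmetic only. The function names are illustrative, and the numbers in the comments are made-up inputs, not measurements.

```python
def audio_duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono clip given its sample count and sample rate."""
    return num_samples / sample_rate


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


# Hypothetical example: 66150 samples at 22050 Hz is 3.0 s of audio.
# If synthesis took 1.5 s, the RTF is 0.5 (twice as fast as real time).
clip_seconds = audio_duration_seconds(66150, 22050)
rtf = real_time_factor(1.5, clip_seconds)
```

To get the synthesis time in practice, wrap the synthesis call with time.perf_counter() and read the sample count from the resulting WAV.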

For a fuller offline-audio stack — speech recognition feeding a local model feeding Piper — pair this with a local Whisper speech-to-text setup so the whole listen-think-speak loop stays on your hardware.

Which Piper voice and quality level should I pick?

Voices follow a consistent naming scheme: <language>_<REGION>-<name>-<quality> — for example en_US-lessac-medium. The name comes from the training dataset or speaker, and the quality suffix is the main knob you tune for the speed-versus-fidelity trade-off. Here is how the tiers compare:

Quality tier | Sample rate | Relative size & speed | Best for
x_low | 16 kHz | Smallest, fastest | Pi 4 / very constrained boards, wake-word style prompts
low | 16 kHz | Small, fast | Raspberry Pi, low-latency assistants
medium | 22.05 kHz | Balanced (the common default) | Most desktops, Pi 5, Home Assistant default
high | 22.05 kHz | Largest, best fidelity | Narration, content where quality > latency

Piper publishes 100+ voices across 30+ languages — English (multiple regional accents), Spanish, German, French, Italian, Dutch, Russian, Chinese, and many more — all downloadable as individual ONNX models from the official Hugging Face voice collection. You only download the voices you actually use, which keeps disk footprint tiny. The default English voice, and the one Home Assistant ships with, is en_US-lessac-medium.

You can browse and preview every voice on the official Piper voice samples page before downloading — listen first, then pull the exact model name.
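Because the naming scheme is regular, a script that manages several voices can split a name into its parts and look up the sample rate for its tier. The following is a minimal sketch, not part of Piper itself: the class and function names are hypothetical, and the sample rates are copied from the tier table above.

```python
from dataclasses import dataclass

# Sample rates per quality tier, taken from the tier table in this article
SAMPLE_RATE_HZ = {"x_low": 16000, "low": 16000, "medium": 22050, "high": 22050}


@dataclass(frozen=True)
class VoiceName:
    language: str
    region: str
    name: str
    quality: str

    @property
    def sample_rate_hz(self) -> int:
        return SAMPLE_RATE_HZ[self.quality]


def parse_voice_name(voice: str) -> VoiceName:
    """Split '<language>_<REGION>-<name>-<quality>' into its parts."""
    locale, _, rest = voice.partition("-")      # 'en_US', 'lessac-medium'
    name, _, quality = rest.rpartition("-")     # 'lessac', 'medium'
    language, _, region = locale.partition("_")  # 'en', 'US'
    if not (language and region and name) or quality not in SAMPLE_RATE_HZ:
        raise ValueError(f"not a recognizable voice name: {voice!r}")
    return VoiceName(language, region, name, quality)
```

Splitting the quality off the right-hand side keeps the parser working for the x_low tier, whose underscore would otherwise be easy to confuse with the language/region separator.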

How do I use Piper from Python and the CLI?

Both interfaces are first-class. The CLI is ideal for scripts and pipes; the Python API is what you embed in an app.

CLI — read text on stdin, write a WAV (or stream raw audio):

# To a file
echo 'Generating audio from the command line.' \
  | piper -m en_US-lessac-medium -f out.wav

# Stream raw 22.05kHz PCM straight to an audio player
echo 'Low-latency streaming output.' \
  | piper -m en_US-lessac-medium --output-raw \
  | aplay -r 22050 -f S16_LE -t raw -
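The aplay flags above describe the raw stream: signed 16-bit little-endian samples at 22050 Hz. If you capture that output to a file instead of playing it, a sketch like the following wraps the bytes in a WAV header using only the standard library. It assumes mono audio, which is what the aplay command above implies by not setting a channel count; the function name and defaults are illustrative.

```python
import wave


def pcm_to_wav(pcm: bytes, wav_path: str, sample_rate: int = 22050,
               channels: int = 1, sample_width: int = 2) -> float:
    """Wrap raw PCM bytes in a WAV container; return duration in seconds."""
    frame_size = channels * sample_width
    if len(pcm) % frame_size:
        raise ValueError("PCM length is not a whole number of frames")
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return len(pcm) / frame_size / sample_rate
```

For a 16 kHz voice (the x_low and low tiers), pass sample_rate=16000, otherwise the audio will play back too fast.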

Python — load a voice once and synthesize repeatedly:

import wave
from piper import PiperVoice

voice = PiperVoice.load("en_US-lessac-medium.onnx")

with wave.open("python_out.wav", "wb") as wav_file:
    voice.synthesize_wav(
        "This sentence was synthesized from the Piper Python API.",
        wav_file,
    )

Beyond these, Piper also exposes a lightweight HTTP web server and C/C++ bindings, so you can call it from a microservice or a native app without shelling out. For a desktop-app voice or a custom assistant UI, the Python API plus a loaded voice object is the pattern to reach for.
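The "load once, synthesize repeatedly" pattern can be sketched as a small cache that loads each model file at most once. This is an illustration rather than anything Piper provides: VoiceCache is a hypothetical helper, and the loader is passed in so the class itself does not depend on piper being installed.

```python
from typing import Any, Callable, Dict


class VoiceCache:
    """Load each voice model at most once and reuse it afterwards."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._voices: Dict[str, Any] = {}

    def get(self, model_path: str) -> Any:
        # Only call the (slow) loader the first time a path is requested
        if model_path not in self._voices:
            self._voices[model_path] = self._loader(model_path)
        return self._voices[model_path]


# Usage with Piper (requires piper-tts and a downloaded model):
#   from piper import PiperVoice
#   cache = VoiceCache(PiperVoice.load)
#   voice = cache.get("en_US-lessac-medium.onnx")
```

Keeping the loaded voice in memory avoids paying the model load time on every request, which matters most for always-on assistants.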

How do I add Piper to Home Assistant?

Piper is the built-in local TTS for Home Assistant, exposed through the Wyoming protocol. The clean path on Home Assistant OS or Supervised installs:

  1. Go to Settings → Add-ons → Add-on Store, search for Piper, and install the official add-on.
  2. Start it. The add-on implements Wyoming and is auto-discovered by the Wyoming integration in Home Assistant — no manual host/port wiring needed in the typical case.
  3. In the add-on configuration you can choose which voices to download (it defaults to en_US-lessac-medium), which saves disk space versus pulling everything.
  4. Use Piper as the TTS engine in your Assist pipeline, alongside a local speech-to-text and your chosen conversation agent.

That gives you a fully private voice response path: your smart home speaks without sending text to any cloud service. If you are building the rest of that private stack, our guide to a local AI + Home Assistant setup walks through the wake-word, STT, intent, and TTS pieces end to end.
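Auto-discovery covers the add-on case described above. If the Wyoming integration does not find Piper in your setup, a basic reachability check can rule out network problems first. The sketch below only tests whether a TCP port accepts connections; it does not speak the Wyoming protocol. The host name in the comment is a placeholder, and port 10200 is an assumption based on the commonly used Wyoming Piper default, so confirm it against your own configuration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (placeholder host, assumed port):
#   port_is_open("homeassistant.local", 10200)
```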

How does Piper compare to other local TTS engines?

Piper optimizes for speed, size, and offline reliability rather than studio-grade expressiveness or voice cloning. If your priority is real-time response on small hardware, Piper wins; if you need expressive prosody or want to clone a specific voice, heavier engines are a better fit.

  • Want a slightly higher-fidelity but still tiny model? Compare with our Kokoro TTS local setup — an 82M-parameter open voice model that trades a bit of speed for quality.
  • Need to clone a specific voice in many languages? That is a different job; see our XTTS v2 voice cloning guide for multilingual cloning (a much larger model, not real-time on a Pi).

The rule of thumb: Piper for fast, embedded, always-on voice; cloning-capable engines like XTTS for content where a specific identity or maximum expressiveness matters.

Key Takeaways

  1. Install in one line: pip install piper-tts on Linux, macOS, Windows, and Raspberry Pi. The current, maintained project is OHF-Voice/piper1-gpl (latest v1.4.2, April 2026).
  2. Real-time, no GPU: Piper synthesizes in real time on a Raspberry Pi 5 on CPU alone, and roughly 10× real-time on a modern desktop CPU, with a small RAM footprint.
  3. 30+ languages, 100+ voices: all downloadable ONNX models named <language>_<REGION>-<name>-<quality>; tiers are x_low/low (16 kHz) and medium/high (22.05 kHz).
  4. Native Home Assistant TTS: install the official Piper add-on; it speaks over the Wyoming protocol and is auto-discovered, with en_US-lessac-medium as the default voice.
  5. License changed: the old MIT rhasspy/piper repo is archived read-only (Oct 2025); active Piper is GPL-3.0. Verify license fit before shipping.

Local AI Master Research Team

Creator of Local AI Master. I've built datasets with over 77,000 examples and trained AI models from scratch. Now I help people achieve AI independence through local AI mastery.

Published: June 20, 2026 | Last Updated: June 20, 2026 | Manually Reviewed

Written by the Local AI Master Team

The team behind Local AI Master

We build Local AI Master around practical, testable local AI workflows: model selection, hardware planning, RAG systems, agents, and MLOps. The goal is to turn scattered tutorials into a structured learning path you can follow on your own hardware.
