
Local AI DJ: Build a Private Music Recommender & Mix Generator

April 23, 2026
17 min read
Local AI Master Research Team


I have a 14,000-track FLAC library on a NAS in my closet. About 6,000 of those tracks I have not listened to in years, because Spotify trained me to ask "what should I listen to" and not "what do I have." When my friend's wedding came up last summer and I offered to DJ the cocktail hour, I realized something embarrassing: Spotify could build me a smarter playlist from someone else's library than I could from my own.

So I built the thing that should exist. A local AI DJ that knows my music — actually knows it, BPM and key and energy and what came on after what at the last party — and recommends the next track when I tell it where the room is right now. It runs on a Mac Mini, costs nothing per month, and outperforms Spotify's recommendations for my use case for a simple reason: it has read every tag I ever wrote, every play I ever logged, and the actual harmonic structure of every song I own.

This guide is the full build. Music library analysis with Essentia, smart playlist generation with Ollama, harmonic mixing with Mixxx, and a small Flask app that ties it together so you can ask "give me 90 minutes that starts mellow and lands at 128 BPM by song eight." Real benchmarks, real configs, real edge cases. By the end you will have a private DJ assistant that beats anything streaming can give you for sets where you actually care.

Quick Start: Your First AI-Generated Playlist in 25 Minutes {#quick-start}

The shortest path to "this works":

  1. Install dependencies: brew install ollama essentia python beets ffmpeg on Mac (or apt equivalents on Linux).
  2. Pull a model: ollama pull qwen2.5:7b.
  3. Point beets at your library: beet config -e and set directory to your music folder.
  4. Import: beet import ~/Music. Wait for tagging to finish.
  5. Run the analyzer (script below) to extract BPM, key, and energy for every track.
  6. Generate a playlist: python3 ai-dj.py "60 minutes of warm Sunday evening, no vocals after track 6".

Twenty-five minutes of hands-on work from clean machine to working AI DJ; the full-library feature scan in step 5 runs unattended for a few hours (see the hardware table below). The rest of this guide makes it good.

Table of Contents

  1. Why Local Beats Spotify for DJ Work
  2. The Stack
  3. Hardware Reality Check
  4. Step 1 — Tag and Organize With beets
  5. Step 2 — Audio Feature Extraction With Essentia
  6. Step 3 — The Recommendation Prompt That Works
  7. Step 4 — Harmonic Mixing With Camelot Wheel
  8. Step 5 — Wire It Into Mixxx
  9. Use Cases I Actually Run
  10. Pitfalls and Performance
  11. Comparison: Local AI DJ vs Spotify DJ vs Algoriddim

Why Local Beats Spotify for DJ Work {#why-local}

Streaming services are fine if you want a passive sit-back-and-let-it-play experience. They are bad at three things that matter for DJ-style use:

They do not know your library. Spotify recommends from its catalog, not yours. If you have a private bootleg edit of a Talking Heads track, or a friend's unreleased demo, Spotify cannot build a set around it.

They do not understand transitions. Spotify will follow a 92 BPM A-minor track with a 124 BPM C-major one and call that a "vibe." A real DJ knows that hurts. A model that has Essentia's BPM and key data for both tracks can build sets that mix harmonically.

They optimize for retention, not for the room. Spotify's recommender is tuned to keep you in their ecosystem. A local model tuned on your listening history and a clear prompt — "warm Sunday brunch for eight people including two who do not like electronic music" — produces something Spotify structurally cannot.

The fourth reason is mine: the energy and tempo data sit on your laptop forever. You do not lose the work when you cancel a subscription, change services, or get rate-limited.


The Stack {#the-stack}

| Layer | Tool | Job |
| --- | --- | --- |
| Library manager | beets | Tag, dedupe, normalize metadata |
| Audio analysis | Essentia (Python bindings) | BPM, key, energy, danceability, mood |
| Recommendation engine | Ollama + Qwen 2.5 7B | Convert prompt to track sequence |
| Mixing software | Mixxx | DJ output with BPM sync and crossfade |
| Glue | Flask + SQLite | Web UI, history tracking |
| Optional voice | whisper.cpp | "Hey DJ, more like this" voice prompt |

Total cost: $0. All open source. Full disk footprint: ~6 GB plus your music library.


Hardware Reality Check {#hardware}

I have run this on several machines. Numbers below are real measurements with my 14,000-track library.

| Machine | Library scan time | Playlist gen | Verdict |
| --- | --- | --- | --- |
| Mac Mini M2 Pro 32 GB | 2 hr 14 min | 4-7 sec | Recommended |
| MacBook Air M1 8 GB | 5 hr 20 min | 11-18 sec | Works |
| Beelink SER5 (Ryzen 5800H, 32 GB) | 4 hr 50 min | 9-14 sec | Adequate |
| Synology DS923+ (Ryzen R1600, 16 GB) | 14 hr 10 min | n/a (CPU too slow for 7B) | Library scan only |

Recommended floor: any machine with 16 GB RAM and a 2020-or-later CPU. Library scan happens once. Playlist generation is the recurring cost, and any modern Mac handles it.

The library scan is one-time and overnight-friendly. Subsequent imports just analyze new tracks.


Step 1 — Tag and Organize With beets {#beets}

beets is the music librarian's tool of choice. It looks up every track on MusicBrainz, normalizes tags, organizes files, and gives you a SQLite database to query.

pip install "beets[fetchart,lyrics,lastgenre,replaygain]"

# Initial config
beet config -e

Minimum useful config:

directory: ~/Music/Library
library: ~/.beets/library.db
plugins: fetchart lyrics lastgenre chroma replaygain duplicates

import:
    move: yes
    write: yes

paths:
    default: $albumartist/$album%aunique{}/$track $title
    singleton: Non-Album/$artist/$title
    comp: Compilations/$album%aunique{}/$track $title

Run the import:

beet import ~/incoming-music

beets will prompt you to confirm matches. Press A to apply high-confidence matches (>90%); a clean library mostly sails through. Expect ~3-4 hours for a 14,000-track library on first pass.

After import you have a clean SQLite database at ~/.beets/library.db that the AI DJ will query.
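
The items table in that database is the piece the rest of the build reads. A minimal in-memory sketch of the query pattern (the real beets table has dozens of columns; the key detail is that paths are stored as bytes):

```python
import sqlite3

# In-memory stand-in for ~/.beets/library.db; illustrative schema only,
# limited to the columns the AI DJ actually touches.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (path BLOB, artist TEXT, title TEXT, length REAL)")
db.execute("INSERT INTO items VALUES (?, ?, ?, ?)",
           (b"/music/tycho/awake.flac", "Tycho", "Awake", 276.0))

rows = db.execute("SELECT path, artist, title FROM items").fetchall()
path = rows[0][0].decode()  # beets stores paths as bytes; decode before use
```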


Step 2 — Audio Feature Extraction With Essentia {#essentia}

Essentia is the best open-source audio feature library. It extracts BPM, key, danceability, energy, and a dozen other useful features. Install:

# Mac
brew install essentia

# Linux
pip install essentia-tensorflow

The analyzer script — save as analyze.py:

import json
import sqlite3
from pathlib import Path
import essentia.standard as es

DB = Path.home() / ".beets" / "library.db"
FEATURES_DB = Path.home() / ".dj" / "features.db"
FEATURES_DB.parent.mkdir(exist_ok=True)

def init_db():
    conn = sqlite3.connect(FEATURES_DB)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS features (
            path TEXT PRIMARY KEY,
            bpm REAL,
            key TEXT,
            scale TEXT,
            energy REAL,
            danceability REAL,
            duration REAL,
            loudness REAL
        )
    """)
    conn.commit()
    return conn

def analyze(path):
    audio = es.MonoLoader(filename=path, sampleRate=44100)()
    bpm, _, _, _, _ = es.RhythmExtractor2013(method="multifeature")(audio)
    key, scale, _ = es.KeyExtractor()(audio)
    danceability, _ = es.Danceability()(audio)
    # LoudnessEBUR128 expects a stereo signal, so duplicate the mono channel
    stereo = es.StereoMuxer()(audio, audio)
    loudness = es.LoudnessEBUR128()(stereo)[2]  # integrated loudness (LUFS)
    duration = len(audio) / 44100.0
    energy = float(es.Energy()(audio))
    return {
        "bpm": float(bpm),
        "key": key,
        "scale": scale,
        "danceability": float(danceability),
        "loudness": float(loudness),
        "duration": float(duration),
        "energy": energy,
    }

def main():
    conn = init_db()
    beets = sqlite3.connect(DB)
    rows = beets.execute("SELECT path FROM items").fetchall()
    for (p,) in rows:
        path = p.decode() if isinstance(p, bytes) else p
        existing = conn.execute("SELECT path FROM features WHERE path=?", (path,)).fetchone()
        if existing:
            continue
        try:
            f = analyze(path)
            conn.execute(
                "INSERT INTO features VALUES (?,?,?,?,?,?,?,?)",
                (path, f["bpm"], f["key"], f["scale"], f["energy"],
                 f["danceability"], f["duration"], f["loudness"])
            )
            conn.commit()
            print(f"OK {path} bpm={f['bpm']:.0f} key={f['key']}{f['scale']}")
        except Exception as e:
            print(f"FAIL {path}: {e}")

if __name__ == "__main__":
    main()

This produces a features.db with one row per track. On an M1, expect ~1.2 seconds per track. A 14,000-track library completes in roughly 4-5 hours unattended. Run it once with nohup overnight and forget it.

For broader audio AI patterns, see the local AI voice clone guide, which uses some of the same tooling.


Step 3 — The Recommendation Prompt That Works {#prompt-design}

This is the heart of the system. The model needs the right context to make good calls.

import json, sqlite3, subprocess
from pathlib import Path

FEATURES_DB = Path.home() / ".dj" / "features.db"

def candidates(bpm_min, bpm_max, limit=400):
    conn = sqlite3.connect(FEATURES_DB)
    rows = conn.execute(
        "SELECT path, bpm, key, scale, danceability, energy, duration "
        "FROM features WHERE bpm BETWEEN ? AND ? "
        "ORDER BY RANDOM() LIMIT ?",
        (bpm_min, bpm_max, limit)
    ).fetchall()
    return [
        {"path": r[0], "bpm": round(r[1]), "key": r[2] + r[3],
         "danceability": round(r[4], 2), "energy": round(r[5], 3),
         "duration": round(r[6])}
        for r in rows
    ]

def ask_dj(user_prompt, target_minutes=60, bpm_window=(85, 130)):
    pool = candidates(bpm_window[0], bpm_window[1])
    track_text = "\n".join(
        f"{i}: {Path(t['path']).stem} | {t['bpm']} BPM | {t['key']} | dance {t['danceability']}"
        for i, t in enumerate(pool)
    )
    prompt = f"""You are a thoughtful DJ planning a {target_minutes}-minute set.
User request: {user_prompt}

You may only choose tracks from this numbered pool. Output a JSON array of
exactly the track indices in the order they should play. Aim for total runtime
near {target_minutes} minutes. Build a smooth BPM and energy arc that matches
the request. Avoid repeating the same artist back-to-back.

POOL:
{track_text}

Output ONLY the JSON array of indices, nothing else."""
    res = subprocess.run(
        ["ollama", "run", "qwen2.5:7b", prompt],
        capture_output=True, text=True, timeout=120
    )
    raw = res.stdout.strip()
    # models sometimes wrap the array in a ``` fence; strip it before parsing
    if raw.startswith("```"):
        raw = raw.strip("`").removeprefix("json").strip()
    indices = json.loads(raw)
    return [pool[i] for i in indices]

Why this prompt design works:

  • Constraining the candidate pool to ~400 tracks fits comfortably in Qwen 2.5's context window.
  • The BPM filter at retrieval time means the model never has to reject tracks for tempo — that work is done.
  • Asking for indices instead of paths sidesteps long-string copy errors that LLMs occasionally make.
  • The "smooth arc" instruction matters more than you would expect; without it, models default to clumping similar tracks.
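
One cheap check worth running after generation: verify the plan's total runtime before writing a playlist, since models drift from the requested length. A sketch (the `runtime_ok` name and the sample `plan` list are mine, durations in seconds):

```python
def runtime_ok(plan, target_minutes, tolerance=0.15):
    """True if the set's total duration is within tolerance of the target."""
    total = sum(t["duration"] for t in plan)
    target = target_minutes * 60
    return abs(total - target) <= tolerance * target

# Three tracks totalling 845 s against a 15-minute (900 s) target
plan = [{"duration": 240.0}, {"duration": 310.0}, {"duration": 295.0}]
print(runtime_ok(plan, 15))  # True: 55 s short, within the 135 s tolerance
```

If the check fails, regenerate or ask the model to add or drop a track rather than trimming blindly.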

Sample run on my library:

$ python3 ai-dj.py "warm Sunday morning, gradually wake up, end at 105 BPM" 60
1. Erlend Øye - La Prima Estate (88 BPM, F major, dance 0.61)
2. Tycho - Awake (94 BPM, F major, dance 0.72)
3. Bonobo - Cirrus (98 BPM, G major, dance 0.81)
...
12. Caribou - Can't Do Without You (105 BPM, A major, dance 0.85)
Total runtime: 58 min 12 sec

That set is harmonically coherent (F → G → A is a clean upward modulation) and the energy arc is exactly what was asked for. Spotify did not give me anything close when I tried the same brief.


Step 4 — Harmonic Mixing With Camelot Wheel {#camelot}

Real DJs talk about keys in Camelot notation because it makes "what mixes with what" obvious. Add a Camelot mapping to your prompt context:

CAMELOT = {
    ("C", "major"): "8B", ("A", "minor"): "8A",
    ("G", "major"): "9B", ("E", "minor"): "9A",
    ("D", "major"): "10B", ("B", "minor"): "10A",
    ("A", "major"): "11B", ("F#", "minor"): "11A",
    ("E", "major"): "12B", ("C#", "minor"): "12A",
    ("B", "major"): "1B", ("G#", "minor"): "1A",
    ("F#", "major"): "2B", ("D#", "minor"): "2A",
    ("C#", "major"): "3B", ("A#", "minor"): "3A",
    ("G#", "major"): "4B", ("F", "minor"): "4A",
    ("D#", "major"): "5B", ("C", "minor"): "5A",
    ("A#", "major"): "6B", ("G", "minor"): "6A",
    ("F", "major"): "7B", ("D", "minor"): "7A",
}

Compatible mixes from any Camelot key: same number and letter (perfect match), same number with the opposite letter (relative major/minor), or one number up or down with the same letter (energy boost or drop). Add this rule to your prompt:

"Prefer transitions where consecutive tracks are within ±1 Camelot number on the same letter, or share a number across letters."

This single line transforms the output from "tracks that vaguely fit the energy" to "tracks that mix without sounding like a car crash."
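
The same rule is easy to enforce in code when you want a hard post-filter rather than a prompt preference. A minimal sketch (the function name is mine, not part of the scripts above):

```python
def camelot_compatible(a, b):
    """True if two Camelot codes (e.g. '8A', '9B') mix cleanly:
    same code, relative major/minor, or one step on the wheel."""
    na, la = int(a[:-1]), a[-1]
    nb, lb = int(b[:-1]), b[-1]
    if na == nb:
        return True            # perfect match or relative major/minor
    if la == lb:
        # one step up or down on the 12-position wheel, wrapping 12 -> 1
        return min((na - nb) % 12, (nb - na) % 12) == 1
    return False

print(camelot_compatible("12B", "1B"))  # the wheel wraps: True
```

Running it over consecutive pairs in a generated set flags any transition the model got wrong.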


Step 5 — Wire It Into Mixxx {#mixxx}

Mixxx is the open-source DJ software that reads M3U playlist files. The final piece of the system writes the playlist:

from pathlib import Path

def write_m3u(tracks, out_path):
    out = Path(out_path).expanduser()  # open() does not expand "~"
    with open(out, "w") as f:
        f.write("#EXTM3U\n")
        for t in tracks:
            f.write(f"#EXTINF:{int(t['duration'])},{Path(t['path']).stem}\n")
            f.write(f"{t['path']}\n")

write_m3u(plan, "~/Mixxx/Playlists/sunday-morning.m3u")

In Mixxx, drop the M3U into the auto DJ queue. Set crossfade to 8 seconds. Enable "Sync BPM" on both decks. The AI's harmonic ordering plus Mixxx's beat-grid sync produces transitions that sound deliberate, not accidental.
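
For reference, the file Mixxx reads is plain extended M3U; a two-track example (paths hypothetical):

```
#EXTM3U
#EXTINF:276,Tycho - Awake
/Music/Library/Tycho/Awake/01 Awake.flac
#EXTINF:331,Bonobo - Cirrus
/Music/Library/Bonobo/The North Borders/02 Cirrus.flac
```

Each #EXTINF line carries the duration in seconds and a display title; the bare path line underneath is what Mixxx uses to locate the file.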


Use Cases I Actually Run {#use-cases}

1. Dinner party (60-90 minutes, 90-110 BPM, no vocals after course 2)

Prompt: "60 minutes for a 6-person dinner. First 20 min mid-tempo background with vocals, last 40 min instrumental and slightly more energy. No songs anyone will ask 'what is this' about."

That last constraint is real. Once you specify "no conversation-stopping tracks" the model avoids dropping a Frank Ocean a cappella into a moment where people are mid-sentence.

2. Long workout (90 min, ramp 130 → 160 BPM)

Prompt: "90-minute workout playlist. Start at 130 BPM, climb to 160 by minute 60, hold 160 until minute 75, taper to 140 for cooldown. Hard four-on-the-floor only. No drum and bass."

The BPM filter and the explicit arc make this a 4-second generation.

3. Solo reading session (60 min, soft, no surprises)

Prompt: "60 minutes of music for reading. Acoustic and ambient, never above 90 BPM, never anything I have to skip. Minimal vocals, calm dynamics, no sudden volume changes."

I run this most weekends. It's better than any "focus" playlist Spotify has shown me, because the model has read every track in my "calm" tag.

4. Friend's birthday (3 hours, party arc)

Prompt: "3-hour party. 7-9 PM warm dinner-vibe, 9-10 PM throwback dance, 10-11 PM peak time. End on something nostalgic. People at this party are 30-45 and like indie rock and 90s hip-hop."

This is where local AI shines. Spotify's "party" playlist algorithm can't possibly know that the room is 30-45-year-olds with specific taste.


Pitfalls and Performance {#pitfalls}

Pitfall 1: BPM detection on broken time signatures. Essentia's RhythmExtractor2013 is good but fails on tracks with strong 6/8 grooves or ambient pieces with no strong beat. The result is sometimes half- or double-time BPM. For my library, ~3% of tracks need manual correction. Add a sanity-check step: if the BPM is below 60 or above 200 and the genre tag isn't "Drum & Bass," flag it for review.

Pitfall 2: The model invents tracks. Early on, I let the model write paths instead of indices and got confident hallucinations of tracks that don't exist. Indices into a fixed pool eliminate this entirely.

Pitfall 3: The energy curve flattens. If the BPM window is too narrow, the model produces 60 minutes of tracks that all feel the same. Use a wider BPM range than you think — (85, 130) for a "calm to energetic" arc — and let the prompt do the shaping.

Pitfall 4: Tempo locks override taste. A perfectly tempo-matched set can still be musically wrong. Trust the model when it suggests a 4-BPM jump that lands on a perfect harmonic move.

Pitfall 5: Sample rate mismatches. Your 96 kHz FLAC and your 44.1 kHz MP3 will both load fine in Mixxx, but feature extraction needs consistent sample rate. The MonoLoader resamples to 44.1 kHz automatically — keep it that way.


Comparison: Local AI DJ vs Spotify DJ vs Algoriddim {#comparison}

| Capability | Spotify DJ | Algoriddim djay AI | Local AI DJ |
| --- | --- | --- | --- |
| Plays from your local library | No | Partial | Yes |
| Works offline | No | Limited | Yes |
| Custom prompts | Limited | No | Full natural language |
| Harmonic mixing | No | Yes | Yes |
| Listening history private | No | No | Yes |
| Per-track BPM/key/energy | No | Yes | Yes |
| Monthly cost | $11.99/mo | $9.99/mo | $0 |
| Setup time | 0 min | 10 min | 4-5 hr (one time) |
| Skill ceiling | Low | Medium | High |

The local stack is more work to set up and dramatically more capable once running. For people who care about music enough to have built a library, the math is obvious.




Where to Take This Next

The same architecture extends in obvious ways: wire in the whisper.cpp layer from the stack table for "Hey DJ, more like this" voice control, grow the Flask app into a history-aware web UI, or feed your logged plays back into the prompt so the model learns what actually worked in the room.

Music libraries are unusual in modern computing: you actually own them. A local AI DJ is what owning a library should feel like — a system that knows what you have and helps you use it. The streaming services made us forget that. Building this stack remembers it.
