
AnythingLLM Setup Guide: All-in-One Local AI with RAG

February 4, 2026
18 min read
Local AI Master Research Team

AnythingLLM at a Glance

Key Stats:
• 53,000+ GitHub stars
• 30+ LLM providers supported
• 9+ vector database options
• 100% offline capable

Core Features:
• Built-in RAG for document chat
• AI agents with web/SQL/files
• No-code agent flow builder
• MCP compatibility

What is AnythingLLM?

AnythingLLM is an open-source, all-in-one Desktop and Docker AI application developed by Mintplex Labs. With 53,000+ GitHub stars, it's the most comprehensive local AI platform for document chat, RAG, and AI agents.

AnythingLLM is designed to be private by default—everything is stored and run locally on your machine. It acts as a bridge between your proprietary knowledge base and modern AI models, enabling you to build custom AI systems with zero data leakage risk.

Why AnythingLLM?

  • LLM Flexibility: Support for 30+ providers including OpenAI, Anthropic, Google, and local models via Ollama
  • Full-Stack RAG: Turn any document into searchable context for AI conversations
  • AI Agents: Built-in agent capabilities with web search, SQL, charts, and custom tools
  • No-Code Builder: Visual interface to create agentic workflows without programming
  • Multi-User Support: Role-based permissions for teams (Docker deployment)
  • 100% Offline: Works without internet when using local models
  • Cross-Platform: Available for macOS, Windows, Linux, and Docker

How AnythingLLM Works

  1. Documents are uploaded and converted to text
  2. Chunks are created with configurable overlap
  3. Embeddings are generated using your chosen model
  4. Vectors are stored in the vector database
  5. Queries are embedded and matched using cosine similarity
  6. Context is provided to the LLM for accurate responses
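
The matching in steps 2–5 can be sketched in a few lines of Python. This is a toy illustration, not AnythingLLM's code: it swaps the real embedding model for simple bag-of-words vectors so the cosine-similarity lookup is easy to follow:

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words frequency vector.
    # A real deployment would use a model such as nomic-embed-text.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Vector database": embed each chunk and keep the vectors.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Refunds are processed within 5 business days.",
]
store = [(c, embed(c)) for c in chunks]

# Query time: embed the question and return the best-matching chunk.
query = embed("when are invoices due")
best = max(store, key=lambda item: cosine(query, item[1]))
print(best[0])  # the invoice chunk is the closest match
```

A real deployment replaces `embed` with calls to an embedding model, but the store-and-match flow is the same.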

Installation Options

Desktop Application (Single-User)

The simplest way to get started—perfect for personal use.

Step 1: Download

Visit https://anythingllm.com and download for your platform:

  • macOS (Apple Silicon and Intel)
  • Windows
  • Linux (AppImage)

Step 2: Install and Launch

Run the installer and launch AnythingLLM.

Step 3: Choose Your LLM

On first boot, you can:

  • Download a built-in model (Llama-3, Phi-3, etc.)
  • Connect to Ollama
  • Configure a cloud provider

The desktop app includes a built-in LLM engine that runs models locally on CPU/GPU without external dependencies.

Docker Installation (Multi-User)

Docker deployment is ideal for servers, teams, and production environments.

System Requirements:

  • Minimum 2GB RAM
  • Minimum 10GB disk storage
  • Docker installed
  • yarn and node (for building from source)

Quick Setup:

# Pull the latest image
docker pull mintplexlabs/anythingllm:master

# Create storage directory
mkdir -p $HOME/anythingllm

# Create environment file
touch "$HOME/anythingllm/.env"

# Run the container
docker run -d -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v $HOME/anythingllm:/app/server/storage \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm:master

Access the application at http://localhost:3001

Building from Source:

# Clone the repository
git clone https://github.com/Mintplex-Labs/anything-llm.git
cd anything-llm

# Create SQLite database file
touch server/storage/anythingllm.db

# Build and run with docker-compose
docker compose up -d

Docker Notes:

  • UID and GID are set to 1000 by default
  • The --cap-add SYS_ADMIN flag is required for web scraping functionality
  • Persistent storage is mounted to preserve data across container restarts

Cloud Deployment Options

AnythingLLM offers one-click deployment templates for:

  • Railway
  • Render
  • AWS, Google Cloud, Azure
  • Custom cloud infrastructure

Connecting to Ollama (Local LLMs)

The most popular setup is AnythingLLM with Ollama for 100% offline, private AI.

Step 1: Install and Start Ollama

# Download from https://ollama.com/download
# Or install via Homebrew (macOS)
brew install ollama

# Start the Ollama server
ollama serve

Step 2: Pull Models

# For high-end systems (64GB+ RAM)
ollama pull gemma3:27b
ollama pull llama3.1:70b

# For mid-range systems (16-32GB RAM)
ollama pull llama3.1:8b
ollama pull mistral:7b

# For lighter systems (8-16GB RAM)
ollama pull gemma3:4b
ollama pull phi3:mini

# For embeddings (required for RAG)
ollama pull nomic-embed-text

Step 3: Configure AnythingLLM LLM Provider

  1. Open AnythingLLM
  2. Go to Settings (gear icon)
  3. Select LLM Provider
  4. Choose Ollama
  5. Set the Ollama URL:
| Deployment | Ollama URL |
| --- | --- |
| Desktop App | http://127.0.0.1:11434 |
| Docker (Windows/macOS) | http://host.docker.internal:11434 |
| Docker (Linux) | http://172.17.0.1:11434 |

  6. Select your model from the dropdown
  7. Click Save

Step 4: Configure Embeddings

  1. Go to Settings > Embedder
  2. Select Ollama
  3. Choose nomic-embed-text
  4. Set the same URL as above
  5. Click Save

Note: If nomic-embed-text isn't available, run:

ollama pull nomic-embed-text

Alternative: Built-in LLM Engine

AnythingLLM Desktop includes a built-in LLM engine that can download and run models directly:

  1. Go to Settings > LLM Provider
  2. Select AnythingLLM
  3. Choose a model to download (Llama-3, Phi-3, etc.)
  4. Wait for download to complete
  5. Start chatting

This option requires no external setup—perfect for beginners.


LLM Provider Options

AnythingLLM supports 30+ LLM providers for maximum flexibility.

Cloud Providers

| Provider | Models | Best For |
| --- | --- | --- |
| OpenAI | GPT-4, GPT-4o, GPT-4o-mini | Best quality, most popular |
| Anthropic | Claude 3.5 Sonnet, Opus, Haiku | Long context, safety |
| Azure OpenAI | GPT models on Azure | Enterprise, compliance |
| Google | Gemini Pro, Gemini Ultra | Multimodal, long context |
| AWS Bedrock | Various | AWS ecosystem |
| Mistral AI | Mistral Large, Codestral | Code, European data |
| Groq | Llama, Mixtral | Ultra-fast inference |
| Cohere | Command R, Command R+ | Enterprise RAG |
| Perplexity AI | Online models | Real-time web data |
| Together AI | Open-source models | Model variety |
| OpenRouter | 100+ models | Model aggregation |
| Hugging Face | Custom models | Research, fine-tuned |

Local Providers

| Provider | Setup Complexity | Best For |
| --- | --- | --- |
| Built-in Engine | None | Beginners, quick start |
| Ollama | Low | Most popular, versatile |
| LM Studio | Low | GUI-based, visual users |
| LocalAI | Medium | Docker deployments |
| KoboldCPP | Low | GGML models |

Generic OpenAI Wrapper

For any OpenAI-compatible API not explicitly integrated:

  1. Go to Settings > LLM Provider
  2. Select OpenAI (Generic)
  3. Enter the API endpoint and key

This wrapper works with vLLM, text-generation-inference, and other OpenAI-compatible servers.

Document Processing and RAG

How RAG Works in AnythingLLM

RAG (Retrieval-Augmented Generation) enhances LLM responses with your documents:

  1. Upload: Add documents to a workspace
  2. Process: Documents are converted to text
  3. Chunk: Text is split into segments (configurable size and overlap)
  4. Embed: Chunks are converted to vectors using your embedding model
  5. Store: Vectors are saved in your vector database
  6. Query: Your questions are embedded and matched to relevant chunks
  7. Respond: The LLM receives matched context and generates accurate answers
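
Step 3 (chunking with configurable size and overlap) can be illustrated with a short sketch. The character-based splitter below is a simplification for illustration, not AnythingLLM's actual splitter:

```python
def chunk_text(text, size=200, overlap=50):
    # Split text into fixed-size chunks; each chunk repeats the last
    # `overlap` characters of the previous one, so context that straddles
    # a chunk boundary is not lost at retrieval time.
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]

doc = "A" * 450
chunks = chunk_text(doc, size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 3 chunks: 200, 200, 150 chars
```

Larger overlap preserves more cross-boundary context at the cost of storing (and embedding) more duplicated text.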

Supported Document Formats

| Category | Formats |
| --- | --- |
| Documents | PDF, DOCX, TXT, Markdown, RTF |
| Data | CSV, Excel, Spreadsheets |
| Code | Python, JavaScript, TypeScript, and more |
| Media | Audio files (with transcription) |
| Web | URLs, websites (via scraping) |

Upload Process

  1. Create a Workspace (or use existing)
  2. Click Upload Documents button
  3. Drag and drop files or paste URLs
  4. Monitor embedding progress
  5. Start chatting with your documents

Data Connectors (Docker/Cloud)

| Connector | Description |
| --- | --- |
| GitHub | Import entire repositories |
| Confluence | Import Confluence pages |
| YouTube | Extract video transcripts |
| Website Crawler | Recursively scrape websites |
| Browser Extension | Send web pages directly to workspaces |

Embedding Models

Local Embeddings:

  • Built-in embedder (runs on CPU, ~25MB download)
  • Ollama (nomic-embed-text recommended)
  • LocalAI

Cloud Embeddings:

  • OpenAI (text-embedding-3-small, text-embedding-3-large)
  • Azure OpenAI
  • Cohere
  • Voyage AI

Important: Embedding models are set system-wide. Changing embedders requires re-embedding all documents.


Vector Database Options

AnythingLLM supports multiple vector databases for different use cases.

Built-in: LanceDB (Default)

LanceDB is an embedded, serverless vector database that runs directly inside AnythingLLM—no separate server required.

Advantages:

  • Zero configuration
  • Scales to millions of vectors on disk
  • Incredible retrieval speed
  • Native reranking support
  • Perfect for edge computing and desktop apps

When to use: Most users should stick with LanceDB unless they have specific requirements.

Other Vector Database Options

| Database | Type | Setup | Best For |
| --- | --- | --- | --- |
| Chroma | Local/Cloud | Server required | Rapid prototyping, flexibility |
| Pinecone | Cloud | API key | Enterprise scale, managed |
| Milvus | Self-hosted | Server required | Maximum control, large scale |
| Qdrant | Local/Cloud | Server required | Performance, advanced filtering |
| Weaviate | Local/Cloud | Server required | Semantic search |
| PGVector | PostgreSQL | Existing DB | Postgres infrastructure |
| Zilliz | Cloud | API key | Managed Milvus |
| AstraDB | Cloud | API key | Cassandra-based |

Configuring Vector Databases

  1. Go to Settings > Vector Database
  2. Select your provider
  3. Enter connection details (URL, API key, etc.)
  4. Click Save

Note: Changing vector databases requires re-embedding all documents.


AI Agents

Agents extend LLM capabilities with real-world tools and actions.

Invoking Agents

Type @agent in any chat to activate agent mode. The agent can then use enabled tools to complete complex tasks.

Built-in Agent Tools

| Tool | Description | Always Enabled |
| --- | --- | --- |
| RAG Search | Search embedded documents | Yes |
| Summarize Documents | Condense workspace content | Yes |
| Web Browsing | Search the internet | No (requires API key) |
| Web Scraping | Extract and embed website content | No |
| Save Files | Store information to local files | No |
| List Documents | View all accessible documents | No |
| Chart Generator | Create data visualizations | No |
| SQL Connector | Query databases | No (requires config) |

Enabling Agent Tools

  1. Go to Settings > Agent Skills
  2. Toggle on desired tools
  3. Configure API keys for web search:
    • SerpApi: Google, Amazon, Baidu, Google Maps
    • SearchApi: Google, Bing, Baidu, YouTube

Model Recommendations for Agents

Not all models work well as agents. For best results:

| Recommendation | Why |
| --- | --- |
| Use 8B+ parameters | Better reasoning capabilities |
| Prefer 8-bit quantization | More reliable than 4-bit |
| Choose tool-calling models | Native function calling support |

Recommended models for agents:

  • Llama 3.1 8B/70B
  • Mistral 7B
  • GPT-4o / GPT-4o-mini
  • Claude 3.5 Sonnet

Agent Flows (No-Code)

Agent Flows provides a visual drag-and-drop interface for building workflows:

  1. Go to Agent Flows section
  2. Create a new flow
  3. Drag components onto canvas
  4. Connect steps and configure triggers
  5. Save and activate

Use cases:

  • Automated document processing
  • Scheduled web scraping
  • Multi-step research workflows
  • Custom chatbot behaviors

Custom Agent Skills

Power users can create custom agent skills with code:

  1. Enable Community Hub (disabled by default for security)
  2. Browse available skills
  3. Or create your own using the SDK

// Example custom skill structure (simplified)
module.exports = {
  name: "My Custom Skill",
  description: "What this skill does",
  execute: async (context, params) => {
    // Your implementation
    return "Result";
  }
};

Workspaces

Workspaces organize your documents and conversations into isolated environments.

Creating Workspaces

  1. Click + New Workspace in the sidebar
  2. Name your workspace
  3. Add description (optional)
  4. Start uploading documents

Workspace Features

| Feature | Description |
| --- | --- |
| Isolated Documents | Each workspace has its own embedded content |
| Separate Chat History | Conversations are workspace-specific |
| Custom Settings | Different LLM per workspace (Docker/Cloud) |
| Prompt Templates | Customizable system prompts |
| Temperature Control | Adjust creativity per workspace |

Workspace Organization Strategies

  • By Project: One workspace per client or project
  • By Topic: Separate workspaces for different knowledge domains
  • By Team: Workspaces scoped to specific user groups
  • By Stage: Development, testing, production workspaces

Multi-User Permissions (Docker/Cloud)

| Role | Capabilities |
| --- | --- |
| Admin | Full access, user management, all settings |
| Manager | Workspace creation, document management |
| Default User | Chat access to assigned workspaces |

API and Embedding Options

Developer API

AnythingLLM provides a comprehensive REST API:

Endpoints include:

  • Workspace management (CRUD)
  • Document upload and embedding
  • Chat completions
  • User management (Docker/Cloud)
  • System configuration

Example API call:

curl -X POST "http://localhost:3001/api/v1/workspace/my-workspace/chat" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are the key findings?"}'
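
The same call can be assembled in Python. The sketch below only builds the request; the workspace slug and API key are placeholders, and the commented lines show how it would be sent using the third-party `requests` package:

```python
def build_chat_request(base_url, workspace_slug, api_key, message):
    # Assemble the pieces of a workspace chat call
    # (POST /api/v1/workspace/<slug>/chat).
    url = f"{base_url}/api/v1/workspace/{workspace_slug}/chat"
    headers = {
        "Authorization": f"Bearer {api_key}",  # placeholder key
        "Content-Type": "application/json",
    }
    payload = {"message": message}
    return url, headers, payload

url, headers, payload = build_chat_request(
    "http://localhost:3001", "my-workspace", "YOUR_API_KEY",
    "What are the key findings?",
)
print(url)
# To actually send it (requires a running instance):
# import requests
# resp = requests.post(url, headers=headers, json=payload)
# print(resp.json())
```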

Chat Widget Embedding

Embed AnythingLLM as a chat widget on any website:

<script
  data-embed-id="your-embed-id"
  data-base-api-url="https://your-instance.com/api/embed"
  src="https://your-instance.com/embed/anythingllm-chat-widget.min.js">
</script>

Customization Options:

| Attribute | Description |
| --- | --- |
| data-position | Widget position (bottom-right, bottom-left, top-right, top-left) |
| data-assistant-name | Custom assistant name |
| data-assistant-icon | Custom assistant icon URL |
| data-window-height | Chat window height (px, %, rem) |
| data-window-width | Chat window width (px, %, rem) |
| data-text-size | Chat text size in pixels |
| data-username | Client identifier for logging |
| data-open-on-load | Auto-open widget on page load |
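
Combining a few of these attributes, a customized embed tag might look like the following. The embed ID, instance URLs, assistant name, and dimension values are placeholders for your own instance:

```html
<script
  data-embed-id="your-embed-id"
  data-base-api-url="https://your-instance.com/api/embed"
  data-position="bottom-right"
  data-assistant-name="Docs Assistant"
  data-window-height="600px"
  data-window-width="400px"
  src="https://your-instance.com/embed/anythingllm-chat-widget.min.js">
</script>
```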

Comparison: AnythingLLM vs Alternatives

AnythingLLM vs PrivateGPT vs LibreChat

| Feature | AnythingLLM | PrivateGPT | LibreChat |
| --- | --- | --- | --- |
| Primary Focus | Document chat + RAG + agents | Offline document Q&A | ChatGPT-style multi-provider UI |
| GitHub Stars | 53,000+ | High | 33,000+ |
| LLM Providers | 30+ | Limited, local | Many |
| RAG | Built-in, production-ready | Built-in | Limited/plugin |
| AI Agents | Extensive (web, SQL, charts) | Basic | Good |
| No-Code Builder | Yes | No | No |
| Multi-User | Yes (Docker) | Limited | Yes |
| Enterprise Auth | Basic | Limited | Strong (OAuth2, SSO) |
| Best For | Document-aware AI apps | Maximum privacy offline | Enterprise chat teams |

When to Choose AnythingLLM

  • Need comprehensive document chat with RAG
  • Want AI agents with real tools (web, SQL, files)
  • Building custom AI applications with embedding
  • Need both local and cloud LLM options
  • Want no-code workflow building
  • Deploying for personal use or small teams

When to Choose Alternatives

Choose PrivateGPT if:

  • Strict offline-only requirement
  • Maximum privacy is paramount
  • Simple document Q&A without extras

Choose LibreChat if:

  • Need enterprise authentication (OAuth2, LDAP, SSO)
  • Want ChatGPT-like interface
  • Large team with complex permissions
  • Primary use is chat, not document RAG

Use Cases and Industry Applications

Industry Applications

| Industry | Use Cases |
| --- | --- |
| Legal | Contract analysis, case research, document review |
| Finance | Report analysis, compliance Q&A, market research |
| Healthcare | Medical literature, patient education, protocols |
| Education | Course material Q&A, research assistance, tutoring |
| Software | Code documentation, technical specs, API docs |
| HR | Policy Q&A, onboarding docs, employee handbook |
| Sales | Product knowledge, competitor analysis, proposals |

Common Workflows

Document Q&A:

  1. Upload company documents to workspace
  2. Ask questions in natural language
  3. Get answers with source citations

Research Assistant:

  1. Enable web browsing agent
  2. Upload existing research papers
  3. Ask agent to research and compare

Meeting Intelligence:

  1. Upload meeting transcripts (or record in Desktop)
  2. Ask for summaries, action items, decisions
  3. Search across all meetings

Customer Support:

  1. Upload product documentation
  2. Create support workspace
  3. Embed chat widget on website
  4. Customers get instant accurate answers

Pricing

Self-Hosted (Free)

| Feature | Included |
| --- | --- |
| Desktop App | Free forever |
| Docker Deployment | Free forever |
| All features | Full access |
| Updates | Community releases |
| Support | GitHub issues, Discord |

Cloud Hosted

| Tier | Price | Features |
| --- | --- | --- |
| Starter | $50/month | Up to 3 users, 100 documents, private instance |
| Pro | $99/month | Larger teams, 72-hour support SLA |
| Enterprise | Custom | White-glove service, on-premise support, SLA |

Troubleshooting

Ollama Connection Issues

Problem: "Cannot connect to Ollama"

Solutions:

  1. Ensure Ollama is running: ollama serve
  2. Check the URL is correct for your deployment type
  3. For Docker, verify network connectivity
  4. Check if firewall is blocking port 11434

Documents Not Embedding

Problem: Documents stuck in processing

Solutions:

  1. Check embedding model is configured
  2. Verify vector database is running
  3. Check available disk space
  4. Review error logs in console

Agent Not Working

Problem: Agent commands not executing

Solutions:

  1. Verify agent tools are enabled in settings
  2. Check model supports tool calling
  3. Ensure API keys are configured for web tools
  4. Try a larger model (8B+ parameters)

Performance Issues

Problem: Slow response times

Solutions:

  1. Use smaller/quantized models
  2. Reduce chunk size for documents
  3. Limit context window in workspace settings
  4. Consider cloud LLM for faster inference

Key Takeaways

  1. AnythingLLM is the most comprehensive all-in-one local AI platform
  2. Built-in RAG makes document chat simple with zero configuration
  3. 30+ LLM providers give maximum flexibility—local or cloud
  4. AI agents extend capabilities with web, SQL, and file tools
  5. No-code builder enables workflow automation without programming
  6. Works 100% offline with Ollama or built-in engine
  7. Free and open-source for self-hosting—paid cloud available

Next Steps

  1. Set up Ollama for local models
  2. Learn RAG in depth
  3. Compare vector databases for your use case
  4. Explore AI agents capabilities

AnythingLLM is the most complete local AI platform available—combining RAG, agents, and flexibility in one polished package. Whether you're building private document chat for personal use or deploying AI-powered applications for your organization, AnythingLLM provides the tools you need without compromising on privacy or control.



Written by Pattanaik Ramswarup

AI Engineer & Dataset Architect | Creator of the 77,000 Training Dataset
