COMMUNITY MODEL -- LLaMA 1 ERA (2023)

Manticore 13B: Early Community Multi-Dataset Merge

Honest technical review of Manticore 13B -- a historically interesting 2023 community fine-tune that merged multiple training datasets on LLaMA-1 13B. Real MMLU ~50-53%, 2048 context window. Completely surpassed by modern models.

Parameters: 13B
MMLU (estimated): ~51%
Context window: 2,048 tokens
Status: Legacy (2023)
Published: June 1, 2023 | Last Updated: March 13, 2026 | Manually Reviewed

Important Context

Manticore 13B was released in May-June 2023 by the Open Access AI Collective. It is a LLaMA-1 based community fine-tune that merged multiple training datasets. While historically interesting as an early example of community model merging, it has been completely surpassed by modern models like Llama 3.2, Phi-3, Mistral 7B, and Qwen 2.5 -- many of which are smaller yet far more capable. This page provides an honest technical assessment for historical reference.

What Is Manticore 13B?

Manticore 13B (openaccess-ai-collective/manticore-13b) is a community-created large language model built on Meta's original LLaMA-1 13B base model. It was created by the Open Access AI Collective in mid-2023 as an experiment in multi-dataset fine-tuning -- combining several popular instruction-tuning datasets into a single training run.

Model Details

Base Model: LLaMA-1 13B (Meta)
Parameters: 13 billion
Context Length: 2,048 tokens
Architecture: Standard LLaMA transformer
Release Date: May-June 2023
License: LLaMA License (non-commercial)
Creator: Open Access AI Collective

Key Characteristics

  • Multi-dataset merge: combined ShareGPT, Alpaca, coding, and other datasets
  • Aimed for versatility rather than specialization
  • Relatively uncensored compared to commercial models of the era
  • Part of the early wave of community LLaMA fine-tunes
  • Non-commercial license inherited from LLaMA-1

Multi-Dataset Merge Training

Manticore 13B's distinguishing feature was its training approach: combining multiple popular instruction-tuning datasets into a single fine-tuning run. This was an early experiment in what the community now calls "dataset merging" -- the idea that exposure to diverse training data could produce a more versatile model than single-dataset fine-tunes like pure Alpaca or pure ShareGPT models.
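In code, the merge step itself is simple: map each source dataset onto one shared instruction/response schema, concatenate, and shuffle so training batches mix styles. A minimal sketch with illustrative records (the field names and mini-datasets are stand-ins, not Manticore's actual pipeline):

```python
import random

def normalize(example, prompt_key, response_key):
    """Map a source-specific record onto a shared prompt/response schema."""
    return {"prompt": example[prompt_key], "response": example[response_key]}

# Illustrative stand-ins for the real datasets.
sharegpt = [{"human": "Hi!", "gpt": "Hello! How can I help?"}]
alpaca = [{"instruction": "Name three primes.", "output": "2, 3, 5"}]

merged = (
    [normalize(e, "human", "gpt") for e in sharegpt]
    + [normalize(e, "instruction", "output") for e in alpaca]
)
random.seed(0)
random.shuffle(merged)  # interleave sources so no batch sees only one style
```

The fine-tune then runs over `merged` exactly as it would over a single dataset.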

Training Datasets Used

  • ShareGPT -- conversations with ChatGPT shared by users; contributed conversational ability
  • Alpaca -- Stanford's instruction-following dataset; contributed instruction compliance
  • GPT4All -- diverse instruction data; broadened general knowledge
  • Coding datasets -- code-related instruction data; added basic coding ability

Historical note: In mid-2023, this multi-dataset approach was novel. Today, techniques like DPO, RLHF, and carefully curated synthetic data have largely replaced naive dataset merging. Models like Llama 3 and Qwen 2.5 achieve far better results with more sophisticated training pipelines.

Real Benchmark Performance

Benchmark Honesty Note

Manticore 13B is a community fine-tune of LLaMA-1 13B from 2023. Its MMLU performance is estimated at roughly 50-53%, which is typical for community 13B models of that era. It does not outperform GPT-4, Claude, or any other frontier model -- such claims would be absurd for any 13B community fine-tune from 2023. Benchmarks below compare it against its actual peers: other LLaMA-1 era community models.

MMLU Comparison (Peer Models)

MMLU Score (%) -- LLaMA-1 13B Era Models

Manticore-13B: 51 (estimated)
LLaMA-1 13B (base): 47
Vicuna-13B: 52
Alpaca-13B: 48

(all figures are 5-shot MMLU accuracy, %)

Source: Community benchmarks from HuggingFace Open LLM Leaderboard (2023). Exact Manticore numbers are estimated from similar community 13B models.
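For reference, "5-shot" means each MMLU question is preceded by five worked examples from the same subject before the model is asked to answer. A minimal sketch of how such a prompt is assembled (the questions below are placeholders, not real MMLU items):

```python
def mmlu_prompt(subject, shots, question, choices):
    """Assemble a few-shot multiple-choice prompt in the usual MMLU style:
    worked examples first, then the target question with a trailing 'Answer:'."""
    lines = [f"The following are multiple choice questions (with answers) about {subject}.", ""]
    for q, opts, answer in shots:
        lines.append(q)
        lines += [f"{letter}. {opt}" for letter, opt in zip("ABCD", opts)]
        lines += [f"Answer: {answer}", ""]
    lines.append(question)
    lines += [f"{letter}. {opt}" for letter, opt in zip("ABCD", choices)]
    lines.append("Answer:")
    return "\n".join(lines)

# Placeholder items, not real MMLU questions.
demo = mmlu_prompt(
    "elementary mathematics",
    [("What is 2 + 2?", ["3", "4", "5", "6"], "B")],
    "What is 3 x 3?",
    ["6", "9", "12", "8"],
)
```

The model's accuracy is then scored on which choice letter it emits after the final "Answer:".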

Capability Estimates

Performance Metrics

General Chat: 55
Creative Writing: 58
Code Generation: 35
Reasoning: 40
Instruction Following: 52
Roleplay: 60

Estimated capability scores based on community usage reports and comparable model benchmarks.

What Manticore 13B Can and Cannot Do

Reasonable For:

  • +Basic conversational chat
  • +Simple creative writing and roleplay
  • +Basic question answering
  • +Relatively uncensored outputs

Not Suitable For:

  • -Production code generation (poor accuracy)
  • -Complex reasoning or math
  • -Long documents (2048 token limit)
  • -Factual accuracy (prone to hallucination)
  • -Commercial use (LLaMA-1 license restriction)
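The 2,048-token ceiling is the most practical of these limits: any chat front-end has to drop old turns to make room for new ones. A minimal sliding-window sketch (the whitespace word count is a rough stand-in for the real LLaMA tokenizer):

```python
def fit_context(messages, max_tokens=2048, reserve=256,
                count=lambda m: len(m.split())):
    """Keep the most recent messages whose approximate token count fits
    the context window, reserving room for the model's reply."""
    budget = max_tokens - reserve
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest -> oldest
        n = count(msg)
        if total + n > budget:
            break                   # older history gets dropped
        kept.append(msg)
        total += n
    return list(reversed(kept))     # restore chronological order
```

With a 128K-context modern model this bookkeeping is rarely needed; with Manticore it kicks in after a few paragraphs of conversation.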

VRAM Requirements by Quantization

Manticore 13B GGUF quantized files were provided by TheBloke on HuggingFace. Here are the real VRAM requirements for different quantization levels:

Quantization | File Size | VRAM Required | Quality Loss | Recommended GPU
Q4_K_M       | ~7.4 GB   | ~8 GB         | Moderate     | RTX 3060 12GB / RTX 4060 8GB
Q5_K_M       | ~9.0 GB   | ~10 GB        | Low          | RTX 3060 12GB / RTX 4070
Q8_0         | ~13.8 GB  | ~15 GB        | Minimal      | RTX 4070 Ti / RTX 3090
FP16         | ~26 GB    | ~28 GB        | None         | RTX 3090 / RTX 4090

Source: TheBloke/Manticore-13B-GGUF on HuggingFace. VRAM estimates include model weights + KV cache for 2048 context.
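The "weights + KV cache" arithmetic behind these estimates is easy to reproduce. Layer count (40) and hidden size (5120) are the standard LLaMA-1 13B values; the fixed overhead constant is a rough assumption, so expect results within a gigabyte or so of the table depending on runtime and offload settings:

```python
def kv_cache_gb(n_layers=40, hidden=5120, ctx=2048, bytes_per_elem=2):
    """FP16 key + value cache for a dense LLaMA-style model, in GiB."""
    return 2 * n_layers * hidden * ctx * bytes_per_elem / 1024**3

def vram_estimate_gb(weights_file_gb, overhead_gb=0.5):
    """Weight file size + full-context KV cache + assumed runtime overhead."""
    return weights_file_gb + kv_cache_gb() + overhead_gb

# e.g. vram_estimate_gb(7.4) for the Q4_K_M file
```

At 2,048 context the KV cache is only ~1.6 GB; the short context window is one reason these old models are comparatively cheap to serve.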

Installation Guide

Ollama Availability

Manticore 13B is not available in the official Ollama model library. You will need to use llama.cpp directly with GGUF files from HuggingFace, or create a custom Ollama Modelfile. The recommended approach is llama.cpp.

llama.cpp (Recommended)

Terminal
$ wget https://huggingface.co/TheBloke/Manticore-13B-GGUF/resolve/main/manticore-13b.Q4_K_M.gguf
Downloading manticore-13b.Q4_K_M.gguf (7.37 GB)... 100%
Saved to ./manticore-13b.Q4_K_M.gguf
$ ./llama-cli -m manticore-13b.Q4_K_M.gguf -n 256 -ngl 35 --temp 0.7
llama_model_loader: loaded meta data with 19 key-value pairs
llm_load_tensors: offloading 35 layers to GPU
llm_load_tensors: VRAM used: 7168 MB
system_info: AVX2 = 1 | CUDA = 1
> Hello! How can I assist you today?
$_

The -ngl 35 flag offloads 35 of the 13B model's 40 transformer layers to the GPU. Raise it to offload everything (any large value such as 99 works), or reduce it if you have less VRAM.

Custom Ollama Modelfile (Alternative)

Terminal
$ cat > Modelfile << EOF
FROM ./manticore-13b.Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 2048
SYSTEM "You are a helpful assistant."
EOF
$ ollama create manticore-13b -f Modelfile
transferring model data...
creating model layer...
writing manifest...
success
$ ollama run manticore-13b
>>> Send a message (/? for help)
$_

Python (Transformers)

Terminal
$pip install transformers torch accelerate
Successfully installed transformers torch accelerate
$ python3 -c "
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('openaccess-ai-collective/manticore-13b')
model = AutoModelForCausalLM.from_pretrained('openaccess-ai-collective/manticore-13b', device_map='auto')
print('Model loaded successfully')
"
Loading checkpoint shards: 100%
Model loaded successfully
$_

Note: Full FP16 loading requires ~26GB VRAM. Use load_in_4bit=True with bitsandbytes for reduced memory.
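The bare load_in_4bit=True flag still works, but recent transformers releases prefer an explicit BitsAndBytesConfig. A minimal sketch, assuming a CUDA GPU with roughly 10 GB free and bitsandbytes installed (the settings shown are common NF4 defaults, not values tested against this particular checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit normal-float quantization
    bnb_4bit_compute_dtype=torch.float16,  # matmuls run in fp16
)

tokenizer = AutoTokenizer.from_pretrained("openaccess-ai-collective/manticore-13b")
model = AutoModelForCausalLM.from_pretrained(
    "openaccess-ai-collective/manticore-13b",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on GPU, spill to CPU if needed
)
```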

Honest Assessment

Manticore 13B was a product of its time. In mid-2023, the open-source LLM community was rapidly experimenting with fine-tuning Meta's leaked LLaMA-1 weights. Manticore's multi-dataset approach was innovative for its era, but the results were modest by today's standards.

What It Did Right

Demonstrated that combining multiple datasets could produce a more well-rounded model than single-dataset fine-tunes. Contributed to the community understanding of instruction tuning. Provided an accessible, relatively uncensored model for experimentation.

Limitations

Limited to 2048 context (LLaMA-1 limitation). Non-commercial license. No RLHF or preference optimization -- just supervised fine-tuning. Prone to hallucination. Mediocre code generation. Limited reasoning ability compared to even small modern models.

Why You Should Use Something Else Today

A modern 3B parameter model like Llama 3.2 3B or Phi-3 Mini will outperform Manticore 13B on virtually every benchmark while using less than half the VRAM. The only reason to run Manticore today would be historical curiosity or very specific uncensored use cases where you need a LLaMA-1 era model.

Better Modern Alternatives

Model Comparison

Model           | Size                 | RAM Required | Speed (GPU)  | Quality (MMLU est.) | Cost/Month
Manticore-13B   | ~7.4GB (Q4_K_M GGUF) | ~10GB total  | ~20-35 tok/s | 51%                 | $0 (LLaMA license)
Vicuna-13B      | ~7.4GB (Q4_K_M GGUF) | ~10GB total  | ~20-35 tok/s | 52%                 | $0 (LLaMA license)
Llama-3.2-3B    | ~2.0GB (Q4_K_M GGUF) | ~4GB total   | ~60-90 tok/s | 63%                 | $0 (Meta license)
Phi-3 Mini 3.8B | ~2.3GB (Q4_K_M GGUF) | ~4GB total   | ~50-80 tok/s | 69%                 | $0 (MIT license)

Quality scores are MMLU estimates. Modern smaller models significantly outperform legacy 13B community fine-tunes.

Recommended Replacements

For General Chat

Llama 3.2 3B -- Better MMLU (63%), 128K context, 4x less VRAM, permissive license, available on Ollama.

ollama run llama3.2

For Coding

Qwen 2.5 Coder 7B -- Dramatically better code generation, 128K context, Apache 2.0 license.

ollama run qwen2.5-coder:7b

For Reasoning

Phi-3 Mini 3.8B -- MMLU 69%, excellent reasoning for its size, MIT license.

ollama run phi3:mini

For Uncensored Use

Mistral 7B -- Much better quality, relatively open, Apache 2.0 license, widely supported.

ollama run mistral

Historical Significance

Manticore 13B holds a place in the history of open-source AI as part of the first wave of community fine-tunes that followed Meta's LLaMA-1 release in early 2023. Along with models like Vicuna, Alpaca, and Koala, it demonstrated that community-driven model development could rapidly iterate on foundational models.

Timeline Context

Feb 2023: Meta releases LLaMA-1 (research license)
Mar 2023: Stanford Alpaca fine-tune released
Apr 2023: Vicuna, Koala, and other fine-tunes proliferate
May 2023: Manticore 13B released -- multi-dataset merge experiment
Jul 2023: LLaMA-2 released with a commercial license, making LLaMA-1 fine-tunes largely obsolete
Sep 2023: Mistral 7B released, outperforming all 13B LLaMA-1 fine-tunes

The multi-dataset merging approach pioneered by models like Manticore evolved into more sophisticated techniques. Today's model merging tools (like mergekit) and training approaches (DPO, RLHF) owe something to these early experiments, even if the specific models have been entirely superseded.


Written by Pattanaik Ramswarup

AI Engineer & Dataset Architect | Creator of the 77,000 Training Dataset

I've personally trained over 50 AI models from scratch and spent 2,000+ hours optimizing local AI deployments. My 77K dataset project revolutionized how businesses approach AI training. Every guide on this site is based on real hands-on experience, not theory. I test everything on my own hardware before writing about it.

