HISTORICAL MODEL — April 2023

Phoenix 7B

Multilingual Chat Model for 40+ Languages

Updated: March 16, 2026

Historical Context

Phoenix 7B was released in April 2023 by FreedomIntelligence (CUHK Shenzhen) as part of their "Democratizing ChatGPT across Languages" research. Built on BLOOMZ-7B1-MT, it was notable for supporting 40+ languages including many low-resource ones. In 2026, modern multilingual models like Qwen 2.5 and Llama 3 offer dramatically better performance.

  • Parameters: 7B
  • Languages: 40+
  • Context: 2048 tokens
  • License: Free, open source

What Is Phoenix 7B?

Phoenix 7B is a multilingual instruction-tuned language model developed by the FreedomIntelligence research group at the Chinese University of Hong Kong, Shenzhen (CUHK-SZ). The project aimed to "democratize ChatGPT across languages" — creating an open-source chatbot that could serve users in languages underrepresented by English-centric models.

Phoenix was built on top of BLOOMZ-7B1-MT, the multilingual variant of the BLOOM model family developed by BigScience. By starting from a base model already trained on 46 languages, Phoenix inherited broad multilingual capabilities that were then enhanced through instruction tuning on conversation data in multiple languages.

The name "Phoenix" was chosen to represent the model's goal of giving new life to language diversity in AI — enabling speakers of underserved languages to access ChatGPT-like capabilities locally and for free.

Technical Architecture

Base Model: BLOOMZ-7B1-MT

  • Architecture: Transformer decoder-only
  • Parameters: ~7.1 billion
  • Hidden Size: 4096
  • Layers: 30 transformer blocks
  • Attention Heads: 32
  • Context Length: 2048 tokens
  • Vocabulary: 250,680 tokens
  • Pre-training: ROOTS corpus (46 languages, 1.6TB text)
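The ~7.1B figure is consistent with these specs. A rough back-of-the-envelope count (ignoring biases and layer norms, which contribute comparatively few parameters) recovers it:

```python
# Rough parameter count for BLOOM-7B1 from the specs above
# (biases and layer norms ignored; BLOOM ties input/output embeddings)
vocab, hidden, layers = 250_680, 4096, 30

embedding = vocab * hidden               # embedding table (tied in/out)
attention = 4 * hidden * hidden          # Q, K, V, and output projections
mlp = 2 * hidden * (4 * hidden)          # up- and down-projection, 4x expansion
per_layer = attention + mlp

total = embedding + layers * per_layer
print(f"~{total / 1e9:.2f}B parameters")  # ~7.07B parameters
```

Note how the embedding table alone accounts for roughly a billion parameters, a direct consequence of the 250K-token vocabulary.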

BLOOM's vocabulary is one of the largest of any LLM — designed for efficient multilingual tokenization.

Phoenix Instruction Tuning

  • Method: Supervised fine-tuning (SFT)
  • Training Data: User-shared conversations + multilingual instructions
  • Languages Covered: 40+ in training data
  • Chat Format: Multi-turn conversation support
  • Organization: FreedomIntelligence, CUHK-SZ
  • Release Date: April 2023
  • Companion Model: Chimera (LLaMA-based variant)
  • Paper: arXiv:2304.10453
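A minimal sketch of how a multi-turn prompt in this chat format can be assembled. The system preamble and Human:/Assistant: markers follow the single-turn example shown later in this guide; the newline-based turn separator is an assumption:

```python
# Sketch: assembling a Phoenix-style multi-turn chat prompt.
# The preamble and turn markers match the example later in this guide;
# the exact separator between turns is an assumption.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)

def build_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs.
    The final pair should have reply=None so the model completes it."""
    parts = [SYSTEM, ""]
    for user, assistant in turns:
        parts.append(f"Human: {user}")
        parts.append(f"Assistant: {assistant}" if assistant else "Assistant:")
    return "\n".join(parts)

prompt = build_prompt([
    ("Bonjour, qui es-tu ?", "Je suis un assistant IA multilingue."),
    ("Explain quantum computing in simple terms.", None),
])
print(prompt)
```

The prompt ends with a bare "Assistant:" so the model generates the next reply.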

Why BLOOMZ as the Base?

The FreedomIntelligence team chose BLOOMZ-7B1-MT (rather than LLaMA) because BLOOM was pre-trained on 46 languages from the start. LLaMA's training data was ~95% English, making it a poor foundation for multilingual work. BLOOM's 250K-token vocabulary also handles non-Latin scripts (Arabic, Chinese, Hindi, etc.) far more efficiently than LLaMA's 32K English-centric vocabulary.

Note: FreedomIntelligence also released "Chimera", a LLaMA-13B based variant for comparison. The paper found Chimera performed better on English tasks while Phoenix was better for non-English languages.

Multilingual Capabilities

Phoenix's key strength was broad language coverage inherited from BLOOM. While quality varied significantly across languages, it provided usable instruction-following in many languages where no open-source alternative existed in early 2023.

Strong Support

Languages with good training data coverage

  • English, Chinese (Simplified/Traditional)
  • French, Spanish, Portuguese
  • Arabic, Hindi
  • Indonesian, Vietnamese

Moderate Support

Usable but lower quality

  • German, Italian, Dutch
  • Russian, Turkish, Thai
  • Japanese, Korean
  • Bengali, Urdu, Swahili

Basic/Limited

Inherited from BLOOM, limited fine-tuning

  • Many African languages (Yoruba, Igbo, etc.)
  • Southeast Asian (Khmer, Lao, Myanmar)
  • Various Indic languages
  • Other low-resource languages from ROOTS

Important Limitation

While Phoenix supports many languages, its instruction-following quality in most non-English languages was significantly below ChatGPT-level even in 2023. The model often reverted to English for complex queries, produced inconsistent output quality across languages, and sometimes mixed languages within a single response. The paper acknowledged these limitations openly: the goal was to demonstrate feasibility, not to claim state-of-the-art multilingual performance.

Honest Performance Assessment

Benchmark Context

Phoenix was evaluated primarily through human evaluation and GPT-4 scoring rather than standard benchmarks like MMLU. Its BLOOM base model scored modestly on English benchmarks. The research focus was on multilingual capability breadth rather than peak English performance.

BLOOM-7B1 Base Model Benchmarks

Phoenix inherits BLOOM's knowledge capabilities — instruction tuning improved format but not core knowledge.

| Benchmark | BLOOM 7B1 | LLaMA 7B | Mistral 7B |
|---|---|---|---|
| MMLU (5-shot) | ~26% | 35.1% | 60.1% |
| HellaSwag | ~59% | 76.1% | 81.3% |
| ARC (Challenge) | ~36% | 47.6% | 55.5% |
| TruthfulQA | ~31% | 33.0% | 42.2% |

Source: Open LLM Leaderboard, BigScience BLOOM evaluation. BLOOM scores lower than LLaMA on English benchmarks because its training data was split across 46 languages (vs LLaMA's ~95% English).

What Phoenix Does Well

  • Basic chat in 40+ languages (no other open model did this in 2023)
  • Chinese-English bilingual conversations
  • Simple instruction following in non-English languages
  • Demonstrating multilingual instruction-tuning feasibility
  • Running completely offline with multilingual capability

Where Phoenix Falls Short

  • English performance below LLaMA 7B (BLOOM base limitation)
  • Complex reasoning in any language
  • Code generation (minimal training data)
  • Long-form content (2048 token context)
  • Consistency — often switches languages mid-response
  • Low-resource languages still very limited quality

VRAM Requirements

| Quantization | File Size | VRAM Required | Quality Impact | Notes |
|---|---|---|---|---|
| Q4_K_M | ~4.5 GB | ~5.5 GB | Moderate loss | Multilingual quality affected more than English |
| Q5_K_M | ~5.3 GB | ~6.3 GB | Acceptable | Better for non-Latin scripts |
| Q8_0 | ~7.5 GB | ~8.5 GB | Near-lossless | Recommended for multilingual use |
| FP16 | ~14.5 GB | ~15.5 GB | Full precision | 24GB+ GPU required |

BLOOM's 250K vocabulary means larger embedding tables and slightly more VRAM than LLaMA-based 7B models at the same quantization level.
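To put a number on that overhead, the FP16 embedding table can be estimated directly from vocabulary and hidden size:

```python
# FP16 embedding-table size: vocab_size x hidden_size x 2 bytes per weight
def embedding_gb(vocab_size, hidden_size=4096, bytes_per_param=2):
    return vocab_size * hidden_size * bytes_per_param / 1e9

bloom_emb = embedding_gb(250_680)  # BLOOM's multilingual vocabulary
llama_emb = embedding_gb(32_000)   # LLaMA's English-centric vocabulary
print(f"BLOOM: {bloom_emb:.2f} GB, LLaMA: {llama_emb:.2f} GB")
# BLOOM: 2.05 GB, LLaMA: 0.26 GB
```

That is roughly 1.8 GB of extra weight memory at FP16 compared with a LLaMA-class 7B model, before quantization.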

Running Phoenix 7B

Availability Note

Phoenix 7B is not available on Ollama. It can be run via HuggingFace Transformers. For multilingual chat in 2026, modern Ollama models like Qwen 2.5 and Llama 3 are dramatically better options.

Using HuggingFace Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "FreedomIntelligence/phoenix-inst-chat-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # half precision: ~15GB VRAM
    device_map="auto"            # spread layers across available GPUs
)

# Phoenix expects a conversation-style prompt with Human/Assistant turns
prompt = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n"
    "Human: Explain quantum computing in simple terms.\nAssistant:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,    # sampling must be enabled for temperature to apply
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Requires ~15GB VRAM for FP16. Use load_in_4bit=True with bitsandbytes for ~5GB VRAM.
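The ~5GB figure for 4-bit loading checks out with a quick estimate. The 1.2 GB runtime overhead below is an assumption; actual usage varies with context length and batch size:

```python
# Back-of-the-envelope VRAM estimate for loading Phoenix 7B in 4-bit.
# Assumptions: 7.1e9 params at 0.5 bytes/weight, plus ~1.2 GB overhead
# for activations, KV cache, and CUDA context (rough, setup-dependent).
PARAMS = 7.1e9
weights_gb = PARAMS * 0.5 / 1e9  # 4 bits = half a byte per weight
total_gb = weights_gb + 1.2
print(f"~{total_gb:.1f} GB VRAM in 4-bit")
```

This lines up with the ~5GB quoted above; quantization constants and the large embedding table add a little on top in practice.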

License

Phoenix 7B was released under the Apache 2.0 license, permitting both research and commercial use. However, the BLOOMZ base model has its own license terms (BigScience RAIL License) which include responsible use restrictions. Check both license terms before commercial deployment.

Modern Multilingual Alternatives (2026)

For multilingual tasks in 2026, these models dramatically outperform Phoenix 7B:

| Model | Size | MMLU | Languages | Context | Ollama |
|---|---|---|---|---|---|
| Phoenix 7B | 7B | ~26% | 40+ | 2K | Not available |
| Qwen 2.5 7B | 7B | ~74% | 29+ | 128K | ollama run qwen2.5:7b |
| Llama 3.2 3B | 3B | ~63% | 8+ | 128K | ollama run llama3.2:3b |
| Gemma 2 9B | 9B | ~72% | Multi | 8K | ollama run gemma2:9b |

Qwen 2.5 7B is the strongest multilingual recommendation — 3x higher MMLU, 64x longer context, and available on Ollama with Apache 2.0 license.


Frequently Asked Questions

What is Phoenix 7B?

Phoenix 7B is a multilingual instruction-tuned chat model created by FreedomIntelligence (CUHK Shenzhen) in April 2023. Built on BLOOMZ-7B1-MT, it supports conversations in 40+ languages, particularly focusing on languages underserved by English-centric models.

Can I run Phoenix 7B on Ollama?

No, Phoenix 7B is not available on Ollama. You can run it via HuggingFace Transformers. For multilingual chat on Ollama, use ollama run qwen2.5:7b instead.

Is Phoenix 7B still worth using in 2026?

For practical use, no. Modern models like Qwen 2.5 7B offer 3x higher MMLU scores, 64x longer context, and better multilingual performance. Phoenix is primarily of historical interest as an early multilingual open-source chat model.

How much VRAM does Phoenix 7B need?

Phoenix 7B requires ~5.5GB VRAM with Q4_K_M quantization, ~8.5GB with Q8_0, or ~15.5GB at FP16 precision. BLOOM's large 250K vocabulary means slightly more memory than LLaMA-based 7B models.

Written by Pattanaik Ramswarup, creator of Local AI Master.

Published: January 25, 2025 · Last Updated: March 16, 2026
