Phoenix 7B
Multilingual Chat Model for 40+ Languages
Updated: March 16, 2026
Historical Context
Phoenix 7B was released in April 2023 by FreedomIntelligence (CUHK Shenzhen) as part of their "Democratizing ChatGPT across Languages" research. Built on BLOOMZ-7B1-MT, it was notable for supporting 40+ languages including many low-resource ones. In 2026, modern multilingual models like Qwen 2.5 and Llama 3 offer dramatically better performance.
What Is Phoenix 7B?
Phoenix 7B is a multilingual instruction-tuned language model developed by the FreedomIntelligence research group at the Chinese University of Hong Kong, Shenzhen (CUHK-SZ). The project aimed to "democratize ChatGPT across languages" — creating an open-source chatbot that could serve users in languages underrepresented by English-centric models.
Phoenix was built on top of BLOOMZ-7B1-MT, the multilingual variant of the BLOOM model family developed by BigScience. By starting from a base model already trained on 46 languages, Phoenix inherited broad multilingual capabilities that were then enhanced through instruction tuning on conversation data in multiple languages.
The name "Phoenix" was chosen to represent the model's goal of giving new life to language diversity in AI — enabling speakers of underserved languages to access ChatGPT-like capabilities locally and for free.
Technical Architecture
Base Model: BLOOMZ-7B1-MT
- Architecture: Transformer decoder-only
- Parameters: ~7.1 billion
- Hidden Size: 4096
- Layers: 30 transformer blocks
- Attention Heads: 32
- Context Length: 2048 tokens
- Vocabulary: 250,680 tokens
- Pre-training: ROOTS corpus (46 languages, 1.6TB text)
BLOOM's vocabulary is one of the largest of any LLM — designed for efficient multilingual tokenization.
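These figures can be sanity-checked without downloading the weights. A minimal sketch, assuming the published Phoenix checkpoint exposes the standard BLOOM config fields:
from transformers import AutoConfig

# Reads only the small config.json, not the ~14 GB of weights.
config = AutoConfig.from_pretrained("FreedomIntelligence/phoenix-inst-chat-7b")
print(config.hidden_size)   # 4096
print(config.n_layer)       # 30 transformer blocks
print(config.n_head)        # 32 attention heads
print(config.vocab_size)    # ~250K-entry multilingual vocabulary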
Phoenix Instruction Tuning
- Method: Supervised fine-tuning (SFT)
- Training Data: User-shared conversations + multilingual instructions
- Languages Covered: 40+ in training data
- Chat Format: Multi-turn conversation support
- Organization: FreedomIntelligence, CUHK-SZ
- Release Date: April 2023
- Companion Model: Chimera (LLaMA-based variant)
- Paper: arXiv:2304.10453
Why BLOOMZ as the Base?
The FreedomIntelligence team chose BLOOMZ-7B1-MT (rather than LLaMA) because BLOOM was pre-trained on 46 languages from the start. LLaMA's training data was ~95% English, making it a poor foundation for multilingual work. BLOOM's 250K-token vocabulary also handles non-Latin scripts (Arabic, Chinese, Hindi, etc.) far more efficiently than LLaMA's 32K English-centric vocabulary.
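A rough way to see this effect is to count tokens per sentence with both tokenizers. A minimal sketch: the BLOOM tokenizer ID is official, while the LLaMA-family tokenizer shown here (huggyllama/llama-7b) is an assumption and can be swapped for any LLaMA tokenizer you have access to.
from transformers import AutoTokenizer

# BLOOM's 250K multilingual vocabulary vs. a 32K English-centric vocabulary.
bloom_tok = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1-mt")
llama_tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # assumed mirror

samples = {
    "English": "The weather is nice today.",
    "Arabic": "الطقس جميل اليوم.",
    "Hindi": "आज मौसम अच्छा है।",
    "Chinese": "今天天气很好。",
}

for lang, text in samples.items():
    n_bloom = len(bloom_tok(text).input_ids)
    n_llama = len(llama_tok(text).input_ids)
    print(f"{lang:8s} BLOOM: {n_bloom:3d} tokens | LLaMA: {n_llama:3d} tokens")
On non-Latin scripts the LLaMA tokenizer typically falls back to byte-level pieces, producing several times more tokens per sentence than BLOOM's tokenizer.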
Note: FreedomIntelligence also released "Chimera", a LLaMA-13B based variant for comparison. The paper found Chimera performed better on English tasks while Phoenix was better for non-English languages.
Multilingual Capabilities
Phoenix's key strength was broad language coverage inherited from BLOOM. While quality varied significantly across languages, it provided usable instruction following in many languages for which few open-source chat alternatives existed in early 2023.
Strong Support
Languages with good training data coverage
- English, Chinese (Simplified/Traditional)
- French, Spanish, Portuguese
- Arabic, Hindi
- Indonesian, Vietnamese
Moderate Support
Usable but lower quality
- German, Italian, Dutch
- Russian, Turkish, Thai
- Japanese, Korean
- Bengali, Urdu, Swahili
Basic/Limited
Inherited from BLOOM, limited fine-tuning
- Many African languages (Yoruba, Igbo, etc.)
- Southeast Asian (Khmer, Lao, Myanmar)
- Various Indic languages
- Other low-resource languages from ROOTS
Important Limitation
While Phoenix supports many languages, its instruction-following quality in most non-English languages was significantly below ChatGPT-level even in 2023. The model often reverted to English for complex queries, produced inconsistent output quality across languages, and sometimes mixed languages within a single response. The paper acknowledged this openly: the goal was to demonstrate feasibility, not to claim state-of-the-art multilingual performance.
Honest Performance Assessment
Benchmark Context
Phoenix was evaluated primarily through human evaluation and GPT-4 scoring rather than standard benchmarks like MMLU. Its BLOOM base model scored modestly on English benchmarks. The research focus was on multilingual capability breadth rather than peak English performance.
BLOOM-7B1 Base Model Benchmarks
Phoenix inherits BLOOM's knowledge capabilities; instruction tuning improved conversational formatting, not core knowledge.
| Benchmark | BLOOM 7B1 | LLaMA 7B | Mistral 7B |
|---|---|---|---|
| MMLU (5-shot) | ~26% | 35.1% | 60.1% |
| HellaSwag | ~59% | 76.1% | 81.3% |
| ARC (Challenge) | ~36% | 47.6% | 55.5% |
| TruthfulQA | ~31% | 33.0% | 42.2% |
Source: Open LLM Leaderboard, BigScience BLOOM evaluation. BLOOM scores lower than LLaMA on English benchmarks because its training data was split across 46 languages (vs LLaMA's ~95% English).
What Phoenix Does Well
- Basic chat in 40+ languages (few open models offered this in early 2023)
- Chinese-English bilingual conversations
- Simple instruction following in non-English languages
- Demonstrating multilingual instruction-tuning feasibility
- Running completely offline with multilingual capability
Where Phoenix Falls Short
- English performance below LLaMA 7B (BLOOM base limitation)
- Complex reasoning in any language
- Code generation (minimal training data)
- Long-form content (2048 token context)
- Consistency — often switches languages mid-response
- Low-resource languages still very limited quality
VRAM Requirements
| Quantization | File Size | VRAM Required | Quality Impact | Notes |
|---|---|---|---|---|
| Q4_K_M | ~4.5 GB | ~5.5 GB | Moderate loss | Multilingual quality affected more than English |
| Q5_K_M | ~5.3 GB | ~6.3 GB | Acceptable | Better for non-Latin scripts |
| Q8_0 | ~7.5 GB | ~8.5 GB | Near-lossless | Recommended for multilingual use |
| FP16 | ~14.5 GB | ~15.5 GB | Full precision | 24GB+ GPU required |
BLOOM's 250K vocabulary means larger embedding tables and slightly more VRAM than LLaMA-based 7B models at the same quantization level.
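The table figures can be approximated from the parameter count. A back-of-envelope sketch: the bits-per-weight values are rough assumptions for each quantization scheme, and real files add overhead for the large embedding table and metadata.
# Rough weight-memory estimate; excludes KV cache and activations, which add
# roughly another 0.5-1 GB at the 2048-token context length.
params = 7.1e9  # approximate Phoenix / BLOOM-7B1 parameter count

for name, bits in [("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5), ("FP16", 16.0)]:
    gib = params * bits / 8 / 1024**3
    print(f"{name:7s} ~{gib:.1f} GB for weights alone")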
Running Phoenix 7B
Availability Note
Phoenix 7B is not available on Ollama. It can be run via HuggingFace Transformers. For multilingual chat in 2026, modern Ollama models like Qwen 2.5 and Llama 3 are dramatically better options.
Using HuggingFace Transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_name = "FreedomIntelligence/phoenix-inst-chat-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float16,
device_map="auto"
)
# Phoenix uses a conversation format
prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n\nHuman: Explain quantum computing in simple terms.\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Requires ~15GB VRAM for FP16. Use load_in_4bit=True with bitsandbytes for ~5GB VRAM.
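Recent transformers versions prefer passing quantization settings through BitsAndBytesConfig rather than a bare load_in_4bit flag. A minimal sketch, assuming bitsandbytes is installed and a CUDA GPU is available:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization at load time; weights fit in roughly 5 GB of VRAM.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "FreedomIntelligence/phoenix-inst-chat-7b",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("FreedomIntelligence/phoenix-inst-chat-7b")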
License
Phoenix 7B was released under the Apache 2.0 license, permitting both research and commercial use. However, the BLOOMZ base model has its own license terms (BigScience RAIL License) which include responsible use restrictions. Check both license terms before commercial deployment.
Modern Multilingual Alternatives (2026)
For multilingual tasks in 2026, these models dramatically outperform Phoenix 7B:
| Model | Size | MMLU | Languages | Context | Ollama |
|---|---|---|---|---|---|
| Phoenix 7B | 7B | ~26% | 40+ | 2K | Not available |
| Qwen 2.5 7B | 7B | ~74% | 29+ | 128K | ollama run qwen2.5:7b |
| Llama 3.2 3B | 3B | ~63% | 8+ | 128K | ollama run llama3.2:3b |
| Gemma 2 9B | 9B | ~72% | Multi | 8K | ollama run gemma2:9b |
Qwen 2.5 7B is the strongest multilingual recommendation — 3x higher MMLU, 64x longer context, and available on Ollama with Apache 2.0 license.
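For completeness, the same kind of multilingual chat can be driven from Python against the recommended replacement. A minimal sketch, assuming the ollama Python client is installed (pip install ollama) and qwen2.5:7b has already been pulled:
import ollama

# Multilingual chat through a local Ollama server running qwen2.5:7b.
response = ollama.chat(
    model="qwen2.5:7b",
    messages=[{"role": "user", "content": "用三句话解释量子计算。"}],  # "Explain quantum computing in three sentences."
)
print(response["message"]["content"])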
Frequently Asked Questions
What is Phoenix 7B?
Phoenix 7B is a multilingual instruction-tuned chat model created by FreedomIntelligence (CUHK Shenzhen) in April 2023. Built on BLOOMZ-7B1-MT, it supports conversations in 40+ languages, particularly focusing on languages underserved by English-centric models.
Can I run Phoenix 7B on Ollama?
No, Phoenix 7B is not available on Ollama. You can run it via HuggingFace Transformers. For multilingual chat on Ollama, use ollama run qwen2.5:7b instead.
Is Phoenix 7B still worth using in 2026?
For practical use, no. Modern models like Qwen 2.5 7B offer 3x higher MMLU scores, 64x longer context, and better multilingual performance. Phoenix is primarily of historical interest as an early multilingual open-source chat model.
How much VRAM does Phoenix 7B need?
Phoenix 7B requires ~5.5GB VRAM with Q4_K_M quantization, ~8.5GB with Q8_0, or ~15.5GB at FP16 precision. BLOOM's large 250K vocabulary means slightly more memory than LLaMA-based 7B models.
Sources & References
- arXiv:2304.10453 — "Phoenix: Democratizing ChatGPT across Languages" — Chen et al., 2023 (original paper)
- FreedomIntelligence/phoenix-inst-chat-7b — HuggingFace — Official model page
- github.com/FreedomIntelligence/LLMZoo — LLM Zoo repository (Phoenix + Chimera)
- arXiv:2211.05100 — "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model" — BigScience, 2022 (base model family)
Written by Pattanaik Ramswarup
Creator of Local AI Master
I build Local AI Master around practical, testable local AI workflows: model selection, hardware planning, RAG systems, agents, and MLOps. The goal is to turn scattered tutorials into a structured learning path you can follow on your own hardware.