Solar 10.7B Base
Depth Up-Scaling Architecture & Local Deployment Guide
Solar 10.7B is a base language model from Korean AI company Upstage, released in December 2023. Its key innovation is Depth Up-Scaling (DUS): rather than training a 10.7B model from scratch, Upstage duplicated layers from a pretrained Llama 2 base and continued pretraining, producing a larger model that inherits existing knowledge. This is the base (pretrained) version; for the instruction-tuned variant, see Solar 10.7B Instruct.
Technical Overview
Model Specifications
- Developer: Upstage (Seoul, South Korea)
- Release Date: December 2023
- Parameters: 10.7 billion
- Architecture: DUS (Depth Up-Scaling) based on Llama 2
- Layers: 48 transformer layers
- Hidden Dimension: 4,096
- Attention Heads: 32
- Context Window: 4,096 tokens
- Vocabulary: 32,000 tokens (Llama 2 tokenizer)
- License: Apache 2.0 (fully open, commercial use allowed)
- Model Type: Base (pretrained, not instruction-tuned)
What Makes Solar Different
Solar 10.7B stands out for one reason: DUS (Depth Up-Scaling). Instead of training from scratch, Upstage took a pretrained Llama 2 model and duplicated its transformer layers to create a deeper network. They then continued pretraining on additional data.
This approach has a key advantage: training a 10.7B model via DUS is significantly cheaper than training one from random initialization, because the duplicated layers already contain useful representations.
As a base model, Solar 10.7B is pretrained on next-token prediction but not fine-tuned for following instructions. It is primarily useful for:
- Fine-tuning on your own dataset
- Text completion tasks
- Research into DUS architecture
- Building custom instruction-tuned variants
DUS Architecture Explained
How Depth Up-Scaling Works
DUS is described in Upstage's paper "SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling" (December 2023). The process:
1. Start with a pretrained 7B-class Llama 2 model: 32 transformer layers that already encode general language knowledge from pretraining.
2. Duplicate layers to deepen the network: copy a subset of the transformer layers and stack them, growing the model from 32 to 48 layers (~7B to 10.7B parameters) without any random initialization.
3. Continue pretraining on additional data so the duplicated layers learn to differentiate from their originals and the full 48-layer network converges to a coherent model.
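The layer-stacking step can be sketched as simple index arithmetic. Per the DUS paper, the 32-layer base is duplicated, the final 8 layers are trimmed from the first copy and the initial 8 from the second, and the two halves are concatenated into a 48-layer stack. This is a sketch of the layer selection only; the variable names are illustrative, not from Upstage's code:

```python
# Sketch of DUS layer selection: n = 32 base layers scaled to s = 48.
# Per the paper, m layers are trimmed from the seam of each copy so
# that 2 * (n - m) == s, i.e. m = 8 here.
n, s = 32, 48
m = (2 * n - s) // 2                 # layers trimmed per copy -> 8

copy_a = list(range(0, n - m))       # layers 0..23 of the base model
copy_b = list(range(m, n))           # layers 8..31 of the base model
dus_layers = copy_a + copy_b         # 48-layer stack before continued pretraining

assert len(dus_layers) == s          # 24 + 24 = 48 layers
```

Continued pretraining is then what turns this redundant stack into a model that outperforms its 32-layer parent.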
DUS vs Other Scaling Methods
| Method | Approach | Training Cost | Example |
|---|---|---|---|
| DUS (Solar) | Duplicate layers from pretrained model + continue training | Low | Solar 10.7B |
| Train from scratch | Random initialization, full pretraining | Very High | Llama 2, Mistral |
| MoE | Multiple expert sub-networks, sparse activation | Medium-High | Mixtral 8x7B |
| Knowledge Distillation | Smaller model trained to mimic larger teacher | Low-Medium | TinyLlama |
Source: Upstage, "SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling" (arXiv:2312.15166)
Base vs Instruct: Which to Use
Upstage released two versions of Solar 10.7B. This page covers the base model. If you want a chatbot or instruction-following assistant, use the Instruct version instead.
| Feature | Solar 10.7B Base (this page) | Solar 10.7B Instruct |
|---|---|---|
| HuggingFace ID | upstage/SOLAR-10.7B-v1.0 | upstage/SOLAR-10.7B-Instruct-v1.0 |
| Training | Pretrained (next-token prediction) | + SFT + DPO alignment |
| Best For | Fine-tuning, text completion, research | Chat, Q&A, instruction following |
| MMLU | ~66% | ~66.2% (marginal improvement) |
| Ollama | ollama run solar | ollama run solar:10.7b-instruct-v1-q4_K_M |
Recommendation: Most users should use the Instruct version. The base model is primarily for researchers and developers who want to fine-tune on their own data.
Benchmarks
MMLU Comparison (5-shot, base models)
[Chart: MMLU comparison of base models, with Yi 34B included as an upper reference (3x the parameters). Source: HuggingFace Open LLM Leaderboard (v1).]
Open LLM Leaderboard Scores (Base Model)
| Benchmark | Solar 10.7B | Llama 2 13B | Mistral 7B |
|---|---|---|---|
| MMLU (5-shot) | ~66% | ~55% | ~60.1% |
| ARC-Challenge (25-shot) | ~61% | ~59% | ~60% |
| HellaSwag (10-shot) | ~84% | ~82% | ~83% |
| Winogrande (5-shot) | ~83% | ~76% | ~78% |
Source: HuggingFace Open LLM Leaderboard (v1), Upstage model card. Scores are approximate; check the leaderboard for latest values.
Honest Assessment
Strengths
- Beats Llama 2 13B on MMLU despite fewer parameters
- Apache 2.0 license (fully open for commercial use)
- Good base for fine-tuning custom models
- DUS approach is cheaper to replicate than training from scratch
- Compact enough to quantize and run on consumer GPUs
Limitations
- Only 4,096 context tokens (short by 2024+ standards)
- Released December 2023; newer models have surpassed it
- Base model is not directly useful for chat without fine-tuning
- DUS paper does not report Korean-specific benchmarks for the base model
- No code-specific training (not competitive for coding tasks)
VRAM by Quantization
| Quantization | Model Size | VRAM Required | Quality Loss | Compatible Hardware |
|---|---|---|---|---|
| FP16 | ~21 GB | ~24 GB | None | RTX 3090/4090, A5000, A100 |
| Q8_0 | ~11 GB | ~13 GB | Minimal | RTX 3090/4090, Apple M2 Pro 16GB |
| Q4_K_M (recommended) | ~6 GB | ~7 GB | Small | RTX 3060 12GB, Apple M1 8GB, RTX 4060 |
| Q4_0 | ~5.5 GB | ~6.5 GB | Moderate | RTX 3060, Apple M1 8GB |
Sizes are approximate. VRAM includes overhead for context/KV cache at short prompts. Apple Silicon uses unified memory. Ollama defaults to Q4_K_M when you run ollama run solar.
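The sizes in the table follow from back-of-envelope arithmetic: parameter count times average bits per weight. The bits-per-weight figures below are approximations for the GGUF quantization formats (real files vary slightly with metadata and mixed tensor types):

```python
# Rough on-disk size estimates for a 10.7B-parameter model.
PARAMS = 10.7e9

def size_gb(bits_per_weight: float) -> float:
    """Approximate model file size in GB for an average bits/weight."""
    return PARAMS * bits_per_weight / 8 / 1e9

fp16 = size_gb(16.0)   # ~21.4 GB -> matches the ~21 GB row above
q8   = size_gb(8.5)    # Q8_0 stores ~8.5 bits/weight -> ~11 GB
q4   = size_gb(4.85)   # Q4_K_M averages ~4.8-5 bits/weight -> ~6.5 GB
```

Add roughly 1-2 GB on top of the file size for the KV cache and runtime overhead, which is why the VRAM column exceeds the model size.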
Installation with Ollama
Install Ollama
One-line install on macOS/Linux
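The official one-line installer from ollama.com:

```shell
# One-line install on macOS/Linux (Windows users: download the installer
# from ollama.com instead)
curl -fsSL https://ollama.com/install.sh | sh

# Verify the install
ollama --version
```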
Pull and Run Solar 10.7B
Downloads the Q4_K_M quantized version (~6 GB)
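```shell
# Pull the default Q4_K_M build and start an interactive session
ollama pull solar
ollama run solar

# One-off completion from the command line (base model: expect raw
# text continuation, not an instruction-following answer)
ollama run solar "The three laws of thermodynamics are"
```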
Alternative: HuggingFace (FP16)
Full-precision model via transformers library
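A minimal sketch of loading the FP16 base model with the transformers library (assumes a GPU with ~24 GB VRAM; the model ID is from the Upstage model card). Since this is a base model, `generate` produces a raw continuation of the prompt:

```python
# Load SOLAR-10.7B base in FP16 and run a text completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "upstage/SOLAR-10.7B-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~21 GB of weights
    device_map="auto",          # place layers on available GPU(s)
)

inputs = tokenizer("Depth Up-Scaling is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```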
Local AI Alternatives
Solar 10.7B was competitive at release (December 2023), but newer models have since surpassed it. Consider these alternatives if you need the best performance in the 7B-14B range:
| Model | Params | MMLU | Context | Why Consider |
|---|---|---|---|---|
| Qwen 2.5 14B | 14B | ~79% | 128K | Much better MMLU + 32x longer context |
| Gemma 2 27B | 27B | ~75% | 8K | Better quality, still runnable quantized on 16GB |
| Mistral Nemo 12B | 12B | ~68% | 128K | Similar size, much longer context |
| Llama 3 8B | 8B | ~66% | 8K | Similar MMLU with fewer params + 2x context |
Solar 10.7B remains a good choice if you specifically need an Apache 2.0 base model for fine-tuning, or are interested in the DUS architecture for research purposes.
Resources & References
Official Sources
- HuggingFace: SOLAR-10.7B-v1.0
Official base model weights and model card
- arXiv: SOLAR 10.7B Paper
"Scaling Large Language Models with Simple yet Effective Depth Up-Scaling" (Dec 2023)
- Upstage AI
Developer company (Seoul, South Korea)
- Ollama: Solar
Ollama model library page for Solar
Related Pages on This Site
- Solar 10.7B Instruct
The instruction-tuned version for chat and Q&A
- Mistral 7B Instruct
Popular 7B competitor from Mistral AI
- Llama 3 8B
Newer 8B model from Meta with similar MMLU
- Qwen 2.5 14B
Current leader in the 14B class
Frequently Asked Questions
Technical Questions
What is DUS (Depth Up-Scaling)?
DUS is Upstage's method for creating larger models efficiently. It takes a pretrained model (in this case Llama 2), duplicates some of its transformer layers to increase depth from 32 to 48 layers, then continues pretraining. This is cheaper than training a 10.7B model from scratch because the duplicated layers already contain useful learned representations.
How much VRAM do I need?
With Q4_K_M quantization (Ollama default): about 6-7 GB VRAM. This fits on an RTX 3060 12GB, RTX 4060, or Apple M1 with 8GB unified memory. For FP16 (full precision), you need ~24 GB VRAM (RTX 3090/4090 or A100).
Should I use the base or instruct version?
Use the Instruct version unless you plan to fine-tune on your own dataset. The base model outputs raw text completions and does not follow instructions or engage in conversation without additional training.
Practical Questions
Is Solar 10.7B still worth using in 2026?
For general use, newer models like Qwen 2.5, Llama 3, and Gemma 2 offer better performance. However, Solar 10.7B remains relevant if you need an Apache 2.0 base model for fine-tuning, or are researching DUS as a scaling technique.
Can I fine-tune Solar 10.7B?
Yes, and this is the primary use case for the base model. Use LoRA/QLoRA for efficient fine-tuning on consumer hardware. The Apache 2.0 license allows commercial use of fine-tuned derivatives. Tools like Axolotl or HuggingFace TRL work well with this model.
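A hedged sketch of a QLoRA setup with the peft and bitsandbytes libraries (hyperparameters like `r=16` and the `q_proj`/`v_proj` target modules are common starting points, not values prescribed by Upstage; dataset preparation and the training loop are omitted):

```python
# QLoRA sketch: load the base model in 4-bit and attach LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "upstage/SOLAR-10.7B-v1.0",
    quantization_config=bnb,
    device_map="auto",
)

lora = LoraConfig(
    r=16,                                   # adapter rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],    # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # a small fraction of 10.7B is trainable
```

From here, the model can be passed to a standard trainer such as HuggingFace TRL's SFTTrainer.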
Does Solar 10.7B support Korean?
Solar uses the Llama 2 tokenizer (32K vocabulary), which is primarily English-focused. While Upstage is a Korean company, the base model's Korean capabilities are limited by the tokenizer. The Instruct version has slightly better Korean support from instruction tuning data. For strong Korean NLP, consider models with dedicated Korean tokenizers.
Written by Pattanaik Ramswarup
AI Engineer & Dataset Architect | Creator of the 77,000 Training Dataset
I've personally trained over 50 AI models from scratch and spent 2,000+ hours optimizing local AI deployments. My 77K dataset project revolutionized how businesses approach AI training. Every guide on this site is based on real hands-on experience, not theory. I test everything on my own hardware before writing about it.