Qwen3 Embedding 0.6B
Qwen/Qwen3-Embedding-0.6B
published Jun 2025 · updated Apr 2026
Qwen3 Embedding 0.6B is a text embedding model that converts text into dense vector representations for tasks like retrieval, classification, and clustering.
specs
| Spec | Value |
| --- | --- |
| Task | Text Embedding |
| Architecture | Dense dual-encoder transformer |
| Parameters | 0.6B |
| License | Apache 2.0 |
| Context Length | 32K tokens |
| Embedding Dimension | Up to 1024 (user-defined from 32 to 1024) |
about this model
Qwen3-Embedding-0.6B is a text embedding model that converts text into dense vector representations for retrieval, classification, clustering, and bitext mining. Built on the Qwen3 foundation, it inherits strong multilingual and reasoning capabilities, supporting over 100 languages and a context length of 32K (32,768) tokens.
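The model uses a dual-encoder architecture: queries and documents are encoded independently, and each embedding is taken from the hidden state of the final token. It supports user-defined output dimensions from 32 to 1024 through Matryoshka Representation Learning, so a full 1024-dimensional vector can be shortened to a smaller prefix. It is also instruction-aware: developers can prepend a task-specific prompt to each query, which typically improves retrieval by 1–5%.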
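The three mechanisms in the paragraph above (final-token pooling, Matryoshka truncation, and instruction prefixes) can be sketched in plain Python. This is an illustrative sketch, not the model's own code: the function names are hypothetical, the inputs are ordinary nested lists rather than tensors, and the `Instruct: ...\nQuery:...` template is assumed from the upstream Qwen3 Embedding model card and should be checked against it before use.

```python
import math
from typing import List, Sequence


def last_token_pool(hidden_states: Sequence[Sequence[Sequence[float]]],
                    attention_mask: Sequence[Sequence[int]]) -> List[List[float]]:
    """For each sequence, return the hidden state of its last non-padding token.

    hidden_states is indexed [batch][position][hidden]; attention_mask is
    [batch][position] with 1 for real tokens and 0 for padding. Works for
    left- and right-padded batches. Raises ValueError if a mask has no 1s.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(list(states[last]))
    return pooled


def truncate_and_normalize(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components, then rescale to unit L2 norm."""
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


def build_instructed_query(task: str, query: str) -> str:
    """Prepend a task description to a query (template is an assumption)."""
    return f"Instruct: {task}\nQuery:{query}"
```

In this sketch only queries receive the instruction prefix, matching the text above. Truncated vectors are renormalized so that dot products between them remain cosine similarities.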
Key capabilities
- Multilingual and cross-lingual retrieval across 100+ natural languages and multiple programming languages.
- Code retrieval with a score of 75.41 on MTEB-Code when used as a dense retriever.
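- First-stage retrieval for reranking pipelines: scores 61.82 on MTEB-R, 71.02 on CMTEB-R, 64.64 on MMTEB-R, 50.26 on MLDR, and 5.09 on FollowIR when used as the dense retriever that supplies the top-100 candidates. These are embedding-only scores; the model is not itself a reranker.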
- Apache 2.0 license – permissive for commercial and research use.
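Dense retrieval with an embedding model comes down to comparing a query vector against document vectors. The sketch below ranks documents by cosine similarity. It assumes the embeddings have already been computed and are plain lists of floats; `cosine` and `rank_documents` are illustrative names, not part of any library.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Sequence[Sequence[float]],
                   top_k: int = 3) -> List[Tuple[int, float]]:
    """Return up to top_k (document_index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

For large collections an approximate nearest-neighbor index would normally replace this linear scan. The scoring rule stays the same.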
Benchmark highlights
The 8B member of the series ranked No. 1 on the MTEB multilingual leaderboard as of June 2025 (score 70.58). The 0.6B model trades some of that accuracy for a much smaller footprint while remaining competitive on multilingual retrieval and coding tasks.
The model was trained via a multi-stage pipeline combining large-scale unsupervised pre-training with supervised fine-tuning on data synthesized by the Qwen3 LLMs. It has been downloaded over 10 million times on Hugging Face.
best for
- Multilingual text retrieval across 100+ languages
- Code retrieval for programming languages
- Text classification and clustering
FAQ
- Context length: the model supports up to 32K tokens (32,768 tokens) per input.
- Size: the model has 0.6 billion parameters.
- Languages: it supports over 100 languages, including natural and programming languages.
- License: it is released under the Apache 2.0 license, which permits commercial and research use.
- Access: call the gigarouter OpenAI-compatible endpoint with your API key, pass the input text, and select the model ID 'Qwen/Qwen3-Embedding-0.6B'.
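# OpenAI client: only base_url changes
import os
from openai import OpenAI
# GIGAROUTER_API_KEY is a placeholder variable name; read the key from wherever you store it
KEY = os.environ["GIGAROUTER_API_KEY"]
client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
v = client.embeddings.create(model="Qwen/Qwen3-Embedding-0.6B", input=["hello world"])
print(v.data[0].embedding[:4])  # first four components of the embedding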
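When embedding more than a handful of texts, it is common to send them in batches. The helper below is a minimal sketch that assumes `client` is an OpenAI-style client like the one above, exposing `client.embeddings.create(model=..., input=[...])` and returning items with `index` and `embedding` fields. The name `embed_in_batches` and the default batch size of 32 are illustrative assumptions; the endpoint's actual per-request limits are not stated here.

```python
from typing import List, Sequence


def embed_in_batches(client,
                     texts: Sequence[str],
                     model: str = "Qwen/Qwen3-Embedding-0.6B",
                     batch_size: int = 32) -> List[List[float]]:
    """Embed `texts` in fixed-size batches, preserving the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        resp = client.embeddings.create(model=model, input=batch)
        # Sort by the `index` field so each vector lines up with its input.
        for item in sorted(resp.data, key=lambda d: d.index):
            vectors.append(list(item.embedding))
    return vectors
```

Passing the client in as an argument keeps the helper free of credentials and lets it be exercised with a stub client in tests.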