Qwen3 Embedding 0.6B

Qwen/Qwen3-Embedding-0.6B

published Jun 2025 · updated Apr 2026

Qwen3 Embedding 0.6B is a text embedding model that converts text into dense vector representations for tasks like retrieval, classification, and clustering.

price

$0.008 / 1M tokens

throughput

581 embeds/s
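As a rough illustration of what the listed price and throughput imply, the sketch below estimates cost and wall-clock time for a corpus. The corpus size and average document length are hypothetical, and the page does not say what input length the 581 embeds/s figure was measured at, so the time estimate is only indicative.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.008  # price listed on this page
LISTED_EMBEDS_PER_SECOND = 581        # throughput listed on this page; input length not stated


def estimate_cost_usd(total_tokens: int) -> float:
    """Cost of embedding `total_tokens` tokens at the listed price."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


def estimate_seconds(num_inputs: int) -> float:
    """Rough wall-clock time if the listed throughput held for these inputs."""
    return num_inputs / LISTED_EMBEDS_PER_SECOND


# Hypothetical corpus: 100,000 documents averaging 500 tokens each.
docs, avg_tokens = 100_000, 500
print(f"cost: ${estimate_cost_usd(docs * avg_tokens):.2f}")
print(f"time: {estimate_seconds(docs):.0f} s")
```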

specs

Task	Text Embedding
Architecture	Dense dual-encoder transformer
Parameters	0.6B
License	Apache 2.0
Context Length	32K tokens
Embedding Dimension	Up to 1024 (user-defined from 32 to 1024)

about this model

Qwen3-Embedding-0.6B is a text embedding model that converts text into dense vector representations for retrieval, classification, clustering, and bitext mining. Built on the Qwen3 foundation, it inherits strong multilingual and reasoning capabilities, supporting over 100 natural and programming languages and a context length of 32K tokens (32,768).

The model uses a dual-encoder architecture and takes the embedding from the hidden state of the final token. It supports user-defined output dimensions from 32 to 1024 (Matryoshka Representation Learning), so a shorter vector can be used where storage or latency matters. It is also instruction-aware: prepending a task-specific instruction to queries typically improves retrieval by 1–5%.
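The two features in the paragraph above can be sketched client-side. The `Instruct: ...\nQuery:` template mirrors the one shown in the upstream Hugging Face model card and should be checked against it. The helper names `format_query` and `truncate_embedding` are hypothetical. Truncating and renormalizing is the usual way to shorten Matryoshka-trained vectors; whether the hosted endpoint also accepts a dimension parameter is not stated here.

```python
import math


def format_query(task: str, query: str) -> str:
    """Prepend a one-sentence task instruction to a query.

    Assumption: only queries get an instruction; documents are embedded as-is.
    """
    return f"Instruct: {task}\nQuery:{query}"


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 32 <= dim <= len(vec):
        raise ValueError("dim must be between 32 and the full embedding size")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


task = "Given a web search query, retrieve relevant passages that answer the query"
print(format_query(task, "What is the capital of China?"))

full = [0.5] * 1024               # stand-in for a real 1024-dimensional embedding
short = truncate_embedding(full, 256)
print(len(short))                 # 256
```

Queries and documents should be truncated to the same dimension before their similarity is compared.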

Key capabilities

Multilingual and cross-lingual retrieval across 100+ natural languages and multiple programming languages.
Code retrieval with a score of 75.41 on MTEB-Code when used as a dense retriever.
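First-stage retrieval scores of 61.82 on MTEB-R, 71.02 on CMTEB-R, 64.64 on MMTEB-R, 50.26 on MLDR, and 5.09 on FollowIR. These are the embedding model's own results as the dense retriever that supplies the top-100 candidates in the Qwen3 reranker evaluation, not reranking scores.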
Apache 2.0 license – permissive for commercial and research use.

Diagram of Qwen3 Embedding model series highlighting sizes and features

Benchmark highlights

The 8B member of the series ranks No. 1 on the MTEB multilingual leaderboard (score 70.58 as of June 2025). The 0.6B model delivers a competitive balance of efficiency and accuracy across multilingual retrieval and coding tasks.

The model was trained via a multi-stage pipeline combining large-scale unsupervised pre-training with supervised fine-tuning on data synthesized by the Qwen3 LLMs. It has been downloaded over 10 million times on Hugging Face.

best for

· Multilingual text retrieval across 100+ languages
· Code retrieval for programming languages
· Text classification and clustering
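For the retrieval use cases listed above, embeddings are typically compared with cosine similarity. The sketch below ranks documents against a query; the toy 3-dimensional vectors stand in for real embeddings, and `rank_documents` is a hypothetical helper rather than part of any API.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query_vec: list[float], doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors in place of real query and document embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
for index, score in rank_documents(query, docs):
    print(index, round(score, 3))
```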

FAQ

What is the context length of Qwen3 Embedding 0.6B?

The model supports up to 32K tokens (32,768 tokens) per input.
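Inputs longer than that limit need to be split before embedding. A minimal sketch, assuming the text has already been converted to token IDs with the model's tokenizer; `split_into_windows` and its `overlap` parameter are hypothetical, and how the hosted endpoint handles over-long inputs is not stated here.

```python
MAX_TOKENS = 32_768  # per-input limit stated above


def split_into_windows(token_ids: list[int], max_tokens: int = MAX_TOKENS,
                       overlap: int = 0) -> list[list[int]]:
    """Split a token sequence into windows of at most `max_tokens` tokens.

    Consecutive windows share `overlap` tokens so context is not cut abruptly.
    """
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    step = max_tokens - overlap
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # this window already reaches the end of the sequence
        start += step
    return windows


# A stand-in sequence of 70,000 token IDs.
windows = split_into_windows(list(range(70_000)))
print([len(w) for w in windows])
```

Each window can then be embedded separately, and the resulting vectors stored per chunk or averaged, depending on the application.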

How many parameters does this model have?

It has 0.6 billion parameters.

What languages does Qwen3 Embedding 0.6B support?

It supports over 100 languages, including natural and programming languages.

Is the model free to use under an open license?

Yes, it is released under the Apache 2.0 license.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key, passing your input text and the model ID 'Qwen/Qwen3-Embedding-0.6B'.

call it

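# OpenAI client - just change base_url
import os
from openai import OpenAI

# Assumption: the API key is stored in the GIGAROUTER_API_KEY environment variable (name is illustrative)
KEY = os.environ["GIGAROUTER_API_KEY"]
client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
v = client.embeddings.create(model="Qwen/Qwen3-Embedding-0.6B", input=["hello world"])
print(v.data[0].embedding[:4])  # first four components of the vector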
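For larger inputs, the same call can be wrapped in a batching helper. This is a sketch under stated assumptions: `embed_all` and `batched` are hypothetical helpers, the batch size of 64 is arbitrary (the endpoint's per-request limit is not stated here), and `client` is expected to be an OpenAI-style client like the one above.

```python
from typing import Iterator, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(client, texts: Sequence[str],
              model: str = "Qwen/Qwen3-Embedding-0.6B",
              batch_size: int = 64) -> list[list[float]]:
    """Embed `texts` in batches and return vectors in input order."""
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        response = client.embeddings.create(model=model, input=list(batch))
        # Sort by the index field so each vector lines up with its input text.
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors
```

With the client from the example above, `embed_all(client, ["first text", "second text"])` would return one vector per input string.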
