gigarouter

Qwen3 Embedding 0.6B

Qwen/Qwen3-Embedding-0.6B

published Jun 2025 · updated Apr 2026

Qwen3 Embedding 0.6B is a text embedding model that converts text into dense vector representations for tasks like retrieval, classification, and clustering.

price: $0.008 / 1M tokens
throughput: 581 embeds/s
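As a rough budgeting aid, the listed price and throughput can be turned into a back-of-the-envelope estimate. The sketch below assumes the listed throughput is sustained for the whole job, which the page does not guarantee; the function names and the example corpus are hypothetical.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.008  # listed price per 1M tokens
THROUGHPUT_EMBEDS_PER_SEC = 581       # listed throughput


def estimate_cost_usd(total_tokens: int) -> float:
    """Cost of embedding `total_tokens` tokens at the listed price."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


def estimate_seconds(num_inputs: int) -> float:
    """Wall-clock estimate, assuming the listed throughput is sustained."""
    return num_inputs / THROUGHPUT_EMBEDS_PER_SEC


if __name__ == "__main__":
    # Hypothetical corpus: 1,000,000 documents averaging 500 tokens each.
    docs, avg_tokens = 1_000_000, 500
    print(f"cost: ${estimate_cost_usd(docs * avg_tokens):.2f}")
    print(f"time: {estimate_seconds(docs) / 60:.1f} min")
```

For this hypothetical corpus (500M tokens in total) the estimate comes to about $4.00 and a little under half an hour of embedding time.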

specs

Task: Text Embedding
Architecture: Dense dual-encoder transformer
Parameters: 0.6B
License: Apache 2.0
Context Length: 32K tokens
Embedding Dimension: Up to 1024 (user-defined from 32 to 1024)

about this model

Qwen3-Embedding-0.6B is a text embedding model that converts text into dense vector representations for retrieval, classification, clustering, and bitext mining. Built on the Qwen3 foundation, it inherits strong multilingual and reasoning capabilities, supporting over 100 languages and a context length of 32K tokens.

The model uses a dual-encoder architecture and takes each embedding from the hidden state of the final token. Through Matryoshka Representation Learning (MRL), it supports user-defined output dimensions from 32 to 1024. It is also instruction-aware: developers can prepend a task-specific instruction to each query, which typically improves retrieval by 1–5%.
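The two features above can be sketched in a few lines. This is a minimal illustration, not the hosted implementation: `format_query` follows the instruction template shown in the upstream Qwen3-Embedding model card (treat the exact wording as an assumption and check the card), and `truncate_embedding` shortens a vector client-side, since this page does not say whether the endpoint accepts a server-side dimension setting. Both function names are hypothetical.

```python
import math


def format_query(task: str, query: str) -> str:
    """Prepend a task instruction to a query; documents are embedded as-is."""
    # Template assumed from the upstream model card.
    return f"Instruct: {task}\nQuery:{query}"


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if not 32 <= dim <= 1024:
        raise ValueError("dim must be between 32 and 1024")
    if dim > len(vec):
        raise ValueError("dim exceeds the embedding length")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


if __name__ == "__main__":
    task = "Given a web search query, retrieve relevant passages that answer the query"
    print(format_query(task, "What is the capital of China?"))

    full = [1.0] * 1024  # stand-in for a real 1024-dimensional embedding
    small = truncate_embedding(full, 256)
    print(len(small))
```

Renormalizing after truncation keeps cosine and dot-product scores comparable across vectors of the same reduced size. Smaller dimensions cut storage and search cost, usually at some loss of accuracy.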

Key capabilities

  • Multilingual and cross-lingual retrieval across 100+ natural languages and multiple programming languages.
  • Code retrieval with a score of 75.41 on MTEB-Code when used as a dense retriever.
  • Dense retrieval scores of 61.82 on MTEB-R, 71.02 on CMTEB-R, 64.64 on MMTEB-R, 50.26 on MLDR, and 5.09 on FollowIR, measured as the first-stage retriever that supplies the top-100 candidates for reranking.
  • Apache 2.0 license – permissive for commercial and research use.

[Figure: Qwen3 Embedding model series, highlighting sizes and features]

Benchmark highlights

As of June 2025, the 8B member of the series ranked No. 1 on the MTEB multilingual leaderboard with a score of 70.58. The 0.6B model balances efficiency and accuracy across multilingual retrieval and coding tasks.

The model was trained via a multi-stage pipeline combining large-scale unsupervised pre-training with supervised fine-tuning on data synthesized by the Qwen3 LLMs. It has been downloaded over 10 million times on Hugging Face.

FAQ

What is the context length of Qwen3 Embedding 0.6B?

The model supports up to 32K tokens (32,768 tokens) per input.

How many parameters does this model have?

It has 0.6 billion parameters.

What languages does Qwen3 Embedding 0.6B support?

It supports over 100 languages, including natural and programming languages.

Is the model free to use under an open license?

Yes, it is released under the Apache 2.0 license.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key, passing your input text and the model ID 'Qwen/Qwen3-Embedding-0.6B'.

call it
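# OpenAI client - just change base_url
import os
from openai import OpenAI

# GIGAROUTER_API_KEY is an assumed variable name; read the key from wherever you store it
KEY = os.environ["GIGAROUTER_API_KEY"]
client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
v = client.embeddings.create(model="Qwen/Qwen3-Embedding-0.6B", input=["hello world"])
print(v.data[0].embedding[:4])  # first four components of the embedding vector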
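
Once the endpoint has returned vectors, retrieval typically comes down to ranking documents by cosine similarity to the query. The sketch below is one way to do that in plain Python; the helper names `cosine` and `rank` are hypothetical, and toy 3-dimensional vectors stand in for real embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec: list[float], doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; in practice these come from client.embeddings.create(...)
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print([i for i, _ in rank(query, docs)])  # prints [1, 2, 0]
```

For large collections a vector index is the usual choice; a linear scan like this is enough for small corpora and for checking results by hand.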
