Qwen3 Reranker 0.6B

Qwen/Qwen3-Reranker-0.6B

published May 2025 · updated Apr 2026

Qwen3 Reranker 0.6B is a multilingual reranking model that reorders documents based on relevance to a query, supporting over 100 languages and a 32k context length.

price

$0.008 / 1k docs

throughput

660 docs/s

specs

Task	Text Reranking
Architecture	Dense Transformer (Qwen3)
Parameters	0.6B
Context Length	32k tokens
Supported Languages	100+ languages
License	Apache 2.0

about this model

Qwen3-Reranker-0.6B is a multilingual text reranking model that reorders candidate documents by relevance to a given query. It supports over 100 languages and a context length of 32K tokens. It is part of the Qwen3 Embedding series, built on the Qwen3 dense foundation models and designed for text ranking tasks. The model is instruction-aware: a user-defined instruction can tailor its behavior to a specific task, language, or domain. On most downstream tasks, supplying an instruction typically yields a 1% to 5% improvement over supplying none.
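To make the instruction-aware behavior concrete, here is a minimal sketch of how an instruction, a query, and a document might be combined into one input string. The `<Instruct>` / `<Query>` / `<Document>` template and the default instruction are recalled from the upstream model card and should be treated as assumptions to verify; `format_pair` is a hypothetical helper name. A hosted rerank endpoint may apply this formatting for you.

```python
from typing import Optional

# Assumption: default instruction and template as recalled from the upstream
# model card; verify both before relying on them.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_pair(query: str, document: str, instruction: Optional[str] = None) -> str:
    """Build the input text for one (query, document) pair.

    A custom instruction replaces the default one; this is the hook for
    task-, language-, or domain-specific behavior.
    """
    task = instruction if instruction is not None else DEFAULT_INSTRUCTION
    return f"<Instruct>: {task}\n<Query>: {query}\n<Document>: {document}"


if __name__ == "__main__":
    print(format_pair(
        "capital of France",
        "Paris is the capital of France.",
        instruction="Given a geography question, retrieve passages that answer it",
    ))
```

Writing the instruction in English is generally a safe starting point, even for non-English queries, but this is worth checking against your own evaluation data.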

Performance Benchmarks

Evaluated using top-100 candidates retrieved by Qwen3-Embedding-0.6B, the model achieves the following scores:

MTEB-R: 65.80
CMTEB-R (Chinese): 71.31
MMTEB-R (Multilingual): 66.36
MLDR: 67.28
MTEB-Code: 73.42
FollowIR: 5.41
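The evaluation setup above is a two-stage pipeline: a retriever proposes candidates (here, the top 100), and the reranker rescores and reorders them. The sketch below shows only the second stage, under stated assumptions: `rerank_candidates` and `overlap_score` are hypothetical names, and `overlap_score` is a toy word-overlap scorer standing in for a real call to the reranker.

```python
from typing import Callable, List, Tuple


def rerank_candidates(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and return the top_n, best first."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # sorted() is stable, so candidates with equal scores keep retriever order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


def overlap_score(query: str, document: str) -> float:
    """Toy stand-in scorer: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


if __name__ == "__main__":
    retrieved = ["Bananas are yellow.", "Paris is the capital of France."]
    for doc, score in rerank_candidates("capital of France", retrieved, overlap_score):
        print(f"{score:.2f}  {doc}")
```

In practice `score_fn` would wrap the reranker itself, and the candidate list would come from an embedding-based retriever such as Qwen3-Embedding-0.6B.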

Comparison with Other Rerankers

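On the same evaluation setup, Qwen3-Reranker-0.6B scores higher than the comparable models on every benchmark except FollowIR, where gte-multilingual-reranker-base leads (5.77 vs. 5.41):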

Model	MTEB-R	CMTEB-R	MMTEB-R	MLDR	MTEB-Code	FollowIR
Qwen3-Reranker-0.6B	65.80	71.31	66.36	67.28	73.42	5.41
Jina-multilingual-reranker-v2-base	62.09	66.31	61.54	61.96	64.11	4.57
gte-multilingual-reranker-base	63.32	68.73	63.50	65.17	65.32	5.77
BGE-reranker-v2-m3	63.53	65.56	62.34	61.25	65.08	3.64

Training and Availability

Training uses a multi-stage pipeline combining large-scale unsupervised pre-training, supervised fine-tuning on high-quality data, and model merging. The Qwen3 LLMs were also employed to synthesize training data. The model is released under the Apache 2.0 license. For further details, see the technical report (arXiv:2506.05176) and the blog post.

best for

· Reordering search results for multilingual web search
· Improving document retrieval in RAG pipelines across 100+ languages
· Code retrieval: ranking code snippets relevant to a natural language query

FAQ

What is Qwen3 Reranker 0.6B best for?

It is best for reordering documents in multilingual retrieval tasks, including web search, RAG pipelines, and code retrieval, supporting over 100 languages.

How does its size affect speed?

With 0.6B parameters, it needs less compute per document than the larger 4B and 8B variants, so it runs faster and suits latency-sensitive applications. It still supports the same 32k-token context length.

What is the license for Qwen3 Reranker 0.6B?

It is released under the Apache 2.0 license.

What is the input and output format?

Input is a query and a list of documents. Output is a relevance score (e.g., logit or probability) for each document, indicating relevance to the query.
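As a rough illustration of how a logit becomes a probability: the upstream model card, as I recall it, derives the score from the model's logits for the tokens "yes" and "no" and normalizes over the two. The sketch below shows only that normalization step; the function name is hypothetical, and whether a hosted API returns logits or probabilities is something to confirm in its response.

```python
import math


def relevance_probability(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax: probability mass assigned to "yes" (relevant)."""
    # Subtract the larger logit first so exp() cannot overflow.
    peak = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - peak)
    exp_no = math.exp(no_logit - peak)
    return exp_yes / (exp_yes + exp_no)


if __name__ == "__main__":
    # Equal logits mean the model is undecided.
    print(relevance_probability(0.0, 0.0))
    # A large gap in favor of "yes" pushes the score toward 1.
    print(relevance_probability(4.0, -4.0))
```

Because the two probabilities sum to 1, the score depends only on the difference between the two logits, so it can be compared across documents for the same query.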

How do I call this model via the gigarouter API?

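Use the gigarouter OpenAI-compatible API with your API key. Send a POST request to the rerank endpoint (https://gigarouter.ai/v1/rerank) with the key as a Bearer token in the Authorization header and a JSON body containing the model, query, and documents fields.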

call it

# rerank documents by relevance; billed per document
# $GR_KEY holds your gigarouter API key
curl https://gigarouter.ai/v1/rerank \
  -H "Authorization: Bearer $GR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen/Qwen3-Reranker-0.6B","query":"capital of France",
       "documents":["Paris is the capital of France.","Bananas are yellow."]}'
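The same request can be sketched in Python using only the standard library. The URL, model ID, and field names are taken from the curl example above; the helper names are hypothetical, the `Content-Type` header is an assumption about what the endpoint expects, and the response schema is not documented here, so the sketch returns the parsed JSON as-is rather than guessing at its fields.

```python
import json
import urllib.request
from typing import Any, List

RERANK_URL = "https://gigarouter.ai/v1/rerank"  # from the curl example
MODEL_ID = "Qwen/Qwen3-Reranker-0.6B"


def build_rerank_request(api_key: str, query: str, documents: List[str]) -> urllib.request.Request:
    """Assemble the POST request without sending it."""
    body = json.dumps({"model": MODEL_ID, "query": query, "documents": documents})
    return urllib.request.Request(
        RERANK_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON body expected
        },
        method="POST",
    )


def call_rerank(api_key: str, query: str, documents: List[str]) -> Any:
    """Send the request and return the parsed JSON response unchanged."""
    request = build_rerank_request(api_key, query, documents)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `call_rerank(os.environ["GR_KEY"], "capital of France", [...])` would send the request; inspect the returned object to see how scores are keyed before building on it.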
