Qwen3 Reranker 0.6B

tomaarsen/Qwen3-Reranker-0.6B-seq-cls

published Jun 2025 · updated Jul 2025

Qwen3 Reranker 0.6B is a rerank model that scores query-document pairs for relevance, supporting over 100 languages and a 32k context length.

est. price

~$0.008 / 1k docs · estimated, set at launch

API providers

downloads / mo

262.5K

license

apache-2.0

specs

Task	Text Reranking
Architecture	Qwen3 Dense Transformer
Parameters	0.6B
Context Length	32K tokens
Languages	100+ languages

about this model

Qwen3-Reranker-0.6B-seq-cls is a text reranking model that evaluates the relevance of a document to a given query, producing a score between 0 and 1. It is a sequence classification adaptation of the Qwen3-Reranker-0.6B model, part of the Qwen3 Embedding series built on the dense foundational Qwen3 models.

The model supports over 100 languages and has a context length of 32K tokens. It is instruction-aware, meaning custom instructions can be prepended to queries to improve performance for specific tasks, languages, or scenarios. The original Qwen3 reranking model family excels in text retrieval scenarios; the 8B embedding variant ranks No.1 on the MTEB multilingual leaderboard (score 70.58 as of June 5, 2025).

Key strengths include:

·Multilingual capability: Supports 100+ natural languages and programming languages, enabling robust cross-lingual and code retrieval.
·Instruction-aware scoring: User-defined instructions can improve relevance judgment by an estimated 1% to 5% over no-instruction baselines.
·Efficient 0.6B parameter size: Suitable for latency-sensitive applications while maintaining strong reranking quality.
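As an illustration of instruction-aware input, here is a minimal sketch of how an instruction, query, and document might be combined into one text before scoring. The `<Instruct>/<Query>/<Document>` layout and the default instruction are assumptions modeled on the upstream Qwen3-Reranker convention, and `format_pair` is a hypothetical helper; check the model card for the exact template (including any chat-template wrapper, which is omitted here).

```python
from typing import Optional

# Assumed default instruction; replace it with a task-specific one where useful.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_pair(query: str, document: str, instruction: Optional[str] = None) -> str:
    """Build the text for one query-document pair (hypothetical helper).

    The field layout below is an assumption, not a confirmed specification.
    """
    instruction = instruction or DEFAULT_INSTRUCTION
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}"


# A task-specific instruction, for example for legal search.
text = format_pair(
    query="notice period for terminating a lease",
    document="A tenant must give at least 30 days written notice.",
    instruction="Given a legal question, retrieve statutes that answer it",
)
```

Writing the instruction in English is a common choice even for non-English queries, but that is a rule of thumb to validate on your own data.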

gigarouter plans to host the model as a managed, OpenAI-compatible API, so no local installation or infrastructure will be required; the hosted endpoint is not yet live.

best for

·Re-ranking search results for multilingual web search
·Improving relevance in retrieval-augmented generation (RAG) pipelines
·Cross-lingual document ranking for enterprise search
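In a search or RAG pipeline, the reranking step usually means scoring every candidate against the query, sorting by score, and keeping the top few. The sketch below shows that step only. `rerank` and `toy_score` are hypothetical names, and `toy_score` is a word-overlap stand-in for the real model call, used so the example runs without any model or network access.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each document against the query and return the top_k by score."""
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    # Highest relevance first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_score(query: str, document: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


candidates = [
    "Best pasta recipes",
    "Router firmware update guide",
    "How to reset a router password",
]
top = rerank("reset router password", candidates, toy_score, top_k=2)
```

In practice `score_fn` would wrap the reranker, and the retained documents would be passed to the generator as context.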

FAQ

What is this model best for?

It is best for re-ranking query-document pairs by relevance, particularly in multilingual and cross-lingual search scenarios.

How does it compare in size and speed to larger rerank models?

With 0.6B parameters, it is lightweight and fast, making it suitable for efficient deployment while maintaining strong performance.

What is the input/output format?

Input: a query and document pair with an optional instruction. Output: a relevance score between 0 and 1 obtained by applying sigmoid to the sequence classification logits.
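The final step described here, turning a classification logit into a score, can be sketched as follows. This assumes the classifier emits one raw logit per query-document pair; the logit values are invented for illustration and are not model output.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw classification logit to a relevance score between 0 and 1."""
    # Branch on the sign so exp() is only called on non-positive values,
    # which avoids overflow for logits of large magnitude.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical logits for three query-document pairs (illustrative values only).
example_logits = [4.2, 0.0, -3.1]
example_scores = [sigmoid(x) for x in example_logits]
```

A logit of 0 maps to exactly 0.5, positive logits map above it, and negative logits below it, so the ordering of documents by score matches their ordering by logit.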

How can I call this model via the API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with your API key and send query-document pairs in the required format.

Does it support long context documents?

Yes, it supports a context length of up to 32K tokens.

not yet live

We're benchmarking and onboarding Qwen3 Reranker 0.6B as a hosted, OpenAI-compatible API.

related reranker models

·ms-marco-MiniLM-L6-v2: 81.5M dl/mo, live
·ms-marco-MiniLM-L4-v2: 4.8M dl/mo
·gte-reranker-modernbert-base: 2.7M dl/mo
·ms-marco-MiniLM-L12-v2: 2.3M dl/mo
·jina-reranker-v2-base-multilingual: 1.8M dl/mo, live
·Qwen3-Reranker-4B: 1.8M dl/mo