Qwen3 Reranker 0.6B
tomaarsen/Qwen3-Reranker-0.6B-seq-cls
published Jun 2025 · updated Jul 2025
Qwen3 Reranker 0.6B is a reranking model that scores query-document pairs for relevance, supporting over 100 languages and a 32K-token context length.
specs
| Task | Text Reranking |
| Architecture | Qwen3 Dense Transformer |
| Parameters | 0.6B |
| Context Length | 32K tokens |
| Languages | 100+ languages |
about this model
Qwen3-Reranker-0.6B-seq-cls is a text reranking model that evaluates the relevance of a document to a given query, producing a score between 0 and 1. It is a sequence classification adaptation of the Qwen3-Reranker-0.6B model, part of the Qwen3 Embedding series built on the dense foundational Qwen3 models.
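The 0-to-1 score comes from applying a sigmoid to the sequence classification logits. The sketch below shows that mapping and a best-first sort. It assumes the model emits one logit per query-document pair. The function names and the logit values are hypothetical, chosen only for illustration, and are not real model output.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to the 0-1 range without overflowing on large magnitudes."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def rank_by_relevance(documents, logits):
    """Pair each document with its sigmoid score and sort best-first."""
    if len(documents) != len(logits):
        raise ValueError("documents and logits must have the same length")
    scored = [(doc, sigmoid(logit)) for doc, logit in zip(documents, logits)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical logits, made up for illustration (not real model output).
docs = ["doc about Paris", "doc about cooking", "doc about France"]
example_logits = [4.2, -3.1, 0.0]
ranked = rank_by_relevance(docs, example_logits)

for doc, score in ranked:
    print(f"{score:.3f}  {doc}")
```

Because the sigmoid is monotonic, sorting by score gives the same order as sorting by raw logit. The score is mainly useful when a fixed cutoff is needed, for example dropping every document below a chosen threshold.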
The model supports over 100 languages and has a context length of 32K tokens. It is instruction-aware, meaning custom instructions can be prepended to queries to improve performance for specific tasks, languages, or scenarios. The Qwen3 Embedding series it derives from performs strongly in text retrieval scenarios. Note that the headline benchmark result belongs to a different model in that series: the 8B embedding variant ranked No. 1 on the MTEB multilingual leaderboard (score 70.58 as of June 5, 2025).
Key strengths include:
- Multilingual capability: Supports 100+ languages, including programming languages, enabling robust cross-lingual and code retrieval.
- Instruction-aware scoring: User-defined instructions can improve relevance judgment by an estimated 1% to 5% over no-instruction baselines.
- Efficient 0.6B parameter size: Suitable for latency-sensitive applications while maintaining strong reranking quality.
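As a rough illustration of instruction-aware scoring, the sketch below builds the text for one query-document pair with an optional task instruction. The `<Instruct>`, `<Query>`, and `<Document>` tags and the default instruction follow the layout published for the upstream Qwen3-Reranker models. The chat-template wrapper those models also use is omitted here, and whether the hosted API applies this formatting for you is an assumption to verify. `format_pair` is a hypothetical helper name.

```python
from typing import Optional

# Default task description, assumed from the upstream Qwen3-Reranker usage notes.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_pair(query: str, document: str, instruction: Optional[str] = None) -> str:
    """Build the scored text for one query-document pair (hypothetical helper)."""
    task = instruction if instruction else DEFAULT_INSTRUCTION
    return f"<Instruct>: {task}\n<Query>: {query}\n<Document>: {document}"


# A task-specific instruction replaces the generic default.
text = format_pair(
    query="How do I reverse a list in Python?",
    document="Use list.reverse() to reverse a list in place.",
    instruction="Given a programming question, retrieve passages that answer it",
)
print(text)
```

Keeping the instruction as a separate argument makes it easy to compare a task-specific instruction against the default on your own evaluation data before committing to one.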
The model is hosted as a managed, OpenAI-compatible API on gigarouter, requiring no local installation or infrastructure.
best for
- Re-ranking search results for multilingual web search
- Improving relevance in retrieval-augmented generation (RAG) pipelines
- Cross-lingual document ranking for enterprise search
FAQ
Qwen3 Reranker 0.6B is best for re-ranking query-document pairs by relevance, particularly in multilingual and cross-lingual search scenarios.
With 0.6B parameters, the model is lightweight and fast, making it suitable for efficient deployment while maintaining strong performance.
Input is a query-document pair with an optional instruction. Output is a relevance score between 0 and 1, obtained by applying a sigmoid to the sequence classification logits.
To call the model, use the gigarouter OpenAI-compatible endpoint with your API key, sending query-document pairs in the required format.
Long inputs are supported, up to a context length of 32K tokens.
We're benchmarking and onboarding Qwen3 Reranker 0.6B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.