Qwen3 Reranker 8B
Qwen/Qwen3-Reranker-8B
published May 2025 · updated Apr 2026
Qwen3 Reranker 8B is a cross-encoder model that reranks documents by their relevance to a query. It supports over 100 languages and a 32k-token context length.
specs
| Spec | Value |
|---|---|
| Task | Text Reranking |
| Architecture | Cross-encoder |
| Parameters | 8B |
| License | Apache 2.0 |
| Context Length | 32k tokens |
| Scoring Mechanism | Logit differences between yes and no tokens |
about this model
Qwen3-Reranker-8B is a cross-encoder reranking model: it takes a query-document pair as input and outputs a relevance score. It is built on the Qwen3 foundation model, with 8 billion parameters and a 32k-token context window.
Capabilities and Strengths
The model supports over 100 natural and programming languages, inheriting the multilingual and reasoning abilities of the Qwen3 series. It is instruction-aware: developers can supply a custom instruction to tailor scoring to a specific task, language, or scenario. The raw score is the logit difference between the “yes” and “no” tokens; passing it through a sigmoid function yields a probability between 0 and 1.
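To make the scoring step concrete, here is a minimal sketch of turning a yes/no logit pair into a 0-1 probability. The function names and the example logit values are illustrative assumptions, not part of any official API; the only claim taken from the text above is that the score is the sigmoid of the logit difference.

```python
import math


def relevance_probability(yes_logit: float, no_logit: float) -> float:
    """Map the yes/no logit difference to a 0-1 relevance probability."""
    diff = yes_logit - no_logit
    # Numerically stable sigmoid: never call exp() on a large positive value.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def softmax_yes(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax restricted to the yes/no tokens (for comparison)."""
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


# Hypothetical logits for one query-document pair.
p = relevance_probability(2.0, -1.0)
# The sigmoid of the difference matches a softmax over the two tokens.
assert abs(p - softmax_yes(2.0, -1.0)) < 1e-12
```

Because the sigmoid of a logit difference equals a two-way softmax over the same pair, either formulation should give the same probability.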
The reranker is trained via a multi-stage pipeline combining large-scale contrastive pre-training, multi-task learning, and instruction tuning. It uses LoRA fine-tuning to preserve base model capabilities.
Benchmark Performance
The following table presents evaluation results from the model card, based on top-100 candidates retrieved by Qwen3-Embedding-0.6B. Scores are reported across multilingual and code retrieval subsets.
| Model | Param | MTEB-R | CMTEB-R | MMTEB-R | MLDR | MTEB-Code | FollowIR |
|---|---|---|---|---|---|---|---|
| Qwen3-Reranker-8B | 8B | 69.02 | 77.45 | 72.94 | 70.19 | 81.22 | 8.05 |
| Qwen3-Reranker-4B | 4B | 69.76 | 75.94 | 72.74 | 69.97 | 81.20 | 14.84 |
| Qwen3-Reranker-0.6B | 0.6B | 65.80 | 71.31 | 66.36 | 67.28 | 73.42 | 5.41 |
Released under the Apache 2.0 license, this model is hosted by gigarouter as a managed API, enabling developers to integrate reranking without managing infrastructure.
best for
- Multilingual search reranking for enterprise retrieval systems
- Code retrieval ranking in RAG pipelines
- Cross-lingual document ranking for global search
- Re-ranking top candidates from dense retrieval for high-accuracy QA
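The use cases above share one pattern: a first-stage retriever returns candidates, and the reranker scores each query-document pair and reorders them. A minimal sketch of that loop follows. `rerank` and `toy_score` are hypothetical names, and `toy_score` is a word-overlap stand-in so the example runs without the model; in practice the scoring function would call the reranker.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, document) pair and return the top_n by score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Highest score first; Python's sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_score(query: str, doc: str) -> float:
    """Stand-in scorer (NOT the model): fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


candidates = [
    "Berlin is in Germany",
    "France has a capital city",
    "Paris is the capital of France",
]
top = rerank("capital of france", candidates, toy_score, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

A cross-encoder scores each pair independently, so cost grows linearly with the number of candidates. This is why reranking is usually limited to a short candidate list, such as the top 100 used in the benchmark setup above.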
FAQ
The model accepts query-document pairs as text, formatted with an instruction. Requests go through the gigarouter OpenAI-compatible endpoint and require an API key.
The model outputs the raw logit difference between the yes and no tokens. Pass it through a sigmoid function to get a 0-1 probability.
The maximum input length is 32k tokens per query-document pair.
The model supports over 100 languages, including major global languages and programming languages.
The license is Apache 2.0.
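As a sketch of what "formatted with instructions" in the FAQ can look like, the helper below assembles one query-document pair into a single prompt string. The `<Instruct>` / `<Query>` / `<Document>` layout and the default instruction are assumptions modeled on the usage example in the upstream Qwen3-Reranker model card, and `format_pair` is a hypothetical helper. Check the exact template against the model card, and note that a hosted endpoint may apply its own formatting.

```python
from typing import Optional

# Assumed default instruction; replace it with a task-specific one as needed.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_pair(query: str, document: str, instruction: Optional[str] = None) -> str:
    """Build the text for one query-document pair (assumed template)."""
    instruction = instruction or DEFAULT_INSTRUCTION
    return (
        f"<Instruct>: {instruction}\n"
        f"<Query>: {query}\n"
        f"<Document>: {document}"
    )


# A custom instruction tailors scoring to a specific task, here code search.
prompt = format_pair(
    query="how to reverse a list in python",
    document="Use list.reverse() to reverse a list in place.",
    instruction="Given a programming question, retrieve code documentation that answers it",
)
print(prompt)
```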
We're benchmarking and onboarding Qwen3 Reranker 8B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.