Qwen3 Reranker 4B
Qwen/Qwen3-Reranker-4B
published Jun 2025 · updated Apr 2026
Qwen3 Reranker 4B is a cross-encoder model that reranks documents by their relevance to a query.
specs
| Spec | Value |
|---|---|
| Task | Text Reranking |
| Architecture | Cross-encoder built on Qwen3 foundation model |
| Parameters | 4B |
| Context Length | 32k |
| License | Apache 2.0 |
about this model
Qwen3-Reranker-4B is a cross-encoder text reranking model that scores query-document pairs to reorder search results, built on the Qwen3 foundation model. It uses LoRA fine-tuning to preserve the base model’s multilingual understanding and reasoning abilities.
Capabilities
The model supports over 100 languages, including programming languages, and handles a context length of 32k tokens. It is instruction-aware: developers can supply a custom task-specific instruction (e.g., "Classify whether the document matches the query topic"), which typically improves performance by 1–5%. By default, scores are raw logit differences; applying a sigmoid maps them to probabilities between 0 and 1.
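The score conversion mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not the model's own post-processing code: the function names and the sample scores are hypothetical, and the only thing taken from the text is that a raw logit difference is passed through a sigmoid.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relevance_probability(logit_diff: float) -> float:
    """Map a raw logit-difference score to a value in (0, 1)."""
    return sigmoid(logit_diff)


# Illustrative raw scores only; these are not real model outputs.
raw_scores = [4.2, 0.0, -3.1]
probs = [relevance_probability(s) for s in raw_scores]
```

Because the sigmoid is monotonic, it does not change the ordering of documents. It matters mainly when a fixed cutoff (for example, keeping only documents above 0.5) is applied to the scores.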
Benchmark Results
On standard reranking benchmarks, Qwen3-Reranker-4B achieves the following scores:
| Benchmark | Score |
|---|---|
| MTEB-Retrieval (MTEB-R) | 69.76 |
| CMTEB-Retrieval (CMTEB-R) | 75.94 |
| MMTEB-Retrieval (MMTEB-R) | 72.74 |
| MLDR | 69.97 |
| MTEB-Code | 81.20 |
| FollowIR | 14.84 |
It outperforms smaller reranking models (e.g., Jina-multilingual-reranker-v2-base, gte-multilingual-reranker-base, BGE-reranker-v2-m3) across all reported benchmarks by a substantial margin. The model series is released under the Apache 2.0 license.
Model Detail
The 4B-parameter reranker is part of a full spectrum of sizes (0.6B, 4B, 8B) available for both embedding and reranking tasks. For further technical details, refer to the Qwen3 Embedding paper and the official blog.
best for
- Re-ranking search results in multilingual retrieval
- Improving document relevance scoring for RAG systems
- Code retrieval ranking for developer tools
- Cross-lingual document ranking for enterprise search
FAQ
The model accepts pairs of query and document text. You can use Sentence Transformers CrossEncoder with lists of pairs, or tokenize pairs for use with Transformers or vLLM.
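The pair-based flow can be sketched as follows: build one (query, document) pair per candidate, score the pairs, and sort by score. The `rerank` helper and the `toy_overlap_scorer` stand-in are hypothetical names used only for illustration. In a real setup the scorer would presumably be something like the `predict` method of a Sentence Transformers `CrossEncoder` loaded with `Qwen/Qwen3-Reranker-4B`, which takes a list of pairs and returns one score per pair.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score (query, document) pairs and return documents sorted best-first."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    if len(scores) != len(documents):
        raise ValueError("scorer must return one score per pair")
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k] if top_k is not None else ranked


def toy_overlap_scorer(pairs: List[Pair]) -> List[float]:
    """Stand-in scorer (NOT the model): counts shared lowercase words."""
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]


docs = [
    "Bananas are rich in potassium.",
    "The capital of France is Paris.",
    "Paris hosts the Louvre museum.",
]
results = rerank("what is the capital of France", docs, toy_overlap_scorer)
```

Passing the scorer in as a callable keeps the sorting logic independent of the backend, so the same helper would work whether scores come from Sentence Transformers, Transformers, or vLLM.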
The model is instruction-aware. By default it uses a web search query instruction, but you can provide a custom prompt for different tasks via the prompt parameter in CrossEncoder or the format_instruction function.
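A `format_instruction` helper along these lines shows how a default instruction can be swapped for a custom one. This is a sketch: the exact tag layout (`<Instruct>`, `<Query>`, `<Document>`) and the wording of the default instruction are assumptions modeled on the upstream model card and should be checked against it before use.

```python
from typing import Optional

# Assumed default; the text above only says it is a web search query instruction.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_instruction(instruction: Optional[str], query: str, doc: str) -> str:
    """Combine instruction, query, and document into one prompt string."""
    if instruction is None:
        instruction = DEFAULT_INSTRUCTION
    # Assumed template layout; verify against the official model card.
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}"


default_prompt = format_instruction(None, "capital of France", "Paris is the capital.")
custom_prompt = format_instruction(
    "Classify whether the document matches the query topic",
    "capital of France",
    "Paris is the capital.",
)
```

Writing the instruction in English and phrasing it as a description of the task is a common convention for instruction-aware rerankers, though the best wording depends on the use case.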
The model returns raw logit differences by default. To get scores between 0 and 1, apply a sigmoid activation function.
Qwen3-Reranker-4B outperforms Jina-multilingual-reranker-v2-base, gte-multilingual-reranker-base, and BGE-reranker-v2-m3 on the reported benchmarks; on MTEB-R, for example, it scores 69.76.
Use the gigarouter OpenAI-compatible endpoint with your API key. Pass the model name and query-document pairs in the request body.
We're benchmarking and onboarding Qwen3 Reranker 4B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.