Qwen3 Reranker 4B

Qwen/Qwen3-Reranker-4B

published Jun 2025 · updated Apr 2026

Qwen3 Reranker 4B is a reranking model that orders documents by their relevance to a query using a cross-encoder architecture.

est. price

~$0.008

/ 1k docs · estimated, set at launch

API providers

downloads / mo

1.8M

license

apache-2.0

specs

Task	Text Reranking
Architecture	Cross-encoder built on Qwen3 foundation model
Parameters	4B
Context Length	32k
License	Apache 2.0

about this model

Qwen3-Reranker-4B is a cross-encoder text reranking model that scores query-document pairs to reorder search results, built on the Qwen3 foundation model. It uses LoRA fine-tuning to preserve the base model’s multilingual understanding and reasoning abilities.

Capabilities

The model supports over 100 languages, including programming languages, and handles a context length of 32k tokens. It is instruction-aware: developers can provide custom task-specific prompts (e.g., "Classify whether the document matches the query topic"), which typically improve performance by 1–5% compared with using no instruction. By default, scores are raw logit differences; a Sigmoid activation can be applied to obtain 0–1 probabilities.

Benchmark Results

On standard reranking benchmarks, Qwen3-Reranker-4B achieves the following scores:

Benchmark	Score
MTEB-Retrieval (MTEB-R)	69.76
CMTEB-Retrieval (CMTEB-R)	75.94
MMTEB-Retrieval (MMTEB-R)	72.74
MLDR	69.97
MTEB-Code	81.20
FollowIR	14.84

It outperforms smaller reranking models (e.g., Jina-multilingual-reranker-v2-base, gte-multilingual-reranker-base, BGE-reranker-v2-m3) across all reported benchmarks by a substantial margin. The model series is released under the Apache 2.0 license.

Model Detail

The 4B-parameter reranker is part of a full spectrum of sizes (0.6B, 4B, 8B) available for both embedding and reranking tasks. For further technical details, refer to the Qwen3 Embedding paper and the official blog.

best for

· Re-ranking search results in multilingual retrieval
· Improving document relevance scoring for RAG systems
· Code retrieval ranking for developer tools
· Cross-lingual document ranking for enterprise search

FAQ

What input format does the Qwen3 Reranker 4B accept?

It accepts pairs of query and document text. You can use Sentence Transformers CrossEncoder with lists of pairs, or tokenize pairs for use with Transformers or vLLM.
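As a rough illustration of the pair-based input, the sketch below builds (query, document) pairs and reorders documents by score. The names `build_pairs`, `rerank`, and `toy_overlap_scorer` are hypothetical and not part of any library; the toy scorer only counts shared words so the example runs without downloading a model. A real scorer that takes a list of pairs and returns one score per pair (for instance a Sentence Transformers CrossEncoder's `predict` method) would take its place.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def build_pairs(query: str, documents: Sequence[str]) -> List[Pair]:
    """Pair a single query with every candidate document."""
    return [(query, doc) for doc in documents]


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score each (query, document) pair and sort documents, best first."""
    pairs = build_pairs(query, documents)
    scores = list(score_pairs(pairs))
    if len(scores) != len(pairs):
        raise ValueError("scorer must return exactly one score per pair")
    return sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)


def toy_overlap_scorer(pairs: List[Pair]) -> List[float]:
    """Hypothetical stand-in for the model: counts shared lowercase words."""
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]


docs = [
    "Gravity is a force that attracts two bodies.",
    "The capital of China is Beijing.",
]
ranked = rerank("capital of China", docs, toy_overlap_scorer)
for doc, score in ranked:
    print(f"{score:.1f}  {doc}")
```

Keeping the scorer as a plain callable means the same ranking code works whether scores come from a local model or a hosted endpoint.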

Does the reranker support custom instructions?

Yes, it is instruction-aware. By default it uses a web search query instruction, but you can provide a custom prompt for different tasks via the prompt parameter in CrossEncoder or the format_instruction function.
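A minimal sketch of what such a formatting helper might look like follows. It is modeled on the `format_instruction` helper in the upstream model card; the exact template string and the default instruction text are assumptions here and should be checked against that card. It covers only the instruction/query/document portion of the input, not any chat-template text a full pipeline may wrap around it.

```python
from typing import Optional

# Assumed default, matching the "web search query" instruction described above.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)


def format_instruction(instruction: Optional[str], query: str, doc: str) -> str:
    """Combine instruction, query, and document into one text input.

    Falls back to the default web-search instruction when none is given.
    """
    if instruction is None:
        instruction = DEFAULT_INSTRUCTION
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}"


default_text = format_instruction(None, "What is gravity?", "Gravity is a force.")
custom_text = format_instruction(
    "Classify whether the document matches the query topic",
    "What is gravity?",
    "Gravity is a force.",
)
print(custom_text)
```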

What is the default scoring output of the model?

The model returns raw logit differences by default. To get scores between 0 and 1, you must apply a Sigmoid activation function.
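The conversion can be sketched in a few lines of plain Python. This assumes the raw score is the difference between the model's "yes" and "no" logits, which is an assumption about the scoring head rather than something stated above; under that assumption, the sigmoid of the difference equals the softmax probability of "yes" over the two options. The function names are illustrative only.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real score to [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def yes_probability(logit_yes: float, logit_no: float) -> float:
    """Turn a raw logit difference (assumed yes minus no) into a probability."""
    return sigmoid(logit_yes - logit_no)


def softmax_yes(logit_yes: float, logit_no: float) -> float:
    """Two-way softmax probability of "yes", for comparison with the sigmoid."""
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


p_sigmoid = yes_probability(2.0, -1.0)
p_softmax = softmax_yes(2.0, -1.0)
print(p_sigmoid, p_softmax)
```

Because the sigmoid is monotonic, applying it changes the scale of the scores but not the ranking order.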

How does Qwen3 Reranker 4B compare to other rerankers?

It outperforms Jina-multilingual-reranker-v2-base, gte-multilingual-reranker-base, and BGE-reranker-v2-m3 on the reported benchmarks; on MTEB-R, for example, it scores 69.76.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key. Pass the model name and query-document pairs in the request body.

not yet live

We're benchmarking and onboarding Qwen3 Reranker 4B as a hosted, OpenAI-compatible API.

related reranker models

ms-marco-MiniLM-L6-v2

81.5M dl/mo · live

ms-marco-MiniLM-L4-v2

4.8M dl/mo

gte-reranker-modernbert-base

2.7M dl/mo

ms-marco-MiniLM-L12-v2

2.3M dl/mo

jina-reranker-v2-base-multilingual

1.8M dl/mo · live

mmarco-mMiniLMv2-L12-H384-v1

1.6M dl/mo