Qwen3 Reranker 8B

Qwen/Qwen3-Reranker-8B

published May 2025 · updated Apr 2026

Qwen3 Reranker 8B is a cross-encoder reranking model that orders candidate documents by their relevance to a query. It supports over 100 languages and a 32k-token context length.

est. price

~$0.008 / 1k docs · estimated, set at launch

API providers

downloads / mo

license

apache-2.0

specs

Task	Text Reranking
Architecture	Cross-encoder
Parameters	8B
License	Apache 2.0
Context Length	32k tokens
Scoring Mechanism	Logit differences between yes and no tokens

about this model

Qwen3-Reranker-8B is a cross-encoder reranking model: it takes a query-document pair as input and outputs a single relevance score. It is built on the Qwen3 foundation model, with 8 billion parameters and a 32k-token context window.

Capabilities and Strengths

The model supports over 100 natural and programming languages, inheriting the multilingual and reasoning abilities of the Qwen3 series. It is instruction-aware: developers can supply a custom instruction to tailor scoring to a specific task, language, or scenario. The relevance score is the difference between the logits of the “yes” and “no” tokens; passing that raw score through a sigmoid function yields a 0-1 probability.
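As a rough illustration of the scoring step described above, the sketch below maps a yes/no logit pair to a 0-1 score. The function names and the example logit values are made up for illustration; only the logit-difference-plus-sigmoid idea comes from this page. The sigmoid of the difference is the same number a two-way softmax would assign to “yes”, which the second helper checks.

```python
import math

def relevance_probability(yes_logit: float, no_logit: float) -> float:
    """Map the yes/no logit difference to a 0-1 score with a sigmoid."""
    diff = yes_logit - no_logit
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

def softmax_yes(yes_logit: float, no_logit: float) -> float:
    """Probability of "yes" under a softmax over the two logits only."""
    m = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - m)
    exp_no = math.exp(no_logit - m)
    return exp_yes / (exp_yes + exp_no)

# Hypothetical logits for one query-document pair.
score = relevance_probability(3.0, 1.0)
same_score = softmax_yes(3.0, 1.0)
```

Because the sigmoid is monotonic, ranking by the raw logit difference and ranking by the 0-1 probability give the same order; the conversion mainly matters when a threshold or a calibrated-looking score is needed.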

The reranker is trained via a multi-stage pipeline combining large-scale contrastive pre-training, multi-task learning, and instruction tuning. It uses LoRA fine-tuning to preserve base model capabilities.

Benchmark Performance

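The following table reproduces evaluation results from the model card. All scores were obtained by reranking the top-100 candidates retrieved by Qwen3-Embedding-0.6B, and the columns cover English (MTEB-R), Chinese (CMTEB-R), multilingual (MMTEB-R, MLDR), code (MTEB-Code), and instruction-following (FollowIR) retrieval benchmarks.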

Model	Param	MTEB-R	CMTEB-R	MMTEB-R	MLDR	MTEB-Code	FollowIR
Qwen3-Reranker-8B	8B	69.02	77.45	72.94	70.19	81.22	8.05
Qwen3-Reranker-4B	4B	69.76	75.94	72.74	69.97	81.20	14.84
Qwen3-Reranker-0.6B	0.6B	65.80	71.31	66.36	67.28	73.42	5.41

The model is released under the Apache 2.0 license. gigarouter is onboarding it as a managed, OpenAI-compatible API so that developers can integrate reranking without managing infrastructure.

best for

·Multilingual search reranking for enterprise retrieval systems
·Code retrieval ranking in RAG pipelines
·Cross-lingual document ranking for global search
·Re-ranking top candidates from dense retrieval for high-accuracy QA
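The use cases above share one pattern: a first-stage retriever returns a candidate list, and the reranker rescores each query-document pair before the list is cut down. A minimal sketch of that pattern follows. `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in so the example runs without a model; in practice `score_fn` would call the reranker.

```python
from typing import Callable, List, Tuple

def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, document) pair and keep the top_k highest."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def toy_overlap_score(query: str, doc: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)

# Pretend these came back from a dense retriever.
candidates = [
    "the cat sat",
    "reranking with cross encoders",
    "cross encoders score query document pairs",
]
top = rerank("cross encoders score", candidates, toy_overlap_score, top_k=2)
```

Since a cross-encoder runs one forward pass per pair, cost grows linearly with the number of candidates, which is why the reranker is usually applied to a short retrieved list rather than to the whole corpus.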

FAQ

What is the input format for Qwen3 Reranker 8B?

It accepts query-document pairs as text, each formatted together with a task instruction. Once the model is live, requests go through the gigarouter OpenAI-compatible endpoint with an API key.
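As a sketch of what "formatted with instructions" can look like, the helper below builds one text input per pair. The `<Instruct>` / `<Query>` / `<Document>` template and the default instruction are assumptions based on the upstream Qwen model card as best recalled, and `format_pair` is a hypothetical name; verify the exact template against the model card before relying on it. The request schema of the hosted endpoint is not specified on this page, so no API call is shown.

```python
from typing import Optional

# Assumed default instruction; replace it with a task-specific one as needed.
DEFAULT_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query"
)

def format_pair(query: str, document: str, instruction: Optional[str] = None) -> str:
    """Build the text for one query-document pair (template is an assumption)."""
    instruction = instruction or DEFAULT_INSTRUCTION
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}"

text = format_pair(
    "What is the capital of China?",
    "The capital of China is Beijing.",
)
custom = format_pair(
    "binary search in Go",
    "func search(xs []int, x int) int { return 0 }",
    instruction="Given a code search query, retrieve relevant code snippets",
)
```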

How does it output relevance scores?

It outputs the raw difference between the logits of the yes and no tokens. Pass that value through a sigmoid to get a 0-1 probability.

What is the context length?

32k tokens per input pair.

What languages does it support?

Over 100 languages, including major global languages and programming languages.

What license is it released under?

Apache 2.0.

not yet live

We're benchmarking and onboarding Qwen3 Reranker 8B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related reranker models

ms-marco-MiniLM-L6-v2

81.5M dl/mo · live

ms-marco-MiniLM-L4-v2

4.8M dl/mo

gte-reranker-modernbert-base

2.7M dl/mo

ms-marco-MiniLM-L12-v2

2.3M dl/mo

jina-reranker-v2-base-multilingual

1.8M dl/mo · live

Qwen3-Reranker-4B

1.8M dl/mo