
MS MARCO MiniLM-L12 v2

cross-encoder/ms-marco-MiniLM-L12-v2

published Mar 2022 · updated Aug 2025

MS MARCO MiniLM-L12 v2 is a cross-encoder reranker model that scores query-passage pairs for information retrieval.

est. price

~$0.008

/ 1k docs · estimated, set at launch


downloads / mo

2.3M

license

apache-2.0

specs

Task	Passage Re-ranking (Cross-Encoder)
Architecture	MiniLM-L12 (12-layer transformer)
Training Data	MS MARCO Passage Ranking
License	Apache-2.0

about this model

cross-encoder/ms-marco-MiniLM-L12-v2 is a cross-encoder reranking model that scores the relevance of a query–passage pair, enabling a second-stage reordering of candidate documents retrieved by a first-stage retrieval system.

Model Description

This model is a 12-layer MiniLM cross-encoder fine-tuned on the MS MARCO Passage Ranking dataset. It accepts a query and a passage as input and outputs a single relevance score. In a typical retrieve-and-rerank pipeline, a fast first-stage retriever (e.g., Elasticsearch or a dense retriever) returns a set of candidate passages; the cross-encoder then re-scores a subset of those candidates to improve ranking precision.
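The retrieve-and-rerank flow described above can be sketched as follows. This is a minimal illustration, not the hosted implementation: `rerank`, `PairScorer`, and `overlap_scorer` are hypothetical names, and the token-overlap scorer is only a toy stand-in for the cross-encoder so the example runs without any model download.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: maps (query, passage) pairs to one score per pair.
PairScorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, candidates: List[str], score_pairs: PairScorer,
           rerank_depth: int = 100) -> List[Tuple[str, float]]:
    """Re-score the first `rerank_depth` candidates and sort them by score.

    Candidates beyond `rerank_depth` are dropped, mirroring the idea that
    only a subset of first-stage results is passed to the cross-encoder.
    """
    head = candidates[:rerank_depth]
    scores = score_pairs([(query, passage) for passage in head])
    ranked = sorted(zip(head, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in for the cross-encoder: counts shared lowercase tokens."""
    return [float(len(set(q.lower().split()) & set(p.lower().split())))
            for q, p in pairs]


if __name__ == "__main__":
    # Pretend these came back from a first-stage retriever such as BM25.
    first_stage = [
        "Paris is known for its museums.",
        "The capital of France is Paris.",
        "Berlin is the capital of Germany.",
    ]
    for passage, score in rerank("capital of France", first_stage, overlap_scorer):
        print(f"{score:.1f}  {passage}")
```

In a real pipeline, `score_pairs` would wrap the cross-encoder's scoring call, and `rerank_depth` would be chosen to fit the latency budget.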

Key Strengths

Strong ranking accuracy on standard benchmarks: 74.31 NDCG@10 on TREC Deep Learning 2019 and 39.02 MRR@10 on MS MARCO Passage Dev
Balanced trade-off between latency and quality: roughly 960 documents per second on a V100 GPU
Part of the v2 family of MS MARCO cross-encoders, which outperform their v1 counterparts
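As a rough worked example of what the throughput figure implies, the sketch below converts a candidate count into scoring time and estimated cost. It assumes the 960 documents per second quoted for a V100 and the estimated ~$0.008 per 1k docs listed on this page; real latency also depends on batch size, passage length, and hardware, and the price may change at launch. The function name `rerank_estimate` is hypothetical.

```python
from typing import Tuple

DOCS_PER_SECOND_V100 = 960      # throughput quoted on this page for a V100 GPU
EST_PRICE_PER_1K_DOCS = 0.008   # estimated USD price from this page; not final


def rerank_estimate(num_candidates: int, queries: int = 1) -> Tuple[float, float]:
    """Return (seconds of scoring time, estimated USD cost) for a workload."""
    docs = num_candidates * queries
    seconds = docs / DOCS_PER_SECOND_V100
    cost = docs / 1000 * EST_PRICE_PER_1K_DOCS
    return seconds, cost


if __name__ == "__main__":
    seconds, cost = rerank_estimate(num_candidates=100)
    print(f"100 candidates: ~{seconds * 1000:.0f} ms of scoring, ~${cost:.4f}")
```

Under these assumptions, reranking 100 candidates for one query takes about a tenth of a second of scoring time, which is why the rerank depth is usually kept to the top few hundred first-stage results.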

Benchmark Performance

Dataset	Metric	Score
TREC Deep Learning 2019	NDCG@10	74.31
MS MARCO Passage Dev	MRR@10	39.02
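For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how MRR@10 and NDCG@10 are commonly computed. It is not the official evaluation code: it assumes linear gains with a log2 discount for NDCG, and it derives the ideal ordering only from the items in the ranked list, which is a simplification. The scores in the table appear to be on a 0 to 100 scale, whereas these functions return values between 0 and 1.

```python
import math
from typing import Sequence


def mrr_at_k(relevance_lists: Sequence[Sequence[int]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant item within the top k.

    Each inner sequence holds relevance labels in ranked order for one query.
    """
    if not relevance_lists:
        return 0.0
    total = 0.0
    for rels in relevance_lists:
        for rank, rel in enumerate(rels[:k], start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(relevance_lists)


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """NDCG@k for one query, using linear gains and a log2(rank + 1) discount."""
    def dcg(values: Sequence[float]) -> float:
        return sum(g / math.log2(rank + 1)
                   for rank, g in enumerate(values[:k], start=1))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Three queries: first relevant hit at rank 2, rank 1, and never.
    print(mrr_at_k([[0, 1, 0], [1, 0, 0], [0, 0, 0]]))
    # Graded labels for one ranked list.
    print(ndcg_at_k([3, 2, 1]))
```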

Additional Context

The model was trained on over one million real anonymized Bing queries from the MS MARCO dataset. It is designed for English-language passage reranking. Once live, it will be offered as a managed API on Gigarouter, requiring no local infrastructure or dependency installation.

best for

·Re-ranking top-k results from an initial retrieval system (e.g., BM25)
·Improving relevance scoring in enterprise search or document retrieval pipelines
·Building a question-answering system with passage selection from a candidate set

FAQ

What input format does this model expect?

It expects a list of (query, passage) text pairs and returns one relevance score per pair. Example: model.predict([("query", "passage")])
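A slightly fuller version of that call, for running the model locally, might look like the sketch below. It assumes the `model` in the example is a `CrossEncoder` from the sentence-transformers library; `build_pairs` and `score_with_cross_encoder` are hypothetical helper names introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def build_pairs(query: str, passages: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair one query with each candidate passage, query first."""
    return [(query, passage) for passage in passages]


def score_with_cross_encoder(query: str, passages: Sequence[str]) -> List[float]:
    """Score passages locally; downloads the model weights on first use."""
    # Imported lazily so build_pairs works without sentence-transformers installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L12-v2")
    scores = model.predict(build_pairs(query, passages))
    return [float(s) for s in scores]
```

Higher scores indicate higher relevance. The scores are best treated as relative values for ordering passages under the same query rather than as calibrated probabilities.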

How does this model compare to other MiniLM v2 Cross-Encoders?

It is the largest MiniLM v2 variant (12 layers) and achieves the highest MRR@10 (39.02) on MS MARCO Dev, but it processes fewer documents per second (960 on a V100) than the smaller variants.

What is the license for this model?

The model is listed under the Apache-2.0 license. Check the model repository for the full license text.

How can I use this model through the Gigarouter API?

Once the model is live, send a POST request to the Gigarouter OpenAI-compatible endpoint with your API key, including the model name "cross-encoder/ms-marco-MiniLM-L12-v2" and a list of query-passage pairs.
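The shape of such a request might look like the sketch below, which only builds the request and does not send it. Everything specific here is an assumption: this page does not give the endpoint URL or the request schema, so `HYPOTHETICAL_URL`, the `pairs` field, its `query` and `passage` keys, and the Bearer-token header are placeholders to be replaced with the values from the provider documentation.

```python
import json
import urllib.request
from typing import List, Tuple

# Placeholder values: the real endpoint URL and schema are not given on this page.
HYPOTHETICAL_URL = "https://api.example.invalid/v1/rerank"
MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L12-v2"


def build_rerank_request(api_key: str,
                         pairs: List[Tuple[str, str]]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying query-passage pairs."""
    payload = {
        "model": MODEL_NAME,
        # Assumed field names; check the provider documentation before use.
        "pairs": [{"query": q, "passage": p} for q, p in pairs],
    }
    return urllib.request.Request(
        HYPOTHETICAL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it, once a real URL and schema are substituted.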

On what dataset was this model trained?

It was trained on the MS MARCO Passage Ranking dataset, which contains over 1 million real Bing queries and human-generated answers.

not yet live

We're benchmarking and onboarding MS MARCO MiniLM-L12 v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related reranker models


ms-marco-MiniLM-L6-v2

81.5M dl/mo · live

ms-marco-MiniLM-L4-v2

4.8M dl/mo

gte-reranker-modernbert-base

2.7M dl/mo

jina-reranker-v2-base-multilingual

1.8M dl/mo · live

Qwen3-Reranker-4B

1.8M dl/mo

mmarco-mMiniLMv2-L12-H384-v1

1.6M dl/mo