MS MARCO MiniLM-L12 v2

cross-encoder/ms-marco-MiniLM-L12-v2

published Mar 2022 · updated Aug 2025

MS MARCO MiniLM-L12 v2 is a cross-encoder reranker model that scores query-passage pairs for information retrieval.

est. price: ~$0.008 / 1k docs (estimated, set at launch)
API providers: 0
downloads / mo: 2.3M
license: apache-2.0

specs

Task: Passage Re-ranking (Cross-Encoder)
Architecture: MiniLM-L12 (12-layer transformer)
Training Data: MS MARCO Passage Ranking
License: apache-2.0

about this model

cross-encoder/ms-marco-MiniLM-L12-v2 is a cross-encoder reranking model that scores the relevance of a query–passage pair, enabling a second-stage reordering of candidate documents retrieved by a first-stage retrieval system.

Model Description

This model is a 12-layer MiniLM cross-encoder fine-tuned on the MS MARCO Passage Ranking dataset. It takes a query and a passage as input and outputs a single relevance score for the pair. In a typical retrieve-and-rerank pipeline, a fast first-stage retriever (e.g., Elasticsearch or a dense retriever) returns a set of candidate passages; the cross-encoder then re-scores a subset of those candidates to improve ranking precision.
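The second stage described above can be sketched as follows. This is a minimal illustration, not the model's own code: `rerank` and `overlap_scorer` are hypothetical names, and the toy word-overlap scorer only stands in for a real cross-encoder so the example runs without downloading weights.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pairs: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Re-score first-stage candidates and return the top_k by descending score."""
    pairs = [(query, passage) for passage in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked[:top_k]]


def overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer (hypothetical): counts query words found in the passage."""
    scores = []
    for query, passage in pairs:
        passage_words = set(passage.lower().replace(".", "").split())
        hits = sum(word in passage_words for word in query.lower().split())
        scores.append(float(hits))
    return scores


# Pretend these came back from the first-stage retriever.
candidates = [
    "France is known for its cheese.",
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
]
top = rerank("capital of France", candidates, overlap_scorer, top_k=2)
for passage, score in top:
    print(f"{score:.1f}  {passage}")
```

Any callable that maps a list of (query, passage) pairs to one score per pair can replace `overlap_scorer`; with the sentence-transformers library installed, the `predict` method of a `CrossEncoder` instance should fit that shape.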

Key Strengths

  • Strong ranking accuracy on standard benchmarks: NDCG@10 of 74.31 on TREC Deep Learning 2019 and MRR@10 of 39.02 on MS MARCO Passage Dev
  • Balanced trade-off between latency and quality: about 960 documents per second on a V100 GPU
  • Part of the v2 family of MS MARCO cross-encoders, which outperform their v1 counterparts
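For a rough sense of what the quoted throughput means per query, here is a back-of-the-envelope calculation. It assumes the ~960 documents per second figure holds steadily, which it may not: real latency depends on batch size, passage length, and hardware. `estimated_rerank_seconds` is a hypothetical helper name.

```python
def estimated_rerank_seconds(num_docs: int, docs_per_second: float = 960.0) -> float:
    """Time to score num_docs candidates at a constant throughput (an idealization)."""
    return num_docs / docs_per_second


for n in (10, 100, 1000):
    print(f"{n} candidates: ~{estimated_rerank_seconds(n) * 1000:.0f} ms")
# 10 candidates: ~10 ms
# 100 candidates: ~104 ms
# 1000 candidates: ~1042 ms
```

Under this assumption, reranking the top 100 candidates costs about a tenth of a second, which is why the cross-encoder is usually applied only to a short list from the first stage.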

Benchmark Performance

Dataset                    Metric    Score
TREC Deep Learning 2019    NDCG@10   74.31
MS MARCO Passage Dev       MRR@10    39.02
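For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It is an illustration only, not the evaluation script behind the scores above: it uses linear gain for NDCG (some tools use 2^rel − 1 instead) and computes the ideal ordering from the labels of the ranked list alone. The reported scores appear to be these 0 to 1 values multiplied by 100.

```python
import math
from typing import Sequence


def mrr_at_k(ranked_labels: Sequence[Sequence[int]], k: int = 10) -> float:
    """Mean over queries of 1/rank of the first relevant hit in the top k (0 if none)."""
    total = 0.0
    for labels in ranked_labels:
        for rank, rel in enumerate(labels[:k], start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(ranked_labels)


def ndcg_at_k(labels: Sequence[float], k: int = 10) -> float:
    """DCG of the given order divided by DCG of the ideal (sorted) order."""
    def dcg(values: Sequence[float]) -> float:
        return sum(v / math.log2(i + 2) for i, v in enumerate(values[:k]))

    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal > 0 else 0.0


# Each inner list holds relevance labels in ranked order for one query.
print(mrr_at_k([[0, 1, 0], [1, 0, 0], [0, 0, 0]]))  # (1/2 + 1 + 0) / 3 = 0.5
print(ndcg_at_k([3, 2, 1]))                          # already ideal order: 1.0
```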

Additional Context

The model was trained on over one million real, anonymized Bing queries from the MS MARCO dataset. It is designed for English-language passage reranking. Once live, it will be served as a managed API on gigarouter, so no local infrastructure or dependency installation will be needed.

FAQ

What input format does this model expect?

It expects a pair of texts (query and passage) and returns a relevance score. Example: model.predict([("query", "passage")])
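A fuller version of that call might look like the sketch below. It assumes `model` is a `CrossEncoder` from the sentence-transformers library, which this page does not state explicitly; `build_pairs` and `score_locally` are hypothetical helper names. The model is loaded lazily so that pair construction can be tried without installing the dependency or downloading weights.

```python
from typing import List, Sequence, Tuple

MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L12-v2"


def build_pairs(query: str, passages: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the same query with every candidate passage."""
    return [(query, passage) for passage in passages]


def score_locally(pairs: List[Tuple[str, str]]):
    """Score pairs with a local cross-encoder (assumes sentence-transformers is installed)."""
    from sentence_transformers import CrossEncoder  # lazy import, see lead-in

    model = CrossEncoder(MODEL_NAME)
    return model.predict(pairs)  # one score per pair; higher means more relevant


pairs = build_pairs(
    "How many people live in Berlin?",
    [
        "Berlin has a population of about 3.5 million registered inhabitants.",
        "New York City is famous for the Metropolitan Museum of Art.",
    ],
)
# scores = score_locally(pairs)  # uncomment to run; downloads weights on first use
```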

How does this model compare to other MiniLM v2 Cross-Encoders?

It is the largest MiniLM v2 variant (12 layers) and achieves the highest MRR@10 (39.02) on MS MARCO Dev, but it processes fewer documents per second (960 on a V100) than the smaller variants.

What is the license for this model?

The model is listed under the apache-2.0 license. Please check the model repository for the full license text.

How can I use this model through the gigarouter API?

The model is not yet live. Once it is, send a POST request to the gigarouter OpenAI-compatible endpoint with your API key, including the model name "cross-encoder/ms-marco-MiniLM-L12-v2" and a list of query-passage pairs.
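As a rough sketch of what such a request could look like: the endpoint URL, the `pairs` field name, and the Bearer-token header below are all assumptions for illustration, since this page gives no request schema and the model is not yet available. The code only builds the request object and does not send it.

```python
import json
import urllib.request
from typing import Sequence, Tuple

# Hypothetical placeholder URL; the real endpoint is not given on this page.
API_URL = "https://api.example.invalid/v1/rerank"
MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L12-v2"


def build_rerank_request(
    api_key: str, pairs: Sequence[Tuple[str, str]]
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the model name and pairs."""
    # "pairs" is an assumed field name, not a documented schema.
    body = {"model": MODEL_NAME, "pairs": [list(pair) for pair in pairs]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    "YOUR_API_KEY",
    [("what is a reranker", "A reranker re-scores retrieved passages for a query.")],
)
print(request.get_method(), request.full_url)
```

Check the gigarouter API documentation for the actual URL and field names before relying on any of this.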

On what dataset was this model trained?

It was trained on the MS MARCO Passage Ranking dataset, which contains over 1 million real Bing queries and human-generated answers.

not yet live

We're benchmarking and onboarding MS MARCO MiniLM-L12 v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
