MS Marco MiniLM L2 v2
cross-encoder/ms-marco-MiniLM-L2-v2
published Mar 2022 · updated Aug 2025
MS Marco MiniLM L2 v2 is a cross-encoder reranking model that scores query-passage pairs for information retrieval.
specs
| Spec | Value |
|---|---|
| Task | Reranking / Passage Ranking |
| Architecture | MiniLM-L2-v2 cross-encoder |
| License | Not specified in card |
about this model
cross-encoder/ms-marco-MiniLM-L2-v2 is a cross-encoder reranking model trained on the MS Marco Passage Ranking dataset. Given a query and a set of candidate passages (e.g., retrieved via Elasticsearch), the model computes a relevance score for each query-passage pair, so the candidates can be reordered by descending score.
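The score-then-sort step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model's own code: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in so the example runs without downloading any weights. With the sentence-transformers library, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L2-v2").predict` accepts a list of (query, passage) pairs and could be passed as `score_pairs` instead.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def toy_overlap_score(pairs: Sequence[Pair]) -> List[float]:
    """Stand-in scorer (NOT the real model): share of query words found in the passage."""
    scores = []
    for query, passage in pairs:
        q_words = set(query.lower().split())
        p_words = set(passage.lower().split())
        scores.append(len(q_words & p_words) / max(len(q_words), 1))
    return scores


def rerank(
    query: str,
    passages: Sequence[str],
    score_pairs: Callable[[Sequence[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair, then sort passages by descending score."""
    pairs = [(query, passage) for passage in passages]
    scores = [float(s) for s in score_pairs(pairs)]
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


# Hypothetical candidates, e.g. the output of a first-stage retriever.
candidates = [
    "Paris is the capital of France.",
    "Berlin has a population of about 3.5 million people.",
    "Berlin is the capital of Germany.",
]
results = rerank("population of berlin", candidates, toy_overlap_score)
for passage, score in results:
    print(f"{score:.2f}  {passage}")
```

Passing the scorer as a function keeps the sorting logic independent of how the scores are produced, so the toy scorer can be swapped for a real cross-encoder without changing `rerank`.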
Key Strengths
- Optimized for the reranking stage in information retrieval pipelines, following the retrieve-and-rerank paradigm.
- Small model footprint with fast inference: processes approximately 4,100 documents per second on a V100 GPU.
- Competitive accuracy relative to larger models, making it suitable for latency-sensitive applications.
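As a back-of-the-envelope sketch of what these throughput figures mean for latency, the snippet below converts documents per second into an idealized time to score a candidate list. The helper name `estimated_rerank_ms` is hypothetical, the rates are the V100 figures reported in this card, and the estimate assumes throughput scales linearly and ignores tokenization, batching, and data-transfer overhead, so real latency will differ.

```python
# Throughput in documents per second on a V100 GPU, as reported in this card.
DOCS_PER_SEC = {
    "cross-encoder/ms-marco-MiniLM-L2-v2": 4100,
    "cross-encoder/ms-marco-MiniLM-L6-v2": 1800,
    "cross-encoder/ms-marco-MiniLM-L12-v2": 960,
}


def estimated_rerank_ms(num_candidates: int, docs_per_sec: float) -> float:
    """Idealized time in milliseconds to score num_candidates passages."""
    if docs_per_sec <= 0:
        raise ValueError("docs_per_sec must be positive")
    return 1000.0 * num_candidates / docs_per_sec


for name, rate in DOCS_PER_SEC.items():
    print(f"{name}: ~{estimated_rerank_ms(100, rate):.1f} ms for 100 candidates")
```

Under these assumptions, scoring 100 candidates takes roughly 24 ms with this model, compared with about 56 ms for MiniLM-L6-v2 and about 104 ms for MiniLM-L12-v2.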
Benchmark Results
The following table summarizes performance on the TREC Deep Learning 2019 and MS Marco Passage Reranking datasets, as reported in the model card. Throughput (Docs / Sec) was measured on a V100 GPU.
| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |
|---|---|---|---|
| Version 2 models | | | |
| cross-encoder/ms-marco-TinyBERT-L2-v2 | 69.84 | 32.56 | 9000 |
| cross-encoder/ms-marco-MiniLM-L2-v2 | 71.01 | 34.85 | 4100 |
| cross-encoder/ms-marco-MiniLM-L4-v2 | 73.04 | 37.70 | 2500 |
| cross-encoder/ms-marco-MiniLM-L6-v2 | 74.30 | 39.01 | 1800 |
| cross-encoder/ms-marco-MiniLM-L12-v2 | 74.31 | 39.02 | 960 |
| Version 1 models | | | |
| cross-encoder/ms-marco-TinyBERT-L2 | 67.43 | 30.15 | 9000 |
| cross-encoder/ms-marco-TinyBERT-L4 | 68.09 | 34.50 | 2900 |
| cross-encoder/ms-marco-TinyBERT-L6 | 69.57 | 36.13 | 680 |
| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 |
| Other models | | | |
| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 |
| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 |
| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100 |
| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340 |
| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330 |
| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720 |
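For readers unfamiliar with the two accuracy metrics in the table, the sketch below shows one common way to compute them for ranked result lists. It is an illustration only: the function names are hypothetical, NDCG is written in its linear-gain form, and the ideal ordering is taken from the ranked list itself (that is, it assumes the list contains every judged passage for the query). The evaluation tooling behind the reported numbers may differ in such details, and the table values appear to be these metrics multiplied by 100.

```python
import math
from typing import Sequence


def reciprocal_rank_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """1 / rank of the first relevant passage within the top k, or 0.0 if none."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: Sequence[Sequence[int]], k: int = 10) -> float:
    """Mean reciprocal rank over several queries (one relevance list per query)."""
    return sum(reciprocal_rank_at_k(r, k) for r in runs) / len(runs)


def dcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevance[:k], start=1)
    )


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """DCG normalized by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Each list holds relevance labels in the order the reranker returned the passages.
print(mrr_at_k([[0, 0, 1, 0], [1, 0, 0, 0]]))
print(ndcg_at_k([2, 3, 0, 1]))
```

Both metrics reward placing relevant passages near the top; MRR@10 only looks at the first relevant hit, while NDCG@10 accounts for graded relevance across the whole top 10.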
best for
- Reranking candidate passages returned for a search query
- Improving retrieval accuracy in a retrieve-and-rerank pipeline
FAQ
The model accepts a query and a passage as a pair of strings, and outputs a relevance score.
It processes about 4,100 documents per second on a V100 GPU, compared with 1,800 for the larger MiniLM-L6-v2 and 960 for MiniLM-L12-v2.
The model card does not specify a license.
Use the gigarouter OpenAI-compatible endpoint with your API key and the model name cross-encoder/ms-marco-MiniLM-L2-v2.
We're benchmarking and onboarding MS Marco MiniLM L2 v2 as a hosted, OpenAI-compatible API.