MMarco MiniLMv2 L12 H384 Reranker

cross-encoder/mmarco-mMiniLMv2-L12-H384-v1

published Jun 2022 · updated Apr 2025

MMarco MiniLMv2 L12 H384 Reranker is a cross-encoder reranking model trained on the multilingual MMARCO dataset for information retrieval.

est. price

~$0.008

/ 1k docs · estimated, set at launch

API providers

downloads / mo

1.6M

license

apache-2.0

specs

Task	Reranking (cross-encoder)
Architecture	Multilingual MiniLMv2 (distilled from XLM-R Large)
Parameters	~22M
License	Apache-2.0

about this model

cross-encoder/mmarco-mMiniLMv2-L12-H384-v1 is a cross-encoder reranker model designed for multilingual information retrieval. It was trained on the MMARCO dataset, a machine translation of the MS MARCO passage ranking dataset into 14 languages using Google Translate. The base model is the multilingual MiniLMv2 (nreimers/mMiniLMv2-L12-H384-distilled-from-XLMR-Large), which provides a compact and efficient foundation for cross-lingual sentence pair scoring.

Multilingual Reranking

The model accepts a query-passage pair and outputs a relevance score. It is intended for use in a two-stage retrieval pipeline: first retrieve a set of candidate passages (e.g., via sparse or dense retrieval), then rerank them with this cross-encoder to obtain a more accurate ordering. In experiments, the model demonstrated strong performance not only on the 14 translated languages but also on additional languages not seen during training, indicating useful cross-lingual transfer.
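The second stage described above can be sketched as follows. This is a minimal illustration, not the hosted implementation: `rerank`, `ScoreFn`, and `load_cross_encoder_scorer` are hypothetical helper names introduced here, and the candidate passages are assumed to come from whatever first-stage retriever you already run. The only library call used is `CrossEncoder.predict` from SentenceTransformers, which takes a list of (query, passage) pairs and returns one score per pair.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps a list of (query, passage) pairs to one relevance score per pair.
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    passages: List[str],
    score_fn: ScoreFn,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Second stage: score every (query, passage) pair, then sort highest first."""
    if not passages:
        return []
    pairs = [(query, passage) for passage in passages]
    scores = [float(s) for s in score_fn(pairs)]
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def load_cross_encoder_scorer(
    model_name: str = "cross-encoder/mmarco-mMiniLMv2-L12-H384-v1",
) -> ScoreFn:
    """Wrap the cross-encoder as a ScoreFn (hypothetical helper).

    The import is deferred so that rerank() can be used and tested without
    sentence-transformers installed; the weights download on first use.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return lambda pairs: model.predict(pairs)
```

To use it, pass the first-stage candidates and a scorer: `rerank(query, candidates, load_cross_encoder_scorer(), top_k=10)`. Keeping the scorer as a plain callable means the same function works with a local model, a remote API, or a stub in tests.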

Key Strengths

Multilingual: covers 14 languages from the MMARCO dataset and generalises to others.
Efficient: based on the lightweight MiniLMv2 architecture (12 layers, 384 hidden dimensions).
Proven framework: built using the SentenceTransformers library, with training code and usage examples available from the UKPLab repository.

Hosted API

On gigarouter, this model is being onboarded as a managed, OpenAI-compatible API and is not yet live. Once it is available, you will submit a query and a list of passages, and the API will return a relevance score for each pair. No local installation, hardware, or model weight management will be required.

best for

·Multilingual search result reranking
·Cross-lingual information retrieval
·Improving top-k retrieval precision

FAQ

What is this model best used for?

It is best for reranking passages retrieved by a first-stage retriever, especially in multilingual or cross-lingual search scenarios.

How does it compare in size and speed to other rerankers?

With ~22M parameters, it is a compact model offering fast inference suitable for production reranking pipelines.

What are the license terms?

The model is released under the Apache-2.0 license, which permits use, modification, and distribution, including commercial use, subject to the license's notice and attribution conditions.

What input format does the model expect?

It expects pairs of (query, passage) text strings, which are tokenized and fed to the cross-encoder to produce relevance scores.
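A sketch of that input format when running the model locally with Hugging Face Transformers, under stated assumptions: `split_pairs` and `score_pairs` are hypothetical helper names, `max_length=512` is an assumed truncation limit, and the model is assumed to emit a single relevance logit per pair. The tokenizer receives the queries and passages as two parallel lists, so each pair is encoded as one joint sequence.

```python
from typing import List, Tuple


def split_pairs(pairs: List[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split (query, passage) pairs into the two parallel lists the tokenizer takes."""
    queries = [query for query, _ in pairs]
    passages = [passage for _, passage in pairs]
    return queries, passages


def score_pairs(
    pairs: List[Tuple[str, str]],
    model_name: str = "cross-encoder/mmarco-mMiniLMv2-L12-H384-v1",
    max_length: int = 512,  # assumption: adjust to your passage lengths
) -> List[float]:
    """Tokenize each pair jointly and return one relevance score per pair."""
    # Deferred imports: split_pairs() stays usable without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    queries, passages = split_pairs(pairs)
    features = tokenizer(
        queries,
        passages,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        # Assumed shape: (num_pairs, 1), i.e. a single logit per pair.
        logits = model(**features).logits
    return logits.squeeze(-1).tolist()
```

Higher scores mean the passage is judged more relevant to the query; the scores are only meaningful relative to other passages scored against the same query.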

How can I call this model via the gigarouter API?

The model is not yet live on gigarouter. Once it is, you will call the OpenAI-compatible rerank endpoint with your API key, sending a query and its candidate passages.

not yet live

We're benchmarking and onboarding MMarco MiniLMv2 L12 H384 Reranker as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related reranker models

ms-marco-MiniLM-L6-v2

81.5M dl/mo · live

ms-marco-MiniLM-L4-v2

4.8M dl/mo

gte-reranker-modernbert-base

2.7M dl/mo

ms-marco-MiniLM-L12-v2

2.3M dl/mo

jina-reranker-v2-base-multilingual

1.8M dl/mo · live

Qwen3-Reranker-4B

1.8M dl/mo