MMLW E5 Base

sdadas/mmlw-e5-base

published Nov 2023 · updated Feb 2026

MMLW E5 Base is a Polish text embedding model that transforms texts into 768-dimensional vectors for tasks like semantic similarity, clustering, and information retrieval.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 369
license: apache-2.0

specs

Task: Text Embedding
Architecture: Distilled from multilingual E5 with BGE teacher
Parameters: Not specified
License: apache-2.0 (model); MIT (teacher model)

about this model

MMLW-e5-base is a Polish neural text encoder that transforms texts into 768-dimensional embeddings for tasks such as semantic similarity, clustering, and information retrieval. It is a distilled model initialized from the multilingual E5 checkpoint and further trained using multilingual knowledge distillation on 60 million Polish-English text pairs, with BAAI/bge-base-en as the teacher model. The distillation method follows the approach described in Reimers & Gurevych (EMNLP 2020).
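The distillation objective can be sketched as follows. This is a minimal illustration of the mean-squared-error objective from Reimers & Gurevych, not the training code used for this model: it assumes the teacher embeds the English side of each pair and the student is trained to match that vector for both the English sentence and its Polish translation. The function names and the toy vectors are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    teacher_en: Sequence[Vector],
    student_en: Sequence[Vector],
    student_pl: Sequence[Vector],
) -> float:
    """Average loss over a batch of English-Polish pairs.

    For each pair, the student is penalized for deviating from the teacher's
    embedding of the English sentence, once for its own English embedding and
    once for its embedding of the Polish translation.
    """
    if not (len(teacher_en) == len(student_en) == len(student_pl)):
        raise ValueError("all three batches must have the same length")
    total = 0.0
    for t, s_en, s_pl in zip(teacher_en, student_en, student_pl):
        total += mse(t, s_en) + mse(t, s_pl)
    return total / len(teacher_en)
```

A student that reproduces the teacher exactly on both languages gets a loss of zero, which is what pulls Polish sentences into the teacher's embedding space.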

The model requires specific prefixes: queries must be prefixed with "query: " and passages with "passage: ". It can also serve as a base for further fine-tuning.
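A small sketch of how the prefixes might be applied before encoding. The helper names are hypothetical; the `encode` function assumes the `sentence-transformers` package is installed and that the checkpoint `sdadas/mmlw-e5-base` can be downloaded, and it is only one possible way to run the model locally.

```python
from typing import List

QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def prefix_queries(texts: List[str]) -> List[str]:
    """Prepend the query prefix to each search query."""
    return [QUERY_PREFIX + t for t in texts]


def prefix_passages(texts: List[str]) -> List[str]:
    """Prepend the passage prefix to each document to be indexed."""
    return [PASSAGE_PREFIX + t for t in texts]


def encode(texts: List[str]):
    """Embed already-prefixed texts (assumes sentence-transformers is installed).

    The import is local so the prefix helpers work without the dependency.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sdadas/mmlw-e5-base")
    # normalize_embeddings=True returns unit-length vectors, so a dot
    # product between two rows equals their cosine similarity.
    return model.encode(texts, normalize_embeddings=True)
```

For example, `encode(prefix_queries(["Jak działa wyszukiwanie semantyczne?"]))` would embed a Polish query ("How does semantic search work?"), while the documents to be searched would go through `prefix_passages` first.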

Benchmark Results

The model achieves an average score of 59.71 on the Polish MTEB benchmark.

The model was trained with A100 GPU cluster support from the TASK center at Gdansk University of Technology.

FAQ

What is the output dimension of MMLW E5 Base?

It outputs 768-dimensional vectors.

What prefixes are required when encoding queries and passages?

Queries must be prefixed with "query: " and passages with "passage: ".

What is the average score on the Polish MTEB benchmark?

It achieves an average score of 59.71 on the Polish MTEB.

What license applies to this model?

This page lists the model under the apache-2.0 license. The teacher model, BAAI/bge-base-en, is released under the MIT license.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with an API key, following the required query/passage prefixes.
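Since the model is not yet live, the following is only a sketch of what a request might look like, assuming the endpoint follows the usual OpenAI embeddings shape (a POST to `/embeddings` with `model` and `input` fields, and a response whose `data` items carry `index` and `embedding`). The base URL is a placeholder, the model identifier is assumed to match the repository name, and both helper functions are hypothetical.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://api.gigarouter.example/v1"  # placeholder, not a real endpoint
MODEL_ID = "sdadas/mmlw-e5-base"  # assumed identifier; may differ at launch


def build_embeddings_request(
    texts: List[str], api_key: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request.

    Texts are expected to already carry the "query: " or "passage: " prefix.
    """
    payload = {"model": MODEL_ID, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings_response(body: bytes) -> List[List[float]]:
    """Extract the vectors from an OpenAI-style response, in input order."""
    doc = json.loads(body)
    items = sorted(doc["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Once the endpoint exists, the request could be sent with `urllib.request.urlopen(req)` and the response body passed to `parse_embeddings_response`.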

not yet live

We're benchmarking and onboarding MMLW E5 Base as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
