GTE Multilingual Base

Alibaba-NLP/gte-multilingual-base

published Jul 2024 · updated Jul 2025

GTE Multilingual Base is an encoder-only transformer embedding model that generates dense and sparse vectors for multilingual text retrieval and representation tasks.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 1.2M
license: apache-2.0

specs

Task: Text Embeddings
Architecture: Encoder-only transformer
Parameters: 305M
Max Input Tokens: 8192
Embedding Dimension: 768 (elastic from 128 to 768)
Languages: 70+

about this model

gte-multilingual-base is an encoder-only multilingual text embedding model that generates dense and sparse vector representations for input text, supporting up to 8,192 tokens and over 70 languages.

Architecture and Capabilities

Built on an encoder-only transformer architecture, the model is reported to deliver a 10x increase in inference speed over decoder-only LLM-based embedding models. It supports elastic dense embeddings: the output dimension can be set anywhere between 128 and 768, which shrinks the index while, according to the authors, maintaining downstream effectiveness. It can also produce sparse token weights for hybrid retrieval. With 305 million parameters and a native 8,192-token context, it is designed for long-document and multilingual retrieval tasks.
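As a rough illustration of how dense and sparse outputs can be combined for hybrid retrieval, here is a minimal sketch. It assumes the dense output is a plain vector and the sparse output is a token-to-weight mapping; the function names and the mixing weight `alpha` are hypothetical and are not taken from the model's own code or paper.

```python
import numpy as np

def dense_score(q, d):
    """Cosine similarity between two dense vectors."""
    q = np.asarray(q, dtype=float)
    d = np.asarray(d, dtype=float)
    return float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))

def sparse_score(q_weights, d_weights):
    """Sum of weight products over tokens that appear in both mappings."""
    return sum(w * d_weights[tok] for tok, w in q_weights.items() if tok in d_weights)

def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha=0.7):
    """Weighted sum of dense and sparse scores.

    alpha is a hypothetical mixing weight chosen for illustration only.
    """
    return alpha * dense_score(q_dense, d_dense) + (1 - alpha) * sparse_score(q_sparse, d_sparse)

# Toy inputs (made up for the example, not real model output).
score = hybrid_score(
    q_dense=[1.0, 0.0],
    d_dense=[1.0, 0.0],
    q_sparse={"berlin": 0.5, "capital": 0.4},
    d_sparse={"berlin": 0.6, "city": 0.2},
)
```

In practice the mixing weight would be tuned on a validation set for the target corpus, since dense and sparse scores are on different scales.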

Benchmark Performance

According to the accompanying paper, the text encoder outperforms the same-sized XLM-R and matches the performance of the larger BGE-M3 models, with better results on long-context retrieval benchmarks. The authors report state-of-the-art results among similarly sized models on the multilingual retrieval datasets MIRACL and MLDR, on cross-lingual retrieval with MKQA, and on English retrieval with BEIR and LoCo. On the MTEB leaderboard, it performs strongly across English, Chinese, French, and Polish tasks.

[Figure: bar chart of retrieval results on MIRACL, MLDR, MKQA, BEIR, and LoCo for gte-multilingual-base compared to other models.]
[Table: MTEB scores for English, Chinese, French, and Polish tasks, comparing gte-multilingual-base to baseline models.]

Research Background

The model is introduced in the paper "mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval" (EMNLP 2024 Industry Track). It is part of the GTE family and is downloaded about 1.2 million times per month, which suggests wide adoption in production and research environments.

FAQ

What is the maximum input length for this model?

The model supports up to 8192 tokens.

What languages does the model support?

It supports over 70 languages, including both high-resource and low-resource languages.

How does this model compare to other multilingual embedding models?

According to the accompanying paper and model card, it outperforms the same-sized XLM-R, matches the performance of the larger BGE-M3 models, and runs about 10x faster than decoder-only LLM-based embedding models.

What embedding dimensions are available?

The default dimension is 768, but you can use any dimension from 128 to 768 via elastic dense embedding.
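A minimal sketch of how a reduced dimension could be applied on the client side. It assumes, as is common for elastic (Matryoshka-style) embeddings, that a smaller embedding is obtained by keeping the leading components of the 768-dimensional vector and re-normalizing; `truncate_embedding` is a hypothetical helper, and the input here is random data rather than real model output.

```python
import numpy as np

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Assumption: the reduced embedding is the leading slice of the full
    vector. The 128..768 bounds come from the model's stated range.
    """
    vec = np.asarray(vec, dtype=float)
    if not 128 <= dim <= 768:
        raise ValueError("dim must be between 128 and 768")
    if vec.shape[-1] < dim:
        raise ValueError("input vector is shorter than the requested dim")
    head = vec[..., :dim]
    norm = np.linalg.norm(head, axis=-1, keepdims=True)
    # Guard against division by zero for an all-zero slice.
    return head / np.clip(norm, 1e-12, None)

# Stand-in for two 768-dimensional embeddings (random, for illustration).
rng = np.random.default_rng(0)
full = rng.normal(size=(2, 768))
small = truncate_embedding(full, 256)
```

Re-normalizing after truncation keeps dot products equivalent to cosine similarity, which most vector indexes expect.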

How can I call this model via the gigarouter API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with your API key, sending the input texts and specifying the model name.
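A sketch of what such a request might look like, assuming the endpoint follows the OpenAI embeddings format (`POST /embeddings` with `model` and `input` fields). The base URL below is a placeholder, and the model identifier is assumed to match the name shown on this page; neither is confirmed by gigarouter. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Placeholder base URL: the real gigarouter endpoint is not stated on this page.
BASE_URL = "https://api.gigarouter.example/v1"
# Assumed to match the listing name; the served model id may differ.
MODEL_ID = "Alibaba-NLP/gte-multilingual-base"

def build_embeddings_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_embeddings_request(["Hello world", "Hallo Welt"], api_key="YOUR_API_KEY")
# When the model is available, the request could be sent with
# urllib.request.urlopen(req) and the JSON response parsed with json.load.
```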

not yet live

We're benchmarking and onboarding GTE Multilingual Base as a hosted, OpenAI-compatible API.