GTE Multilingual Base

Alibaba-NLP/gte-multilingual-base

published Jul 2024 · updated Jul 2025

GTE Multilingual Base is an encoder-only transformer embedding model that generates dense and sparse vectors for multilingual text retrieval and representation tasks.

est. price	~$0.008 / 1M tokens (estimated, set at launch)

downloads / mo	1.2M

license	apache-2.0

specs

Task	Text Embeddings
Architecture	Encoder-only transformer
Parameters	305M
Max Input Tokens	8192
Embedding Dimension	768 (elastic from 128 to 768)
Languages	70+

about this model

gte-multilingual-base is an encoder-only multilingual text embedding model that generates dense and sparse vector representations for input text, supporting up to 8,192 tokens and over 70 languages.

Architecture and Capabilities

Built on an encoder-only transformer architecture, the model is reported to run inference about 10x faster than decoder-only LLM-based embedding models of similar size. It supports elastic dense embeddings: the output dimension can be set anywhere from 128 to 768, which according to its authors does not sacrifice downstream effectiveness. It can also produce sparse token vectors for hybrid retrieval. With 305 million parameters and a native 8,192-token context, it is designed for long-document and multilingual retrieval tasks.
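As a rough illustration of the elastic dense embeddings described above, the sketch below assumes that a smaller embedding is obtained by keeping the leading components of the 768-dimensional vector and re-normalizing it to unit length. That truncation convention, the helper names, and the stand-in vectors are assumptions made for illustration; real vectors would come from the model.

```python
import math

def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 128 <= dim <= 768:
        raise ValueError("dim must be between 128 and 768")
    if dim > len(vector):
        raise ValueError("dim exceeds the vector length")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero vector cannot be normalized
    return [x / norm for x in head]

def cosine(a, b):
    """Dot product of two unit-length vectors of equal dimension."""
    return sum(x * y for x, y in zip(a, b))

# Stand-in 768-dimensional vectors (hypothetical data, not model output).
doc = [math.sin(i * 0.1) for i in range(768)]
query = [math.sin(i * 0.1 + 0.05) for i in range(768)]

doc_256 = truncate_and_normalize(doc, 256)
query_256 = truncate_and_normalize(query, 256)
similarity = cosine(doc_256, query_256)
```

Queries and documents need to be truncated to the same dimension before they are compared, so the dimension is best fixed once per index.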

Benchmark Performance

According to the accompanying paper, the text encoder outperforms the same-sized XLM-R, and the embedding model matches the performance of the larger BGE-M3 model while achieving better results on long-context retrieval benchmarks. Among models of similar size, it reports state-of-the-art results on the multilingual retrieval datasets MIRACL and MLDR, on cross-lingual retrieval with MKQA, and on English retrieval with BEIR and LoCo. On the MTEB leaderboard it shows strong performance across English, Chinese, French, and Polish tasks.

Bar chart showing retrieval results on MIRACL, MLDR, MKQA, BEIR, and LoCo benchmarks for gte-multilingual-base compared to other models.

Table of MTEB scores for English, Chinese, French, and Polish tasks comparing gte-multilingual-base to baseline models.

Research Background

The model is introduced in the paper "mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval" (EMNLP 2024 Industry Track). It is part of the GTE family and records about 1.2 million downloads per month, reflecting its adoption in production and research environments.

best for

·Multilingual semantic search across 70+ languages
·Long-document retrieval with up to 8192 tokens
·Hybrid dense-sparse embedding for efficient storage and retrieval
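To make the hybrid dense-sparse use case concrete, here is a minimal scoring sketch. It assumes each text has a unit-length dense vector plus a sparse mapping from tokens to weights, and that the two scores are combined as a weighted sum. The function names, the default weights, and the toy values are hypothetical and are not taken from the model's own code.

```python
def dense_score(query_vec, doc_vec):
    """Dot product of two dense vectors (cosine similarity if both are unit length)."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))

def sparse_score(query_weights, doc_weights):
    """Sum of weight products over tokens that appear in both texts."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )

def hybrid_score(query, doc, dense_weight=1.0, sparse_weight=0.3):
    """Weighted sum of the dense and sparse scores (weights are illustrative)."""
    return (
        dense_weight * dense_score(query["dense"], doc["dense"])
        + sparse_weight * sparse_score(query["sparse"], doc["sparse"])
    )

# Toy inputs; real vectors and token weights would come from the model.
query = {"dense": [0.6, 0.8], "sparse": {"paris": 1.0, "capital": 0.5}}
doc = {"dense": [0.8, 0.6], "sparse": {"paris": 0.8, "france": 0.4}}

score = hybrid_score(query, doc)  # dense 0.96 + 0.3 * sparse 0.8, about 1.2
```

The sparse term rewards exact token overlap, which can help with rare names and identifiers that dense vectors tend to blur. The relative weight is a tuning choice that depends on the corpus.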

FAQ

What is the maximum input length for this model?

The model supports up to 8192 tokens.
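For documents longer than that limit, one common workaround is to split the token sequence into windows and embed each window separately. The sketch below assumes the text has already been tokenized into a list of token ids. The `split_into_windows` helper and the overlap value are hypothetical, and in practice some headroom may be needed for special tokens.

```python
def split_into_windows(token_ids, max_tokens=8192, overlap=0):
    """Split a token id list into windows of at most `max_tokens` ids.

    Consecutive windows share `overlap` ids so context is not cut abruptly.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    step = max_tokens - overlap
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # the last window already reaches the end of the document
        start += step
    return windows

# A stand-in document of 20,000 token ids (hypothetical data).
tokens = list(range(20000))
windows = split_into_windows(tokens, max_tokens=8192, overlap=192)
window_lengths = [len(w) for w in windows]  # [8192, 8192, 4000]
```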

What languages does the model support?

It supports over 70 languages, including both high-resource and low-resource languages.

How does this model compare to other multilingual embedding models?

According to the accompanying paper, it outperforms the same-sized XLM-R and matches the performance of the larger BGE-M3 model, while running about 10x faster than decoder-only LLM-based embedding models of similar size.

What embedding dimensions are available?

The default dimension is 768, but you can use any dimension from 128 to 768 via elastic dense embedding.
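The practical effect of a smaller dimension is easiest to see as storage arithmetic. The sketch below assumes vectors are stored as 32-bit floats (4 bytes per value) and ignores index overhead. The helper name and the corpus size of one million vectors are illustrative assumptions.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for `num_vectors` embeddings of `dim` values each."""
    return num_vectors * dim * bytes_per_value

full = index_size_bytes(1_000_000, 768)   # 3,072,000,000 bytes
small = index_size_bytes(1_000_000, 128)  # 512,000,000 bytes
ratio = full / small                      # 6.0
```

Under these assumptions, moving from 768 to 128 dimensions cuts raw vector storage by a factor of six.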

How can I call this model via the gigarouter API?

Once the hosted endpoint is live, send requests to the gigarouter OpenAI-compatible endpoint with your API key, passing the input texts and the model name.
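A request to an OpenAI-compatible embeddings endpoint might be built as in the sketch below. The base URL is a placeholder, and the model identifier and API key are assumptions, because this page does not state the actual gigarouter values. Only the general request shape (a POST to `/embeddings` with `model` and `input` fields and a bearer token) follows the OpenAI embeddings convention.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"      # hypothetical placeholder, not the real endpoint
MODEL = "Alibaba-NLP/gte-multilingual-base"  # assumed model identifier

def build_embeddings_request(texts, api_key, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_embeddings_request(
    ["Hello world", "Bonjour le monde"], api_key="YOUR_API_KEY"
)

# Sending is left commented out because the URL above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     vectors = [item["embedding"] for item in body["data"]]
```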

not yet live

We're benchmarking and onboarding GTE Multilingual Base as a hosted, OpenAI-compatible API.

related embeddings models

nomic-embed-text-v1.5