
GTE Base EN v1.5

Alibaba-NLP/gte-base-en-v1.5

published Apr 2024 · updated Nov 2024

GTE Base EN v1.5 is an English text embedding model that supports inputs of up to 8192 tokens and averages 64.11 across the 56 MTEB tasks, a state-of-the-art score in its size category.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 459.5K
license: apache-2.0
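
As a rough budgeting aid, the estimated price above can be turned into a per-workload figure. This is a minimal sketch that assumes simple linear per-token billing at the estimated rate; the function name and the example workload are hypothetical, and the final price is only set at launch.

```python
# Estimated price taken from this page; the final price is set at launch.
EST_PRICE_PER_MILLION_TOKENS_USD = 0.008


def estimate_embedding_cost(
    total_tokens: int,
    price_per_million: float = EST_PRICE_PER_MILLION_TOKENS_USD,
) -> float:
    """Return the estimated cost in USD for embedding `total_tokens` tokens.

    Assumes linear per-token billing with no minimums or volume discounts.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical workload: 100,000 documents averaging 500 tokens each.
cost = estimate_embedding_cost(100_000 * 500)
print(f"${cost:.2f}")
```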

specs

Task: Text Embeddings
Architecture: Transformer++ (BERT + RoPE + GLU)
Parameters: 137M
License: Apache 2.0

about this model

Alibaba-NLP/gte-base-en-v1.5 is an English text embedding model that encodes input texts into dense vector representations, supporting a maximum sequence length of 8192 tokens.
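
Dense vectors like these are typically compared with cosine similarity. The sketch below illustrates that comparison on placeholder vectors of the model's 768-dimensional output size; the vectors are made up for illustration and are not real model outputs, and cosine similarity is a common choice rather than something this page prescribes.

```python
import math

EMBEDDING_DIM = 768  # output dimension stated in the FAQ on this page


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for embeddings (not real model outputs).
query = [1.0] * EMBEDDING_DIM
same_direction = [2.0] * EMBEDDING_DIM
opposite = [-1.0] * EMBEDDING_DIM

sim_same = cosine_similarity(query, same_direction)
sim_opposite = cosine_similarity(query, opposite)
```

Scores near 1 indicate vectors pointing the same way, and scores near -1 indicate opposite directions; magnitude does not affect the result.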

The model is built on a transformer++ encoder backbone combining BERT with Rotary Position Embedding (RoPE) and Gated Linear Units (GLU). It employs attention dropout of 0 for compatibility with xformers and flash attention, and uses unpadding to eliminate computation on padding tokens, improving inference efficiency.
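
The idea behind unpadding can be sketched in plain Python: drop padding positions, pack the remaining tokens into one flat sequence, and keep cumulative offsets so each original sequence can be recovered. This is an illustrative sketch, not the model's actual implementation; the function name and token IDs are hypothetical.

```python
from itertools import accumulate


def unpad(input_ids, attention_mask):
    """Pack the non-padding tokens of a padded batch into one flat list.

    Returns (packed_tokens, cu_seqlens), where sequence i occupies
    packed_tokens[cu_seqlens[i]:cu_seqlens[i + 1]].
    """
    packed = []
    lengths = []
    for ids, mask in zip(input_ids, attention_mask):
        kept = [tok for tok, m in zip(ids, mask) if m == 1]
        packed.extend(kept)
        lengths.append(len(kept))
    cu_seqlens = [0] + list(accumulate(lengths))
    return packed, cu_seqlens


# Illustrative batch: two sequences padded to length 5 (0 is the pad ID here).
batch_ids = [[101, 7592, 102, 0, 0], [101, 2023, 2003, 1037, 102]]
batch_mask = [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]

packed, cu_seqlens = unpad(batch_ids, batch_mask)
# The packed batch holds 8 tokens instead of the padded 10.
```

The saving grows with the spread in sequence lengths, which matters for a model that accepts inputs anywhere from a few tokens up to 8192.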

On the MTEB benchmark (56 tasks), gte-base-en-v1.5 achieves an average score of 64.11. That is ahead of the smaller bge-base-en-v1.5 (109M parameters, 63.55) and close to the much larger mxbai-embed-large-v1 (335M, 64.68). The model also performs well on long-context retrieval, averaging 87.44 across five LoCo tasks, slightly above the 86.71 of the larger gte-large-en-v1.5 (434M).

Model                      MTEB Average (56)   LoCo Average (5)
gte-base-en-v1.5 (137M)    64.11               87.44
bge-base-en-v1.5 (109M)    63.55               -
gte-large-en-v1.5 (434M)   65.39               86.71

The model is licensed under Apache 2.0. gigarouter plans to host it as a managed API that requires no local setup; the hosted endpoint is not yet live.

FAQ

What is the maximum input length for GTE Base EN v1.5?

The model supports up to 8192 tokens.

What is the output embedding dimension?

The output dimension is 768.

How does this model compare to bge-base-en-v1.5 in MTEB score?

GTE Base EN v1.5 averages 64.11 on MTEB (56 tasks), ahead of bge-base-en-v1.5, which averages 63.55.

What license is this model released under?

Apache 2.0.

How can I call this model via the gigarouter API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with your API key: send a POST request containing the model name and the input text.
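
A request in the OpenAI embeddings format might be built as in the sketch below. The base URL is a hypothetical placeholder, since this page does not give the real one, and the model name string the API will expect is also an assumption. The request is only constructed, not sent, because the endpoint is not yet live.

```python
import json
import urllib.request

# Hypothetical placeholder: the real base URL is not given on this page.
BASE_URL = "https://gigarouter.example/v1"
# Assumption: the API accepts the model's repository ID as the model name.
MODEL = "Alibaba-NLP/gte-base-en-v1.5"


def build_embeddings_request(texts, api_key, base_url=BASE_URL, model=MODEL):
    """Build a POST request in the OpenAI embeddings format (not sent here)."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(
    ["What is a text embedding?"], api_key="YOUR_API_KEY"
)

# Sending is left commented out because the endpoint is not live yet.
# In the OpenAI format, vectors are returned under data[i]["embedding"]:
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
#     vectors = [item["embedding"] for item in payload["data"]]
```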

not yet live

We're benchmarking and onboarding GTE Base EN v1.5 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
