Stella EN 400M V5

NovaSearch/stella_en_400M_v5

published Jul 2024 · updated Jul 2025

Stella EN 400M V5 is an English text embedding model that uses Matryoshka Representation Learning to support multiple output dimensions from 512 to 8192.

est. price

~$0.008 / 1M tokens · estimated, set at launch

API providers

downloads / mo

69.4K

license

MIT

specs

Task	Text embedding (sentence-to-passage and sentence-to-sentence)
Architecture	Transformer (based on Alibaba-NLP/gte-large-en-v1.5)
Parameters	400M
License	MIT

about this model

Stella EN 400M V5 is an embedding model that converts text into dense vectors for retrieval and semantic similarity tasks. Built on Alibaba-NLP/gte-large-en-v1.5, it uses Matryoshka Representation Learning (MRL) to support seven output dimensions: 512, 768, 1024, 2048, 4096, 6144, and 8192. The 1024-dimensional output scores within 0.001 of the 8192-dimensional output on MTEB, making it a practical default for most applications.

The model is optimized for sentence‑to‑passage (s2p) and sentence‑to‑sentence (s2s) tasks via dedicated prompt templates. For retrieval use cases, the s2p prompt is recommended; for semantic textual similarity, the s2s prompt. Documents do not require any prompt.
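The query-side prompting described above can be sketched as a small helper. The prompt names s2p_query and s2s_query come from this page; the template strings are an assumption based on the upstream model card and should be checked against it, and the function names are hypothetical.

```python
# Assumed template text (verify against the upstream model card before use).
PROMPTS = {
    # sentence-to-passage: retrieval queries
    "s2p_query": (
        "Instruct: Given a web search query, retrieve relevant passages "
        "that answer the query.\nQuery: "
    ),
    # sentence-to-sentence: semantic textual similarity
    "s2s_query": "Instruct: Retrieve semantically similar text.\nQuery: ",
}


def prepare_query(text: str, prompt_name: str) -> str:
    """Prefix a query with the prompt for the chosen task."""
    if prompt_name not in PROMPTS:
        raise ValueError(f"unknown prompt name: {prompt_name!r}")
    return PROMPTS[prompt_name] + text


def prepare_document(text: str) -> str:
    """Documents are embedded as-is, with no prompt."""
    return text
```

Only the query side changes between tasks; documents pass through untouched, so a single document index can serve both retrieval and similarity queries.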

Once hosted as a managed API on gigarouter, the model will be callable through a single HTTP endpoint, with no local installation or dependency management. It targets RAG pipelines and FAQ retrieval, and the recommended sequence length is 512 tokens, the length it was trained on. For further details, refer to the Jasper and Stella technical report and the RAG-Retrieval code repository.

best for

·Dense retrieval for RAG pipelines
·Semantic textual similarity
·FAQ matching and question answering
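
Each use case above comes down to comparing embedding vectors. A minimal sketch of ranking documents by cosine similarity follows; the toy 3-dimensional vectors stand in for real model output, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (document index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs, top_k=2)
```

In a real pipeline the vectors would come from the model, and a vector index would usually replace the linear scan once the corpus grows.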

FAQ

What output dimensions does this model support?

It supports 512, 768, 1024, 2048, 4096, 6144, and 8192 dimensions via Matryoshka Representation Learning; 1024 is the recommended default.

What prompts should I use for retrieval vs. similarity tasks?

For sentence-to-passage (retrieval) use the s2p_query prompt; for sentence-to-sentence (similarity) use the s2s_query prompt. Documents need no prompt.

What is the recommended sequence length?

512 tokens is recommended; the model was trained on sequences of length 512.

How can I call this model via the gigarouter API?

Once the model is live, send a request to the gigarouter OpenAI-compatible endpoint with your API key, specifying the model name and your input text.
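
A rough sketch of such a request, assuming the endpoint follows the usual OpenAI embeddings shape (POST to /embeddings with "model" and "input" fields). The base URL is a hypothetical placeholder, since this page does not give the real one, and the model identifier is assumed to match the name shown at the top of the page. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the real gigarouter base URL.
BASE_URL = "https://api.example.invalid/v1"
# Assumed to match the identifier shown on this page.
MODEL = "NovaSearch/stella_en_400M_v5"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embedding_request(["What is MRL?"], api_key="YOUR_API_KEY")
# Sending is left out on purpose. With a live endpoint it might look like:
# with urllib.request.urlopen(request) as resp:
#     vectors = [item["embedding"] for item in json.load(resp)["data"]]
```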

What license is this model released under?

MIT license.

not yet live

We're benchmarking and onboarding Stella EN 400M V5 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related embedding models

nomic-embed-text-v1.5