
Nomic Embed V2 MoE

nomic-ai/nomic-embed-text-v2-moe

published Feb 2025 · updated Apr 2025

Nomic Embed V2 MoE is a multilingual mixture-of-experts text embedding model optimized for retrieval, supporting flexible embedding dimensions via Matryoshka representation learning.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 854.7K
license: apache-2.0

specs

Task: Text Embedding & Retrieval
Architecture: Mixture of Experts (MoE) with 8 experts, top-2 routing
Parameters: 475M total, 305M active
Embedding Dimension: 768 (flexible down to 256 via Matryoshka)
Max Sequence Length: 512 tokens

about this model

nomic-embed-text-v2-moe is a multilingual Mixture of Experts (MoE) text embedding model that produces high-quality vector representations for retrieval and search tasks. It is the first general-purpose MoE text embedding model, designed to deliver state-of-the-art multilingual performance while maintaining inference efficiency through sparse expert routing.

Architecture and Efficiency

The model uses an MoE architecture with 8 experts and top-2 routing: of its 475M total parameters, only 305M are active for any given input. This design reduces inference latency and memory usage compared to dense models of equivalent capability, addressing deployment challenges in retrieval-augmented generation (RAG) applications. The model supports flexible embedding dimensions from 768 down to 256 through Matryoshka Representation Learning, enabling up to a 3x reduction in storage cost with minimal performance degradation. Maximum sequence length is 512 tokens.
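To make the routing idea concrete, here is a minimal sketch of top-2 selection over 8 experts in plain Python. It is an illustration of the general technique, not the model's actual implementation: the toy experts, the example logits, and the choice to normalize weights over only the two selected experts are all assumptions.

```python
import math

NUM_EXPERTS = 8  # from the spec above
TOP_K = 2        # top-2 routing, from the spec above


def top2_route(router_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    (Assumption: some MoE implementations normalize over all experts instead.)
    """
    if len(router_logits) != NUM_EXPERTS:
        raise ValueError(f"expected {NUM_EXPERTS} logits, got {len(router_logits)}")
    # Indices of the TOP_K largest logits, highest first.
    order = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i], reverse=True)
    chosen = order[:TOP_K]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_layer(token_vector, router_logits, experts):
    """Mix the outputs of the two selected experts; the other six never run."""
    output = [0.0] * len(token_vector)
    for index, weight in top2_route(router_logits):
        expert_out = experts[index](token_vector)
        output = [acc + weight * val for acc, val in zip(output, expert_out)]
    return output


# Hypothetical toy experts: expert i scales its input by (i + 1).
# Real experts would be feed-forward blocks with learned weights.
toy_experts = [lambda v, s=i + 1: [s * x for x in v] for i in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]
routes = top2_route(logits)
mixed = moe_layer([1.0, -2.0], logits, toy_experts)
```

Only the two selected experts are evaluated per token, which is why the active parameter count (305M) is lower than the total (475M).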

Multilingual Performance

Trained on over 1.6 billion high-quality pairs across approximately 100 languages, the model posts the highest MIRACL score among the ~300M-parameter models listed below and a BEIR score second only to Arctic Embed v2 Base, while remaining competitive with models twice its size. Key benchmark results:

Model                   Params (M)   Emb Dim   BEIR    MIRACL
Nomic Embed v2          305          768       52.86   65.80
mE5 Base                278          768       48.88   62.30
mGTE Base               305          768       51.10   63.40
Arctic Embed v2 Base    305          768       55.40   59.90

Compared with larger models in the 560-568M parameter range, nomic-embed-text-v2-moe (305M active) remains competitive, with a BEIR score of 52.86 and a MIRACL score of 65.80.

Matryoshka Embeddings and Flexibility

The model supports Matryoshka Representation Learning, enabling flexible embedding dimensions from 768 down to 256. This allows developers to trade off between embedding size and accuracy: truncating to 256 dimensions can reduce storage costs by 3x with minimal performance degradation, as shown in the figure below.

Figure: bar chart comparing BEIR performance at 768 dimensions versus 256 dimensions, showing minimal accuracy loss at the reduced size.
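The truncation step and the 3x storage figure can be sketched in a few lines of Python. This is a hedged illustration: it assumes truncation means keeping the leading components and rescaling to unit length, and the helper names and the float32 storage assumption are hypothetical. Whether an extra normalization step is expected before slicing is not stated here, so check the model's reference usage.

```python
import math

FULL_DIM = 768  # default output size, from the spec
MIN_DIM = 256   # smallest supported size, from the spec


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if not MIN_DIM <= dim <= len(vector):
        raise ValueError(f"dim must be between {MIN_DIM} and {len(vector)}")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all zeros, nothing to rescale
    return [x / norm for x in head]


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for `num_vectors` embeddings (float32 assumed, no index overhead)."""
    return num_vectors * dim * bytes_per_value


# Stand-in for a model output: any 768-dimensional vector works for the demo.
full = [math.sin(i + 1) for i in range(FULL_DIM)]
small = truncate_embedding(full, MIN_DIM)

# 768 / 256 = 3, which is where the 3x storage reduction comes from.
ratio = storage_bytes(1_000_000, FULL_DIM) / storage_bytes(1_000_000, MIN_DIM)
```

Rescaling after truncation keeps cosine similarity and dot product interchangeable for the shortened vectors.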

Multilingual Capabilities

The model supports retrieval across approximately 100 languages. Its BEIR and MIRACL scores are shown below relative to other open-weight embedding models.

Figure: bar chart comparing nomic-embed-text-v2-moe BEIR and MIRACL scores against mE5 Base, mGTE Base, Arctic Embed v2 Base, BGE M3, Arctic Embed v2 Large, and mE5 Large.

Training and Data

The training data, over 1.6 billion high-quality pairs across multiple languages, was curated with consistency filtering. Training combined weakly supervised contrastive pretraining with supervised finetuning. The training pipeline and all model weights are fully open-source. For further details, see the blog post and technical report.


FAQ

What is the difference between total and active parameters?

Total parameters are 475M, but only 305M are active per forward pass due to MoE sparsity (top-2 expert routing), enabling faster inference.

What embedding dimensions are supported?

The model outputs 768-dimensional embeddings by default, but can be truncated to any dimension from 768 down to 256 with minimal performance loss (Matryoshka representation learning).

What prefixes are required for input text?

Use "search_query: " before queries and "search_document: " before documents. This task instruction is mandatory.
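A small helper can keep the prefixes consistent across a codebase. The sketch below uses the two prefix strings given above; the function names are hypothetical, and skipping text that already carries a prefix is a design choice made here, not documented model behavior.

```python
QUERY_PREFIX = "search_query: "
DOCUMENT_PREFIX = "search_document: "


def as_query(text):
    """Prefix a user query, unless it already carries the query prefix."""
    return text if text.startswith(QUERY_PREFIX) else QUERY_PREFIX + text


def as_document(text):
    """Prefix a document chunk, unless it already carries the document prefix."""
    return text if text.startswith(DOCUMENT_PREFIX) else DOCUMENT_PREFIX + text


query = as_query("how do matryoshka embeddings work?")
documents = [as_document(d) for d in [
    "Matryoshka embeddings can be truncated to fewer dimensions.",
    "search_document: Sparse routing activates two of eight experts.",
]]
```

Queries and documents use different prefixes, so apply them at the point where the text's role is known, before batching.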

How do I call this model via the gigarouter API?

Use the OpenAI-compatible endpoint with your gigarouter API key, sending the input with the required prefix.
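Because the hosted endpoint is not live yet, the following is only a sketch of what an OpenAI-style embeddings request could look like. The base URL is a placeholder, not a real gigarouter address, and using the identifier shown on this page as the API model name is an assumption. The code builds the request without sending it.

```python
import json
import urllib.request

# Placeholder values: the endpoint is not live, so this URL is hypothetical.
BASE_URL = "https://api.gigarouter.example/v1"
# Assumption: the API model name matches the identifier shown on this page.
MODEL_ID = "nomic-ai/nomic-embed-text-v2-moe"


def build_embeddings_request(texts, api_key, prefix="search_document: "):
    """Build (but do not send) an OpenAI-style POST request to /embeddings."""
    body = {"model": MODEL_ID, "input": [prefix + t for t in texts]}
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(
    ["Matryoshka embeddings can be truncated."],
    api_key="YOUR_API_KEY",
)
payload = json.loads(request.data)
```

Once the service is available, the request could be sent with `urllib.request.urlopen(request)`. In the OpenAI embeddings response format, each vector appears under `data[i].embedding`; whether gigarouter returns exactly that shape should be confirmed at launch.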

How many languages does the model support?

It supports approximately 100 languages, trained on over 1.6 billion multilingual pairs.

not yet live

We're benchmarking and onboarding Nomic Embed V2 MoE as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
