Arctic Embed M V2.0

Snowflake/snowflake-arctic-embed-m-v2.0

published Nov 2024 · updated Apr 2025

Arctic Embed M V2.0 is a multilingual embedding model for retrieval in English and other languages, with an 8192-token context window and embeddings that can be truncated from 768 to 256 dimensions.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 162.9K
license: apache-2.0

specs

Task: Text embedding and retrieval
Architecture: Transformer (based on GTE-multilingual-base)
Parameters: 305M total (113M non-embedding)
Dimensions: 768 (reducible via MRL to 256)
License: Apache 2.0

about this model

Snowflake/snowflake-arctic-embed-m-v2.0 is a multilingual text embedding model optimized for retrieval, supporting Matryoshka Representation Learning (MRL) and a context window of up to 8192 tokens via RoPE. It is designed for enterprise-grade multilingual search and retrieval at scale, delivering competitive performance on both English and non-English benchmarks without compromising on either.

Key Strengths

  • Multilingual without compromise: Excels across English and non-English retrieval, outperforming leading open-source and proprietary models on MTEB Retrieval, CLEF, and MIRACL benchmarks.
  • Inference efficiency: With 113 million non-embedding parameters (versus about 303 million for me5 base and bge-m3), the model is comparatively cheap to run at inference time.
  • Compression-friendly: Supports Matryoshka Representation Learning (MRL) to reduce vector dimensions from 768 to 256 with minimal quality degradation, and 4-bit quantization for high-quality retrieval at 128 bytes per vector (e.g., using a pq256x4fs fast-scan FAISS index).
  • Long context support: Built on GTE-multilingual-base, supporting up to 8192 tokens via RoPE.
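
As a rough illustration of the compression bullet above, MRL-style truncation amounts to keeping the leading dimensions of an embedding. The sketch below uses a made-up vector rather than real model output. The helper names `truncate_mrl` and `bytes_per_vector` are hypothetical, the float32 baseline is an assumption, and re-normalizing after truncation is a common practice assumed here rather than something this page specifies.

```python
import math

def truncate_mrl(embedding, dims=256):
    """Keep the first `dims` values and L2-normalize the result."""
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]

def bytes_per_vector(dims, bits_per_dim):
    """Storage needed for one vector, in bytes."""
    return dims * bits_per_dim // 8

# Stand-in for a 768-dimensional model output (not a real embedding).
full = [math.sin(i + 1) for i in range(768)]
small = truncate_mrl(full, dims=256)

print(len(small))                  # 256
print(bytes_per_vector(768, 32))   # 3072 (float32, assumed baseline)
print(bytes_per_vector(256, 32))   # 1024
print(bytes_per_vector(256, 4))    # 128
```

The last line reproduces the 128 bytes per vector quoted above: 256 dimensions at 4 bits each.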

Benchmark Performance

Average NDCG@10 across key retrieval benchmarks:

Benchmark | Score
BEIR (15 datasets) | 55.4
MIRACL (4 languages) | 55.2
CLEF (Focused) | 51.7
CLEF (Full) | 53.9

These results place arctic-embed-m-v2.0 ahead of comparable models such as bge-m3, me5-base, and gte-multilingual-base on BEIR and CLEF. On MIRACL it trails bge-m3 (55.2 vs. 56.8) but stays ahead of me5-base and gte-multilingual-base.
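
For readers unfamiliar with the metric, NDCG@10 compares the discounted gain of the top 10 results against the best possible ordering of the judged documents. Below is a minimal sketch of one common formulation (linear gain, log2 rank discount). The exact variant used by the benchmark tooling is not stated on this page, and the function names and relevance labels are made up for illustration.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, judged_relevances, k=10):
    """DCG of the system ranking divided by DCG of the ideal ranking."""
    ideal = sorted(judged_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0.0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# One relevant document, returned at rank 2 instead of rank 1.
score = ndcg_at_k([0, 1, 0], [1, 0, 0])
print(round(score, 4))  # 0.6309
```

Benchmark scores like those above are this per-query value averaged over all queries and reported on a 0-100 scale.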

Compression Efficiency

Using MRL to truncate embeddings from 768 to 256 dimensions reduces vector size by 3x with approximately 2-3% degradation in NDCG@10. Combining MRL with 4-bit quantization enables high-quality retrieval at 128 bytes per vector.

Dimensions | BEIR (15) | MIRACL (4) | CLEF (Focused) | CLEF (Full)
768 | 55.4 | 55.2 | 51.7 | 53.9
256 | 54.4 | 54.0 | 50.6 | 52.3

Relative to 768 dimensions, the drop at 256 dimensions ranges from about 1.8% (BEIR) to about 3.0% (CLEF Full).
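
The relative drop can be recomputed directly from the two rows of the table above. This is plain arithmetic on the published scores; the variable and function names are illustrative only.

```python
# NDCG@10 scores copied from the table above.
scores_768 = {"BEIR (15)": 55.4, "MIRACL (4)": 55.2, "CLEF (Focused)": 51.7, "CLEF (Full)": 53.9}
scores_256 = {"BEIR (15)": 54.4, "MIRACL (4)": 54.0, "CLEF (Focused)": 50.6, "CLEF (Full)": 52.3}

def relative_drop(full, truncated):
    """Percentage of the full-dimension score lost after truncation."""
    return (full - truncated) / full * 100.0

drops = {name: relative_drop(scores_768[name], scores_256[name]) for name in scores_768}

for name, drop in drops.items():
    print(f"{name}: {drop:.1f}%")
# BEIR (15): 1.8%
# MIRACL (4): 2.2%
# CLEF (Focused): 2.1%
# CLEF (Full): 3.0%
```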

Comparison with Alternatives

Model | Non-emb params | BEIR (15) | MIRACL (4) | CLEF (Focused) | CLEF (Full)
snowflake-arctic-m-v2.0 | 113M | 55.4 | 55.2 | 51.7 | 53.9
snowflake-arctic-m | 86M | 54.9 | 24.9 | 34.4 | 29.1
me5 base | 303M | 51.4 | 54.0 | 43.0 | 34.6
bge-m3 (BAAI) | 303M | 48.8 | 56.8 | 40.8 | 41.3
gte (Alibaba) | 113M | 51.1 | 52.3 | 47.7 | 53.1

Released under the Apache 2.0 license. A technical report detailing the training methodology is available at arXiv:2412.04506.

FAQ

What is Arctic Embed M V2.0 best used for?

It is designed for high-quality multilingual text retrieval, excelling on benchmarks like MTEB Retrieval, MIRACL, and CLEF while maintaining strong English performance.

How does its size and speed compare to other embedding models?

With 113M non-embedding parameters, it has well under half the non-embedding parameter count of BGE-M3 or me5 base (both ~303M), which generally means faster, cheaper inference. It also supports a context window of 8192 tokens.

What license is the model released under?

Apache 2.0, which permits commercial use, modification, and redistribution, subject to the license's attribution and notice requirements.

What are the input and output formats for the model?

Input: text strings. Output: 768-dimensional embeddings (can be reduced to 256 via MRL). For queries, prepend "query: "; for documents, no prefix.
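
A small sketch of the input convention described above: queries get the "query: " prefix and documents are passed through unchanged. The helper names `prepare_queries` and `prepare_documents` are hypothetical and not part of any library.

```python
QUERY_PREFIX = "query: "

def prepare_queries(queries):
    """Prepend the query prefix to every search query."""
    return [QUERY_PREFIX + q for q in queries]

def prepare_documents(documents):
    """Documents are embedded as-is, with no prefix."""
    return list(documents)

queries = prepare_queries(["what is matryoshka representation learning?"])
documents = prepare_documents(["MRL trains embeddings whose leading dimensions remain useful."])

print(queries[0])    # query: what is matryoshka representation learning?
print(documents[0])  # MRL trains embeddings whose leading dimensions remain useful.
```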

How can I call this model via the gigarouter API?

The model is not yet live on gigarouter. Once it launches, use the OpenAI-compatible endpoint with your API key and the model ID "snowflake-arctic-embed-m-v2.0", sending a POST request with the input text.
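
As a hedged sketch of what such a request might look like, the snippet below builds (but does not send) a POST request in the usual OpenAI embeddings shape: a JSON body with `model` and `input`, and a bearer token header. The base URL is a placeholder, not a real gigarouter address, and `build_embeddings_request` is a hypothetical helper. The actual endpoint details will depend on the hosted API at launch.

```python
import json
import urllib.request

# Placeholder values: substitute the real base URL and key from your account.
BASE_URL = "https://api.example.invalid/v1"  # hypothetical, not a real endpoint
API_KEY = "YOUR_API_KEY"
MODEL_ID = "snowflake-arctic-embed-m-v2.0"

def build_embeddings_request(texts, base_url=BASE_URL, api_key=API_KEY):
    """Build an OpenAI-style embeddings request without sending it."""
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_embeddings_request(["query: how does MRL truncation work?"])
print(request.get_method(), request.full_url)

# Sending is left out on purpose. Against a live endpoint it would look like:
# with urllib.request.urlopen(request) as response:
#     vectors = [item["embedding"] for item in json.load(response)["data"]]
```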

not yet live

We're benchmarking and onboarding Arctic Embed M V2.0 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
