Granite Embedding English R2

ibm-granite/granite-embedding-english-r2

published Jul 2025 · updated Jan 2026

Granite Embedding English R2 is an embedding model that generates high-quality text embeddings for retrieval, search, and similarity applications.

est. price	~$0.008 / 1M tokens (estimated, set at launch)

API providers

downloads / mo	45K
license	apache-2.0

specs

Task	Embedding
Architecture	ModernBERT bi-encoder
Parameters	149M
License	Apache 2.0

about this model

Granite-embedding-english-r2 is a 149M-parameter dense bi-encoder embedding model. It maps each text input to a fixed-length 768-dimensional vector, so two texts can be compared by the cosine similarity of their vectors in retrieval and search applications. It supports a context length of up to 8,192 tokens and is built on the ModernBERT architecture, which incorporates alternating attention lengths, rotary position embeddings, GeGLU activations, and Flash Attention 2.0 for efficiency.
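As a minimal sketch of the comparison step described above, the snippet below computes cosine similarity in plain Python. The `embed` helper is an assumption: it supposes the `sentence-transformers` package is installed and that the model loads through it under the Hugging Face ID shown at the top of this page. The helper and constant names are illustrative, not part of any official API.

```python
import math
from typing import List, Sequence

MODEL_ID = "ibm-granite/granite-embedding-english-r2"
EMBEDDING_DIM = 768  # fixed output size stated in the model description


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embed(texts: List[str]) -> List[List[float]]:
    """Encode texts with the model (assumes sentence-transformers is installed).

    The import is local so cosine_similarity works without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    vectors = model.encode(texts)  # array with one 768-dim row per input text
    return vectors.tolist()
```

With the model available, `cosine_similarity(*embed(["reset my password", "how do I change my password?"]))` would score the pair; higher values indicate closer meaning.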

The model is trained exclusively on open-source relevance-pair datasets with permissive, enterprise-friendly licenses, plus IBM-collected and generated synthetic data. It does not use the MS-MARCO dataset due to its non-commercial license. Training data was filtered to remove hate, abuse, and profanity.

Key Strengths

·Strong performance across diverse retrieval domains: text (BEIR, MTEB-v2), code (CoIR), long-document search (MLDR, LongEmbed), conversational multi-turn (MTRAG), and table retrieval (NQTables, OTT-QA, AIT-QA, MultiHierTT, OpenWikiTables).
·Measurable speed advantages of 19–44% over leading competitors while maintaining superior accuracy.
·Training incorporates code in 9 languages (Python, Go, Java, JavaScript, PHP, Ruby, SQL, C, C++).

Benchmark Results

The following table compares the r2 model against its predecessor and other open-source models on key benchmarks:

Model	Parameters	Embedding Size	BEIR Retrieval (15)	MTEB-v2 (41)	CoIR (10)	MLDR (En)	MTRAG (4)	Encoding Speed (docs/sec)
granite-embedding-125m-english	125M	768	52.3	62.1	50.3	35.0	49.4	149
granite-embedding-english-r2	149M	768	53.1	62.8	55.3	40.7	56.7	144
e5-base-v2	109M	768	—	—	50.3	32.5	37.0	115
bge-base-en-v1.5	109M	768	—	—	46.6	33.5	38.8	116
gte-modernbert-base	149M	768	—	—	71.5	46.2	36.8	142

Improvements over the predecessor, granite-embedding-125m-english (r1), include: BEIR +0.8, MTEB-v2 +0.7, CoIR +5.0, MLDR +5.7, and MTRAG +7.3.
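These deltas can be re-derived from the first two rows of the table. A minimal sketch, with the scores copied by hand from the table above (the dictionary names are illustrative):

```python
# Scores copied from the benchmark table: predecessor (r1) and r2 rows.
r1 = {"BEIR": 52.3, "MTEB-v2": 62.1, "CoIR": 50.3, "MLDR": 35.0, "MTRAG": 49.4}
r2 = {"BEIR": 53.1, "MTEB-v2": 62.8, "CoIR": 55.3, "MLDR": 40.7, "MTRAG": 56.7}

# Round to one decimal place to match the precision of the table.
deltas = {name: round(r2[name] - r1[name], 1) for name in r1}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
```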

The model is released under the Apache 2.0 license. For further details, see the paper and repository.

best for

·Enterprise document retrieval and search
·Code retrieval across multiple programming languages
·Long-document and conversational multi-turn retrieval

FAQ

What is Granite Embedding English R2 best used for?

It is best for dense retrieval tasks such as enterprise search, code retrieval, long-document search, table retrieval, and multi-turn conversational retrieval.

How do I call this model via the gigarouter API?

Once hosted access is live, use the OpenAI-compatible endpoint on gigarouter with your API key, specifying the model name granite-embedding-english-r2 and sending your text inputs.
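A rough sketch of what such a call could look like, assuming the endpoint follows the standard OpenAI embeddings request and response shape. The base URL below is a placeholder, since this page does not give the real one, and the function names are illustrative. The code only builds the request and parses a response; it does not send anything.

```python
import json
import urllib.request

# Hypothetical base URL: the real gigarouter endpoint is not stated on this page.
BASE_URL = "https://api.gigarouter.example/v1"
MODEL = "granite-embedding-english-r2"


def build_embeddings_request(texts, api_key, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style POST /embeddings request."""
    payload = {"model": MODEL, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    items = json.loads(response_body)["data"]
    return [item["embedding"] for item in sorted(items, key=lambda i: i["index"])]
```

When the model is available, the request could be sent with `urllib.request.urlopen(request)` and the response body passed to `parse_embeddings`.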

What is the license for this model?

It is released under the Apache 2.0 license, which permits both research and commercial use, subject to the license's notice and attribution requirements.

What is the context length and embedding size?

It supports inputs of up to 8,192 tokens and produces 768-dimensional embeddings.

Is this model a bi-encoder or cross-encoder?

It is a bi-encoder that generates separate embeddings for queries and passages, compared via cosine similarity.
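The practical consequence of a bi-encoder is that passages can be embedded once ahead of time and each query only needs a single encoding at search time. The sketch below illustrates that pattern with a small in-memory index. `PassageIndex` and `toy_embed` are hypothetical names, and `toy_embed` is a stand-in (letter counts) so the example runs without the model; in practice the embedding function would call the real model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


class PassageIndex:
    """Embeds passages once, then scores each query against the stored vectors."""

    def __init__(self, embed: Callable[[str], Sequence[float]]):
        self._embed = embed
        self._passages: List[str] = []
        self._vectors: List[Vector] = []

    def add(self, passage: str) -> None:
        # Passage side: encoded at indexing time, independently of any query.
        self._passages.append(passage)
        self._vectors.append(normalize(self._embed(passage)))

    def search(self, query: str, k: int = 3) -> List[Tuple[float, str]]:
        # Query side: one encoding, then a dot product per stored passage.
        q = normalize(self._embed(query))
        scored = [
            (sum(a * b for a, b in zip(q, vec)), passage)
            for vec, passage in zip(self._vectors, self._passages)
        ]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def toy_embed(text: str) -> Vector:
    """Stand-in embedder (letter counts) so the sketch runs without the model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts
```

A cross-encoder, by contrast, would have to score every query and passage pair jointly, so nothing on the passage side could be precomputed.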

not yet live

We're benchmarking and onboarding Granite Embedding English R2 as a hosted, OpenAI-compatible API.
