CDE Small V2

jxm/cde-small-v2

published Jan 2025 · updated May 2025

CDE Small V2 is a text embedding model that uses a two-stage process to condition each embedding on corpus-level context, producing contextualized document embeddings for retrieval.

est. price

~$0.008 / 1M tokens · estimated, set at launch

downloads / mo

25.9K

specs

Task	Text Embedding
Architecture	Two-stage contextual biencoder
Effective Parameters	140M (281M total, includes first-stage weights)
Max Sequence Length	512 tokens
MTEB Average Score	65.58 (best under 400M params)

about this model

jxm/cde-small-v2 is a text embedding model that uses contextual document embeddings (CDE) to generate dense vector representations for search and retrieval tasks. Unlike conventional biencoders, it conditions each embedding on a sampled subset of the target corpus (the "minicorpus"), improving retrieval accuracy by incorporating dataset-level context into the representation.

The model operates in two stages. First, a "first-stage" model encodes a fixed set of 512 representative documents from the corpus to produce dataset embeddings. Second, a "second-stage" model encodes individual queries and documents while attending to these dataset embeddings, yielding context-aware representations suitable for similarity search. The effective parameter count is 140M; the 281M reported by Hugging Face also counts the first-stage weights.
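The two-stage data flow can be illustrated with a toy analogy. This is only a sketch, not the model itself: smoothed IDF weights stand in for the learned dataset embeddings, and the function names `first_stage` and `second_stage` are hypothetical. It shows the one property that matters here, namely that the same text gets a different vector depending on the corpus sample it is conditioned on.

```python
import math
from collections import Counter

def first_stage(minicorpus):
    """Summarize a corpus sample. Here: smoothed IDF weights per token
    (a stand-in for the learned dataset embeddings)."""
    n = len(minicorpus)
    doc_freq = Counter(tok for doc in minicorpus for tok in set(doc.lower().split()))
    idf = {tok: math.log((n + 1) / (df + 1)) + 1.0 for tok, df in doc_freq.items()}
    default = math.log(n + 1) + 1.0  # weight for tokens unseen in the sample
    return idf, default

def second_stage(text, context):
    """Embed one text conditioned on the corpus summary from first_stage."""
    idf, default = context
    counts = Counter(text.lower().split())
    weights = {tok: c * idf.get(tok, default) for tok, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    return {tok: w / norm for tok, w in weights.items()}

# Two different corpus samples produce two different contexts.
legal = first_stage(["the court ruled", "the court adjourned", "court fees apply"])
sports = first_stage(["the match ended", "the team won", "a tennis match"])

doc = "court match"
vec_legal = second_stage(doc, legal)    # "court" is common here, so "match" dominates
vec_sports = second_stage(doc, sports)  # "match" is common here, so "court" dominates
```

In the real model both stages are neural encoders and the context is a set of dense vectors, but the shape of the pipeline is the same: compute the corpus summary once, then reuse it for every query and document.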

As of January 2025, cde-small-v2 scores 65.58 on the MTEB leaderboard, the highest among embedding models under 400M parameters. When corpus information is unavailable, random strings can stand in for the corpus sample, and the score drops to 63.8. Task-specific prefixes are required: "search_query: " for queries and "search_document: " for documents.
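A minimal sketch of applying the required prefixes before sending text for embedding. The helper name `with_prefix` and the `kind` argument are illustrative assumptions; only the two prefix strings come from the model description.

```python
QUERY_PREFIX = "search_query: "
DOCUMENT_PREFIX = "search_document: "

def with_prefix(text: str, kind: str) -> str:
    """Prepend the task prefix; `kind` is "query" or "document"."""
    prefixes = {"query": QUERY_PREFIX, "document": DOCUMENT_PREFIX}
    if kind not in prefixes:
        raise ValueError(f"kind must be 'query' or 'document', got {kind!r}")
    prefix = prefixes[kind]
    # Avoid double-prefixing text that was already prepared.
    return text if text.startswith(prefix) else prefix + text

query_text = with_prefix("what is a contextual embedding?", "query")
doc_text = with_prefix("CDE conditions embeddings on a corpus sample.", "document")
```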

This model is being onboarded to gigarouter as a managed, OpenAI-compatible API. Once it is live, no local installation is required; users call the API endpoint with their input text and receive normalized embeddings suitable for dot-product comparison.
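Because the returned embeddings are normalized, a plain dot product equals cosine similarity, so ranking needs no extra division by vector norms. The sketch below uses small made-up vectors in place of real embeddings; `l2_normalize`, `dot`, and `rank` are illustrative helpers, not part of any API.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending dot-product score."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Made-up 3-dimensional vectors standing in for real embeddings.
query = l2_normalize([1.0, 0.0, 1.0])
docs = [l2_normalize(v) for v in ([0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [0.5, 0.5, 0.5])]
order = rank(query, docs)  # indices of docs, most similar first
```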

best for

· Large-scale document retrieval with contextual understanding
· Question answering over a corpus
· Semantic search where neighboring documents provide context

FAQ

What does CDE stand for?

Contextual Document Embeddings.

How does CDE Small V2 differ from standard embedding models?

It uses a two-stage process: first embedding a sample of the corpus to capture context, then conditioning queries and documents on that context. This improves retrieval, especially out-of-domain.

What are the required prefixes for queries and documents?

Use "search_query: " for queries and "search_document: " for documents.

How many documents are needed for the first-stage corpus sample?

Exactly 512 documents (the transductive corpus size).
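One way to draw a sample of exactly that size is sketched below. The helper name `sample_minicorpus` is hypothetical, and padding a small corpus by sampling with replacement is an assumption made for illustration; check the model documentation before relying on it.

```python
import random

MINICORPUS_SIZE = 512  # first-stage sample size stated above

def sample_minicorpus(corpus, size=MINICORPUS_SIZE, seed=0):
    """Return exactly `size` documents drawn from `corpus` (a list of strings)."""
    if not corpus:
        raise ValueError("corpus must contain at least one document")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    if len(corpus) >= size:
        return rng.sample(corpus, size)  # without replacement
    # Assumption: oversampling with replacement is acceptable for small corpora.
    return rng.choices(corpus, k=size)

large = sample_minicorpus([f"doc {i}" for i in range(10_000)])
small = sample_minicorpus(["only", "three", "documents"])
```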

How can I call CDE Small V2 via the gigarouter API?

Once the model is live, use the OpenAI-compatible endpoint with your API key and the model name "cde-small-v2". Pass the first-stage embeddings as needed.
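Since the hosted endpoint is not live yet, the sketch below only builds a request body in the general shape of an OpenAI-style embeddings call (`model` and `input` fields). The function name `build_embeddings_body` is hypothetical, and the base URL, authentication details, and the mechanism for supplying first-stage embeddings are not specified here, so they are left out rather than guessed.

```python
import json

def build_embeddings_body(texts, kind, model="cde-small-v2"):
    """Build a JSON body for an OpenAI-style embeddings request.

    `kind` is "query" or "document" and selects the required task prefix.
    """
    prefix = {"query": "search_query: ", "document": "search_document: "}[kind]
    return json.dumps({"model": model, "input": [prefix + t for t in texts]})

body = build_embeddings_body(["what is CDE?"], kind="query")
# The body would be sent with an HTTP POST to the provider's embeddings
# endpoint, with the API key in the Authorization header.
```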

not yet live

We're benchmarking and onboarding CDE Small V2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related embedding models

nomic-embed-text-v1.5