CDE Small V1
jxm/cde-small-v1
published Sep 2024 · updated May 2025
CDE Small V1 is a contextual document embedding model that integrates context tokens for improved retrieval. As of October 2024, it achieved state-of-the-art MTEB results among models under 400M parameters.
specs
| Task | Text Embedding for Retrieval |
| Architecture | BERT-based two-stage contextual architecture |
| Parameters | Under 400 million |
| License | Not specified |
about this model
cde-small-v1 is a text embedding model that integrates context tokens into the embedding process, using a two-stage architecture to condition document and query embeddings on a representative subset of the corpus. The first stage embeds a fixed number of representative documents (512 by default) to produce dataset embeddings. The second stage then embeds individual queries or documents while conditioning on these dataset embeddings, enabling contextualized representations that outperform standard biencoders.
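The data flow of the two stages can be sketched with a toy stand-in, shown here only to make the order of operations concrete. Everything in this block is an assumption for illustration: `toy_encode` is a hashed bag-of-words encoder rather than the real model, `DIM` is an arbitrary width, and the conditioning step (subtracting the corpus mean) is a deliberately simple placeholder for how the real model conditions on dataset embeddings as context tokens.

```python
import hashlib
import math
import random

DIM = 16  # toy embedding width; not the real model's dimension


def toy_encode(text):
    """Hypothetical stand-in encoder (hashed bag-of-words), NOT the real model."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    return vec


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def first_stage(corpus, sample_size=512, seed=0):
    """Stage 1: embed a fixed-size sample of the corpus to get dataset embeddings."""
    rng = random.Random(seed)
    sample = corpus if len(corpus) <= sample_size else rng.sample(corpus, sample_size)
    return [toy_encode(doc) for doc in sample]


def second_stage(text, dataset_embeddings):
    """Stage 2: embed one query or document, conditioned on the dataset embeddings.

    The conditioning here is a toy choice (subtract the corpus mean). The real
    model instead consumes the dataset embeddings as context tokens.
    """
    mean = [sum(col) / len(dataset_embeddings) for col in zip(*dataset_embeddings)]
    raw = toy_encode(text)
    return normalize([r - m for r, m in zip(raw, mean)])


corpus = ["solar panels convert light", "wind turbines spin", "batteries store energy"]
dataset_embeddings = first_stage(corpus, sample_size=2)  # computed once per corpus
doc_vec = second_stage("search_document: solar panels convert light", dataset_embeddings)
query_vec = second_stage("search_query: what stores energy", dataset_embeddings)
```

The point of the split is that the dataset embeddings are computed once per corpus and then reused for every query and document embedded against that corpus.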
Key Strengths
As of October 1, 2024, cde-small-v1 is the best small model (under 400 million parameters) on the MTEB leaderboard for text embedding, with an average score of 65.00. Without corpus information, the score drops to 63.8, which remains competitive. The model reaches these results without hard negative mining, score distillation, dataset-specific instructions, intra-GPU example-sharing, or extremely large batch sizes, as detailed in the accompanying paper (arXiv:2410.02525).
Usage Notes
The model uses task-specific prefixes: "search_query: " for queries and "search_document: " for documents. Embeddings are normalized, so dot product and cosine similarity give the same ranking; either can be used to compare them.
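A minimal sketch of the two conventions above: adding the prefix before embedding, and ranking documents by dot product. The helper names (`add_prefix`, `rank`) and the two-dimensional vectors are made up for illustration; real embeddings would come from the model.

```python
import math

QUERY_PREFIX = "search_query: "
DOCUMENT_PREFIX = "search_document: "


def add_prefix(text, is_query):
    """Prepend the task prefix expected for queries or documents."""
    return (QUERY_PREFIX if is_query else DOCUMENT_PREFIX) + text


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; equals the dot product when both vectors are unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending dot product with the query."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Made-up unit-length vectors standing in for model output.
query_vec = [0.6, 0.8]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
order = rank(query_vec, doc_vecs)  # indices of documents, best match first
```

Because the vectors are unit length, `cosine` and `dot` agree up to floating-point error, so the cheaper dot product is enough for ranking.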
best for
- Semantic search and retrieval
- Document clustering
- Question answering retrieval
FAQ
It is optimized for dense retrieval tasks where contextual document embeddings improve performance, especially out-of-domain.
Yes, the model is deprecated; its recommended successor is CDE Small V2, which has a higher MTEB score (65.58).
First, embed a subset of the corpus to create dataset embeddings; second, embed queries and documents conditioned on those context embeddings.
Use the OpenAI-compatible endpoint with an API key, providing prompts with the appropriate prefixes for queries and documents.
Text up to 512 tokens; use prefixes "search_query: " for queries and "search_document: " for documents.
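As a hedged sketch of the request described in the FAQ, the snippet below builds (but does not send) an embeddings call in the common OpenAI-compatible shape: a POST to an `/embeddings` path with a JSON body containing `model` and `input`, and a bearer token in the `Authorization` header. The base URL is a placeholder, the model identifier is assumed to match the repository name, and `build_embedding_request` is a hypothetical helper; check the provider's documentation for the actual values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # hypothetical placeholder, not a real endpoint
MODEL_ID = "jxm/cde-small-v1"  # assumed; the hosted identifier may differ


def build_embedding_request(texts, api_key, is_query):
    """Build a POST request for an OpenAI-compatible embeddings endpoint."""
    prefix = "search_query: " if is_query else "search_document: "
    payload = {"model": MODEL_ID, "input": [prefix + t for t in texts]}
    return urllib.request.Request(
        BASE_URL + "/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embedding_request(["what stores energy?"], "YOUR_API_KEY", is_query=True)
# Sending is intentionally left out; urllib.request.urlopen(request) would perform the call.
```

Inputs longer than the 512-token limit would need to be truncated or chunked before being placed in `input`; that step is not shown here.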