CDE Small V2
jxm/cde-small-v2
published Jan 2025 · updated May 2025
CDE Small V2 is a text embedding model that conditions each embedding on context drawn from the target corpus, using a two-stage process to produce contextualized document embeddings for retrieval.
specs
| Task | Text Embedding |
| Architecture | Two-stage contextual biencoder |
| Effective Parameters | 140M (281M total, includes first-stage weights) |
| Max Sequence Length | 512 tokens |
| MTEB Average Score | 65.58 (best under 400M params as of January 2025) |
about this model
jxm/cde-small-v2 is a text embedding model that uses contextual document embeddings (CDE) to generate dense vector representations for search and retrieval tasks. Unlike conventional biencoders, it conditions each embedding on a sampled subset of the target corpus (the "minicorpus"), improving retrieval accuracy by incorporating dataset-level context into the representation.
The model operates in two stages. First, a "first-stage" model encodes a fixed set of 512 representative documents from the corpus to produce dataset embeddings. Second, a "second-stage" model encodes individual queries and documents while attending to these dataset embeddings, enabling context-aware representations suitable for similarity search. The effective parameter count is 140M (the 281M reported by Hugging Face includes both stage weights).
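The first stage needs a fixed-size sample of the corpus. The sketch below shows one way to draw it. The helper name `sample_minicorpus` is hypothetical, and oversampling with replacement when the corpus has fewer than 512 documents is an assumption, not something this page specifies.

```python
import random

MINICORPUS_SIZE = 512  # first-stage input size stated on this page


def sample_minicorpus(corpus, size=MINICORPUS_SIZE, seed=0):
    """Return exactly `size` documents drawn from `corpus`."""
    if not corpus:
        raise ValueError("corpus must contain at least one document")
    rng = random.Random(seed)  # seeded so the sample is reproducible
    if len(corpus) >= size:
        # Enough documents: sample without replacement.
        return rng.sample(corpus, size)
    # Too few documents: oversample with replacement (an assumption).
    return rng.choices(corpus, k=size)


corpus = [f"document {i}" for i in range(2000)]
minicorpus = sample_minicorpus(corpus)
```

The resulting `minicorpus` is what would be passed to the first-stage model to produce the dataset embeddings.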
As of January 2025, cde-small-v2 scores 65.58 on the MTEB leaderboard, the highest among embedding models under 400M parameters. When corpus information is unavailable, random strings can be substituted for the minicorpus as a fallback, and the score drops to 63.8. Task-specific prefixes are required: "search_query: " for queries and "search_document: " for documents.
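A minimal sketch of applying the required prefixes before sending text for embedding. The helper `with_prefix` is hypothetical. Whether the hosted endpoint adds prefixes on its own is not stated here, so this version applies them on the client side and skips any text that already has one.

```python
QUERY_PREFIX = "search_query: "
DOCUMENT_PREFIX = "search_document: "


def with_prefix(texts, kind):
    """Prepend the task prefix for `kind` ("query" or "document")."""
    prefixes = {"query": QUERY_PREFIX, "document": DOCUMENT_PREFIX}
    if kind not in prefixes:
        raise ValueError(f"kind must be 'query' or 'document', got {kind!r}")
    prefix = prefixes[kind]
    # Leave texts that already carry the prefix untouched, so it is never doubled.
    return [t if t.startswith(prefix) else prefix + t for t in texts]


queries = with_prefix(["how do contextual embeddings work?"], "query")
documents = with_prefix(["CDE conditions embeddings on a minicorpus."], "document")
```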
This model is hosted on gigarouter as a managed, OpenAI-compatible API. No local installation is required; users call the API endpoint with their input text and receive normalized embeddings suitable for dot-product comparison.
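Because the returned embeddings are normalized, the dot product of two vectors equals their cosine similarity, so ranking needs no extra division by norms. The sketch below illustrates this with toy 3-dimensional vectors in pure Python. The vectors and the helper names are invented for illustration and are not model output.

```python
import math


def normalize(vec):
    """Scale `vec` to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending dot-product score."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy unit vectors standing in for API output.
query = normalize([1.0, 0.0, 1.0])
docs = [normalize(v) for v in ([0.0, 1.0, 0.0], [1.0, 0.1, 0.9], [1.0, 1.0, 0.0])]
order = rank(query, docs)  # [1, 2, 0]: the second document is closest to the query
```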
best for
- Large-scale document retrieval with contextual understanding
- Question answering over a corpus
- Semantic search where neighboring documents provide context
FAQ
CDE stands for Contextual Document Embeddings.
The model uses a two-stage process: it first embeds a sample of the corpus to capture context, then conditions query and document embeddings on that context. This improves retrieval, especially on out-of-domain data.
Use "search_query: " for queries and "search_document: " for documents.
The first stage takes exactly 512 documents (the transductive corpus size).
Use the OpenAI-compatible endpoint with your API key and the model name "cde-small-v2". Pass the first-stage embeddings as needed.
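As a rough sketch, a request in the standard OpenAI embeddings shape (a POST to `/embeddings` with `model` and `input` fields and a bearer token) could be built as follows. The base URL is a placeholder, and `build_embeddings_request` is a hypothetical helper. How first-stage embeddings are passed is not specified on this page, so the sketch leaves that out rather than guess at a parameter name. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholder base URL (an assumption); substitute the provider's real endpoint.
BASE_URL = "https://api.example.invalid/v1"


def build_embeddings_request(texts, api_key, model="cde-small-v2", base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(
    ["search_query: what is a contextual embedding?"],
    api_key="YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(request). In the OpenAI response shape,
# each vector is found under data[i]["embedding"] in the returned JSON.
```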