SFR Embedding Mistral
Salesforce/SFR-Embedding-Mistral
published Jan 2024 · updated Feb 2025
SFR Embedding Mistral is a text embedding model fine-tuned from Mistral 7B for high-performance retrieval, clustering, and semantic similarity tasks.
specs
| Task | Text Embedding |
| Architecture | Mistral-7B-v0.1 with E5-mistral-7b-instruct fine-tune + LoRA adapters (rank 8) |
| Parameters | 7B (21M trainable via LoRA) |
| License | Research purposes only |
| Max Sequence Length | 4096 tokens |
about this model
SFR-Embedding-Mistral is a text embedding model that converts text into high-dimensional vector representations for retrieval, clustering, classification, and semantic similarity tasks. It is built on E5-mistral-7b-instruct, which is itself a fine-tune of Mistral-7B-v0.1, and targets information retrieval and related applications.
Key Strengths
As reported at release, the model reached the top average score on the MTEB benchmark: 67.6 across 56 datasets. On the BEIR retrieval benchmark it scores 59.0, up 2.1 points from the 56.9 of its predecessor E5-mistral-7b-instruct. It also shows a +1.4 absolute improvement on clustering tasks over the same baseline.
Training and Architecture
SFR-Embedding-Mistral is fine-tuned from E5-mistral-7b-instruct, which is itself based on Mistral-7B-v0.1. Training uses a batch size of 2048, a learning rate of 1e-5, a 100-step linear warmup, and 7 hard negatives per query-document pair. LoRA adapters (rank r=8) are applied to all linear layers, yielding 21 million trainable parameters. Training completes in approximately 15 hours on 8 A100 GPUs. Task-homogeneous batching further improves retrieval dev scores by 0.8 points and STS dev scores by 0.49 points.
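The 21 million figure can be sanity-checked with a little arithmetic. The sketch below assumes the publicly documented Mistral-7B-v0.1 shapes (hidden size 4096, MLP size 14336, 32 layers, 1024-dimensional key/value projections) and assumes that "all linear layers" means the seven projection matrices inside each transformer block, with the embedding and output layers excluded. Neither assumption is spelled out on this page.

```python
# Rough LoRA parameter count for rank-8 adapters on Mistral-7B-v0.1.
# All shapes below are assumptions taken from the public Mistral-7B-v0.1 config.
HIDDEN = 4096
KV_DIM = 1024      # 8 key/value heads x 128 dims (grouped-query attention)
MLP = 14336
NUM_LAYERS = 32
RANK = 8

# (in_features, out_features) for each linear layer in one transformer block
LINEAR_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA adapter adds two matrices: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in LINEAR_SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} total")
# prints: 655,360 per layer, 20,971,520 total
```

Under these assumptions the count comes to roughly 21 million, which matches the figure quoted above.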
Usage Pattern
Each query must be prefixed with a one-sentence instruction describing the task (e.g., "Given a web search query, retrieve relevant passages that answer the query"). Document passages do not require an instruction prefix. The model supports a maximum sequence length of 4096 tokens and uses last-token pooling to generate embeddings.
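A minimal sketch of the input formatting described above. The `Instruct: ...\nQuery: ...` template follows the example given in this page's FAQ; the helper names `format_query` and `format_passage` are hypothetical and exist only for illustration.

```python
def format_query(task_description: str, query: str) -> str:
    """Prefix a query with its one-sentence task instruction."""
    return f"Instruct: {task_description}\nQuery: {query}"


def format_passage(passage: str) -> str:
    """Passages are embedded as-is, with no instruction prefix."""
    return passage


task = "Given a web search query, retrieve relevant passages that answer the query"
model_inputs = [
    format_query(task, "how do vaccines train the immune system"),
    format_passage("Vaccines expose the body to a harmless form of an antigen."),
]
```

Keeping the two helpers separate makes it harder to apply the instruction prefix to documents by mistake.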
Benchmark Performance
| Benchmark | Metric | Score |
|---|---|---|
| MTEB (56 datasets) | Average | 67.6 |
| BEIR Retrieval | Retrieval score | 59.0 |
| Clustering (MTEB) | Absolute improvement vs. E5-mistral-7b-instruct | +1.4 |
Training Details
The model was fine-tuned with a batch size of 2048, learning rate 1e-5, and 7 hard negatives per query-document pair. LoRA adapters (rank r=8) were added to all linear layers, resulting in 21 million trainable parameters. Training took approximately 15 hours on 8 A100 GPUs. Task-homogeneous batching was used to improve performance, yielding a 0.8-point gain on retrieval dev scores and a 0.49-point gain on STS dev scores.
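This page does not describe how task-homogeneous batching is implemented. One plausible reading is that every batch is drawn from a single task, so that in-batch negatives come from the same task as the positives. The sketch below illustrates that reading only; the `task` field, the function name, and the shuffling strategy are all assumptions.

```python
import random
from collections import defaultdict


def task_homogeneous_batches(examples, batch_size, seed=0):
    """Group examples by task, then cut each group into batches.

    Every returned batch contains examples from exactly one task.
    The final batch of a task may be smaller than batch_size.
    """
    rng = random.Random(seed)
    by_task = defaultdict(list)
    for example in examples:
        by_task[example["task"]].append(example)

    batches = []
    for task_examples in by_task.values():
        rng.shuffle(task_examples)  # shuffle within a task
        for start in range(0, len(task_examples), batch_size):
            batches.append(task_examples[start:start + batch_size])

    rng.shuffle(batches)  # interleave tasks across training steps
    return batches


toy_examples = (
    [{"task": "retrieval", "id": i} for i in range(5)]
    + [{"task": "sts", "id": i} for i in range(3)]
)
toy_batches = task_homogeneous_batches(toy_examples, batch_size=2)
```

With the reported batch size of 2048, a real training run would need far more examples per task than this toy data contains.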
Usage Notes
Each query must be formatted with a one-sentence instruction describing the task, while document passages require no instruction prefix. The model supports a maximum sequence length of 4096 tokens and uses last-token pooling to generate normalized embeddings.
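Last-token pooling followed by L2 normalization can be sketched in plain Python on toy data. This is an illustration of the technique, not the model's inference code: a real pipeline would run the same steps on batched tensors, and the function names here are hypothetical.

```python
import math


def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token.

    hidden_states: one vector (list of floats) per token position.
    attention_mask: 1 for real tokens, 0 for padding, same length.
    Works for both left- and right-padded sequences.
    """
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last_index]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy sequence: two real tokens followed by one padding position.
hidden = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
embedding = l2_normalize(last_token_pool(hidden, mask))
print(embedding)  # prints [0.6, 0.8]
```

The padding position is ignored, so its values never reach the embedding.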
Ethical Considerations
This model is released for research purposes. It has not been specifically evaluated for all downstream applications. Users should assess accuracy, safety, and fairness before deployment, particularly in high-risk scenarios. Refer to Salesforce's Acceptable Use Policy for further guidance.
best for
- Information retrieval from large document collections
- Semantic textual similarity (STS) and reranking
- Clustering and topic classification of text
- Web search query to passage matching
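For the retrieval and matching use cases listed above, passages are typically ranked by similarity to the query embedding. The sketch below assumes the embeddings are already L2-normalized, in which case the dot product equals cosine similarity. The vectors are toy values, not real model output, and `rank_passages` is a hypothetical helper.

```python
def rank_passages(query_embedding, passage_embeddings):
    """Return (passage_index, score) pairs, best match first.

    Assumes all vectors are unit length, so dot product == cosine similarity.
    """
    scores = [
        sum(q * p for q, p in zip(query_embedding, passage))
        for passage in passage_embeddings
    ]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


query_vec = [0.6, 0.8]
passage_vecs = [
    [1.0, 0.0],  # passage 0
    [0.6, 0.8],  # passage 1: same direction as the query
    [0.0, 1.0],  # passage 2
]
ranking = rank_passages(query_vec, passage_vecs)
best_index = ranking[0][0]  # passage 1 ranks first
```

For large collections, a vector index would replace this exhaustive scan, but the scoring rule stays the same.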
FAQ
SFR-Embedding-Mistral excels at text retrieval tasks such as web search, document ranking, and semantic similarity, and it topped the MTEB benchmark at release.
Compared with E5-mistral-7b-instruct, it scores 59.0 on the BEIR retrieval benchmark versus 56.9, and gains +1.4 on clustering tasks.
Queries require a one-sentence instruction prefix (e.g., 'Instruct: Given a web search query, retrieve relevant passages\nQuery: ...'). Documents do not need an instruction. The model uses last-token pooling and normalization.
Use the OpenAI-compatible endpoint with an API key. Send a POST request with the model name and input text; the API returns normalized embedding vectors.
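A minimal sketch of such a request, using only the Python standard library. The `/v1/embeddings` path, the JSON body, and the response shape follow the usual OpenAI embeddings convention, which an "OpenAI-compatible" endpoint is assumed to mirror. The base URL, the API key, and the exact model name expected by the service are placeholders, not confirmed values.

```python
import json
import urllib.request

# Assumption: the hosted API identifies the model by its repository name.
MODEL_NAME = "Salesforce/SFR-Embedding-Mistral"


def build_embedding_request(base_url, api_key, texts):
    """Build (but do not send) a POST request for an OpenAI-style embeddings API."""
    payload = {"model": MODEL_NAME, "input": texts}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, in input order."""
    items = sorted(json.loads(response_body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Placeholder host and key; substitute the real values for your account.
request = build_embedding_request(
    "https://api.example.com",
    "YOUR_API_KEY",
    ["Instruct: Given a web search query, retrieve relevant passages\nQuery: what is LoRA"],
)
# To send it: urllib.request.urlopen(request).read(), then parse_embeddings(...)
```

Remember that queries still need their instruction prefix when sent through an API, while documents are sent as plain text.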
The model is released for research purposes only. The base component (Mistral-7B) is licensed under Apache 2.0, but the derived SFR-Embedding-Mistral carries no open-source license of its own and is intended for academic use.
We're benchmarking and onboarding SFR Embedding Mistral as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.