SFR Embedding Mistral

Salesforce/SFR-Embedding-Mistral

published Jan 2024 · updated Feb 2025

SFR Embedding Mistral is a text embedding model fine-tuned from Mistral 7B for high-performance retrieval, clustering, and semantic similarity tasks.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 18.2K
license: cc-by-nc-4.0

specs

Task: Text Embedding
Architecture: Mistral-7B-v0.1 with E5-mistral-7b-instruct fine-tune + LoRA adapters (rank 8)
Parameters: 7B (21M trainable via LoRA)
License: Research purposes only
Max Sequence Length: 4096 tokens

about this model

SFR-Embedding-Mistral is a text embedding model that converts text into high-dimensional vector representations for retrieval, clustering, classification, and semantic similarity tasks. Built on top of E5-mistral-7b-instruct and Mistral-7B-v0.1, it is designed to produce state-of-the-art embeddings for information retrieval and related applications.

Key Strengths

The model achieves a top average score of 67.6 across the 56 datasets of the MTEB benchmark. On the BEIR retrieval benchmark, it scores 59.0, up 2.1 points from the 56.9 of its predecessor, E5-mistral-7b-instruct. It also shows a +1.4 absolute improvement on clustering tasks over the same baseline.

Training and Architecture

SFR-Embedding-Mistral is fine-tuned from E5-mistral-7b-instruct and Mistral-7B-v0.1. Training uses a batch size of 2048, a learning rate of 1e-5, 100-step linear warmup, and 7 hard negatives per query-document pair. LoRA adapters (rank r=8) are applied to all linear layers, yielding 21 million trainable parameters. Training completes in approximately 15 hours on 8 A100 GPUs. Task-homogeneous batching further improves retrieval dev scores by 0.8 points and STS dev scores by 0.49 points.
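As a rough sanity check on the 21 million figure, the sketch below counts LoRA parameters for rank 8 on the linear projections of each transformer block. The layer shapes are an assumption taken from the published Mistral-7B-v0.1 configuration (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128); they are not stated on this page. It also assumes "all linear layers" means the seven projections per block, excluding the embeddings and output head.

```python
# LoRA adds two low-rank matrices per adapted linear layer:
# A (rank x in_features) and B (out_features x rank),
# so each layer contributes rank * (in_features + out_features) parameters.

RANK = 8
NUM_LAYERS = 32    # assumed: Mistral-7B-v0.1 config
HIDDEN = 4096      # assumed: hidden size
KV_DIM = 8 * 128   # assumed: 8 key/value heads x head dimension 128
MLP = 14336        # assumed: intermediate (MLP) size

# (in_features, out_features) of each linear projection in one block.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features, out_features, rank=RANK):
    """Trainable parameters added by one LoRA adapter."""
    return rank * (in_features + out_features)


per_block = sum(lora_params(i, o) for i, o in PROJECTIONS.values())
total = per_block * NUM_LAYERS
print(f"{per_block:,} per block, {total:,} total")
# → 655,360 per block, 20,971,520 total
```

Under these assumptions the count comes to roughly 21M, which matches the figure quoted above.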

Usage Pattern

Each query must be prefixed with a one-sentence instruction describing the task (e.g., "Given a web search query, retrieve relevant passages that answer the query"). Document passages do not require an instruction prefix. The model supports a maximum sequence length of 4096 tokens and uses last-token pooling to generate embeddings.
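Last-token pooling can be illustrated on plain Python lists. This is a minimal sketch, not the model's actual inference code: `last_token_pool` and `l2_normalize` are hypothetical helper names, the hidden states are toy values, and the L2 normalization step follows the "normalized embeddings" wording used elsewhere on this page.

```python
import math


def last_token_pool(hidden_states, attention_mask):
    """Pick the hidden vector of the last non-padding token per sequence.

    hidden_states: list of sequences, each a list of per-token vectors.
    attention_mask: list of 0/1 lists, 1 marking real tokens.
    Works for left- or right-padded inputs.
    """
    pooled = []
    for vectors, mask in zip(hidden_states, attention_mask):
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(vectors[last])
    return pooled


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy batch: 2 sequences, 3 token positions, hidden size 2.
hidden = [
    [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]],  # third position is padding
    [[0.0, 2.0], [1.0, 1.0], [0.0, 5.0]],  # no padding
]
mask = [[1, 1, 0], [1, 1, 1]]

embeddings = [l2_normalize(v) for v in last_token_pool(hidden, mask)]
print(embeddings)
```

With unit-length embeddings, the dot product of a query vector and a document vector equals their cosine similarity, which is the usual ranking score for retrieval.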

Benchmark Performance

Benchmark            Metric                                            Score
MTEB (56 datasets)   Average                                           67.6
BEIR Retrieval       Retrieval score                                   59.0
Clustering (MTEB)    Absolute improvement vs. E5-mistral-7b-instruct   +1.4

Training Details

The model was fine-tuned with a batch size of 2048, learning rate 1e-5, and 7 hard negatives per query-document pair. LoRA adapters (rank r=8) were added to all linear layers, resulting in 21 million trainable parameters. Training took approximately 15 hours on 8 A100 GPUs. Task-homogeneous batching was used to improve performance, yielding a 0.8-point gain on retrieval dev scores and a 0.49-point gain on STS dev scores.
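This page does not describe how the batches are built. One plausible reading of task-homogeneous batching is that every batch is drawn from a single task, so that in-batch negatives come from the same task as the query. The sketch below illustrates that reading only; the function name, the `task` field, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def task_homogeneous_batches(examples, batch_size, seed=0):
    """Group examples into batches that each contain a single task."""
    rng = random.Random(seed)

    # Bucket examples by their task label.
    by_task = defaultdict(list)
    for example in examples:
        by_task[example["task"]].append(example)

    # Shuffle within each task, then cut each bucket into batches.
    batches = []
    for task_examples in by_task.values():
        rng.shuffle(task_examples)
        for start in range(0, len(task_examples), batch_size):
            batches.append(task_examples[start:start + batch_size])

    # Shuffle batch order so tasks alternate across steps,
    # while each individual batch stays single-task.
    rng.shuffle(batches)
    return batches


# Toy data: 5 retrieval examples and 3 STS examples.
data = [{"task": "retrieval", "id": i} for i in range(5)]
data += [{"task": "sts", "id": i} for i in range(3)]

batches = task_homogeneous_batches(data, batch_size=2)
for batch in batches:
    print(batch[0]["task"], [ex["id"] for ex in batch])
```

The last batch of each task may be smaller than `batch_size`; a real training loop would typically drop or pad such remainders.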

Usage Notes

Each query must be formatted with a one-sentence instruction describing the task, while document passages require no instruction prefix. The model supports a maximum sequence length of 4096 tokens and uses last-token pooling to generate normalized embeddings.

Ethical Considerations

This model is released for research purposes. It has not been specifically evaluated for all downstream applications. Users should assess accuracy, safety, and fairness before deployment, particularly in high-risk scenarios. Refer to Salesforce's Acceptable Use Policy for further guidance.

FAQ

What is SFR Embedding Mistral best used for?

It excels at text retrieval tasks like web search, document ranking, and semantic similarity, achieving state-of-the-art results on the MTEB benchmark.

How does it compare to E5-mistral-7b-instruct?

It improves retrieval performance, scoring 59.0 on the BEIR benchmark versus 56.9 for E5-mistral-7b-instruct, and gains 1.4 points on clustering tasks.

What is the input format for this model?

Queries require a one-sentence instruction prefix (e.g., 'Instruct: Given a web search query, retrieve relevant passages\nQuery: ...'). Documents do not need an instruction. The model uses last-token pooling and normalization.
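The input format described above can be captured in a small helper. This is a sketch: `format_query` is a hypothetical name, and the template simply follows the 'Instruct: ...\nQuery: ...' pattern quoted in the answer.

```python
def format_query(task_description, query):
    """Prefix a query with its one-sentence task instruction."""
    return f"Instruct: {task_description}\nQuery: {query}"


task = "Given a web search query, retrieve relevant passages that answer the query"

queries = [format_query(task, "how do vaccines work")]

# Documents are embedded as-is, with no instruction prefix.
documents = ["Vaccines train the immune system to recognize a pathogen."]

print(queries[0])
```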

How can I call SFR Embedding Mistral via the gigarouter API?

Use the OpenAI-compatible endpoint with an API key. Send a POST request with the model name and input text; the API returns normalized embedding vectors.
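Since the model is not yet hosted, the following is only a sketch of what such a call might look like, assuming the endpoint mirrors the OpenAI embeddings API (POST to `/embeddings` with `model` and `input`, Bearer-token auth, vectors returned under `data[i].embedding`). The base URL and the model identifier are placeholders, not confirmed values. The code builds the request and parses a response body, but does not send anything.

```python
import json
import urllib.request

BASE_URL = "https://api.gigarouter.example/v1"  # hypothetical placeholder URL
MODEL_ID = "Salesforce/SFR-Embedding-Mistral"   # assumed model identifier


def build_embedding_request(api_key, texts):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by index."""
    items = sorted(json.loads(response_body)["data"], key=lambda d: d["index"])
    return [item["embedding"] for item in items]


request = build_embedding_request("YOUR_API_KEY", ["hello world"])
print(request.get_method(), request.full_url)
```

Once a real endpoint exists, the request could be sent with `urllib.request.urlopen(request)` and the response body passed to `parse_embeddings`.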

What license applies to this model?

The model is released under cc-by-nc-4.0, for research purposes only. The Mistral-7B base model is Apache 2.0, but the derived SFR-Embedding-Mistral is not licensed for commercial use; it is intended for academic use.

not yet live

We're benchmarking and onboarding SFR Embedding Mistral as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
