LLM2Vec Mistral 7B Instruct v2
McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse
published Apr 2024 · updated Apr 2024
LLM2Vec Mistral 7B Instruct v2 is an embedding model produced by converting a decoder-only LLM into a text encoder using bidirectional attention and unsupervised contrastive learning.
specs
| Task | Text Embedding |
| Architecture | Mistral 7B decoder-only with bidirectional attention |
| Parameters | 7B |
| License | MIT |
| Max Sequence Length | 512 tokens |
about this model
LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse is an embedding model that turns text into dense vector representations. It is built with the LLM2Vec recipe, which adapts the base Mistral 7B Instruct v2 model in three steps: enabling bidirectional attention, training with masked next token prediction (MNTP), and applying unsupervised contrastive learning (SimCSE). The result is a general-purpose text encoder trained without labelled data.
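To make the first step concrete, here is a minimal sketch of the difference between a causal and a bidirectional attention mask. It is an illustration only and is not taken from the LLM2Vec code; the function names `causal_mask` and `bidirectional_mask` are hypothetical, and a mask entry of 1 is assumed to mean "this position may be attended to".

```python
def causal_mask(n):
    """Decoder-only default: token i may attend to tokens 0..i only."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Bidirectional attention: every token may attend to every token."""
    return [[1] * n for _ in range(n)]


if __name__ == "__main__":
    # Print both masks side by side for a 4-token sequence.
    for causal_row, full_row in zip(causal_mask(4), bidirectional_mask(4)):
        print(causal_row, full_row)
```

With the causal mask, the first token never sees anything that follows it. With the bidirectional mask, every token's representation can draw on the full input, which is the property an embedding model relies on.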
Key strengths
The model leverages the representational power of a 7B-parameter decoder-only LLM while adding bidirectional context, making it effective for semantic similarity, retrieval, and clustering tasks. The unsupervised SimCSE training aligns embeddings without labelled pairs, preserving the model's general language understanding. The approach is parameter-efficient, using LoRA adapters that are merged into the base model.
Benchmark performance
On the Massive Text Embedding Benchmark (MTEB), LLM2Vec applied to Mistral 7B set a new state of the art among unsupervised models at the time of the LLM2Vec paper (arXiv:2404.05961). The paper also reports that LLM2Vec-transformed models outperform encoder-only models by a large margin on word-level tasks. These results indicate that decoder-only LLMs can be repurposed as strong text encoders without expensive adaptation or synthetic data.
Embedding methodology
The model supports instruction-based encoding for queries and plain encoding for documents, with mean pooling and a maximum sequence length of 512 tokens. It is designed for use cases such as information retrieval, text classification, and semantic search.
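The pooling and length limit described above can be sketched in plain Python. This is a simplified illustration, not the model's actual implementation: `mean_pool` and `truncate` are hypothetical helper names, token vectors are represented as plain lists of floats, and the mask is assumed to mark which tokens count toward the average (for example, to skip padding).

```python
MAX_TOKENS = 512  # maximum sequence length stated in the specs


def truncate(token_ids, limit=MAX_TOKENS):
    """Keep at most `limit` tokens from the start of the sequence."""
    return token_ids[:limit]


def mean_pool(token_vectors, mask):
    """Average the token vectors whose mask entry is 1."""
    kept = [vec for vec, m in zip(token_vectors, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


if __name__ == "__main__":
    # Three token vectors; the last one is padding and is masked out.
    vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
    print(mean_pool(vectors, [1, 1, 0]))
```

Masking matters because padded positions would otherwise pull the average away from the real content of the input.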
When hosted on Gigarouter, this model is served through an OpenAI-compatible API, eliminating the need for local infrastructure or manual model loading.
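As a rough sketch of what a request to an OpenAI-compatible embeddings endpoint looks like, the snippet below builds (but does not send) an HTTP request using only the Python standard library. The base URL `https://api.example.com/v1` is a placeholder, since this page does not state the real one, and using the Hugging Face repository name as the model identifier is likewise an assumption. The `/embeddings` path, the `model` and `input` fields, and the bearer-token header follow the OpenAI embeddings API convention.

```python
import json
import urllib.request

# Placeholder values: the real base URL and model identifier are not given here.
BASE_URL = "https://api.example.com/v1"
MODEL_ID = "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build a POST request in the OpenAI embeddings format without sending it."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_embedding_request(["What is LLM2Vec?"], api_key="YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` with a real base URL and key. In the OpenAI response format, each vector is returned under `data[i].embedding`; whether the hosted endpoint matches that exactly should be checked against its own documentation.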
best for
- Semantic search
- Document retrieval
- Text clustering
FAQ
It is an unsupervised text embedding model based on Mistral 7B Instruct v2, trained with the LLM2Vec recipe: bidirectional attention, masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE).
It set the unsupervised state of the art on MTEB at the time of publication, and the LLM2Vec paper reports that it outperforms encoder-only models on word-level tasks.
For queries, pass an instruction followed by the query string; for documents, pass only the document text. Inputs are limited to 512 tokens.
Use Gigarouter's OpenAI-compatible endpoint with a valid API key to send embedding requests.
We're benchmarking and onboarding LLM2Vec Mistral 7B Instruct v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.