
LLM2Vec Mistral 7B Instruct v2

McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse

published Apr 2024 · updated Apr 2024

LLM2Vec Mistral 7B Instruct v2 is an embedding model built by converting a decoder-only LLM into a text encoder using bidirectional attention, masked next token prediction, and unsupervised contrastive learning.

status: coming soon
API providers: 0
downloads / mo: 24.7K
license: MIT

specs

Task: Text Embedding
Architecture: Mistral 7B decoder-only with bidirectional attention
Parameters: 7B
License: MIT
Max Sequence Length: 512 tokens

about this model

LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse is an embedding model that produces dense vector representations of text. It is built with the LLM2Vec recipe, which turns the decoder-only Mistral 7B Instruct v2 model into a text encoder in three steps: enabling bidirectional attention, training with masked next token prediction (MNTP), and applying unsupervised contrastive learning (SimCSE). The result is a general-purpose text encoder trained without labelled data.
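The first of these steps can be illustrated by contrasting a causal attention mask with a fully bidirectional one. The sketch below is a toy illustration in plain Python, not the LLM2Vec implementation; the function names `causal_mask` and `bidirectional_mask` are hypothetical.

```python
def causal_mask(n):
    """Decoder-only default: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Encoder-style mask: every position may attend to every other position."""
    return [[1] * n for _ in range(n)]


if __name__ == "__main__":
    print("causal:")
    for row in causal_mask(4):
        print(row)
    print("bidirectional:")
    for row in bidirectional_mask(4):
        print(row)
```

With a causal mask, a token's representation cannot depend on anything to its right. Lifting that restriction is what lets each token embedding reflect the whole input.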

Key strengths

The model draws on the representations of a 7B-parameter decoder-only LLM and adds bidirectional context, which suits it to semantic similarity, retrieval, and clustering tasks. The unsupervised SimCSE training aligns embeddings without labelled pairs, preserving the model's general language understanding. The approach is parameter-efficient, using LoRA adapters that are merged into the base model.

Benchmark performance

On the Massive Text Embedding Benchmark (MTEB), LLM2Vec applied to Mistral 7B set a new unsupervised state of the art at the time of publication (April 2024), as reported in the LLM2Vec paper (arXiv:2404.05961). The paper also reports that the model outperforms encoder-only architectures by a large margin on word-level tasks. These results indicate that decoder-only LLMs can be repurposed as strong text encoders without expensive adaptation or synthetic data.

Embedding methodology

The model supports instruction-based encoding for queries and plain encoding for documents, with mean pooling and a maximum sequence length of 512 tokens. It is designed for use cases such as information retrieval, text classification, and semantic search.
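As a rough illustration of mean pooling, the sketch below averages per-token vectors while skipping padded positions, then compares two pooled embeddings with cosine similarity. It uses plain Python lists in place of real model outputs; `mean_pool` and `cosine` are hypothetical helper names, and whether instruction tokens are included in the average is not specified here, so the mask only marks padding.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is 0)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vec[k]
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [x / count for x in total]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Three token vectors; the last one is padding and is ignored.
    tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
    mask = [1, 1, 0]
    pooled = mean_pool(tokens, mask)
    print(pooled)  # [2.0, 3.0]
    print(cosine(pooled, [2.0, 3.0]))
```

In practice the token vectors would come from the model's final hidden states, and the pooled vector is what gets stored or compared for retrieval.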

Once hosted on Gigarouter, this model will be available through an OpenAI-compatible API, removing the need for local infrastructure or manual model loading.

FAQ

What is LLM2Vec Mistral 7B Instruct v2?

It is an unsupervised text embedding model based on Mistral 7B Instruct v2, trained with the LLM2Vec recipe: bidirectional attention, masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE).

How does it compare to other embedding models?

According to the LLM2Vec paper (arXiv:2404.05961), it reached state-of-the-art unsupervised performance on MTEB at the time of publication and outperformed encoder-only models on word-level tasks.

What input format does it require?

For queries, pass an instruction followed by the query string; for documents, pass only the document text. Inputs are limited to 512 tokens.
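A minimal sketch of that input convention follows. The helper names, the example instruction text, and the single-space separator are all assumptions made for illustration; the exact way the hosted API expects the instruction and query to be joined is not documented here. Truncation is shown on an already tokenized list, since token counts depend on the model's tokenizer.

```python
MAX_TOKENS = 512  # maximum sequence length from the specs


def format_query(instruction, query, separator=" "):
    """Join an instruction and a query (the separator is an assumption)."""
    return f"{instruction.strip()}{separator}{query.strip()}"


def format_document(text):
    """Documents are passed as plain text, with no instruction."""
    return text.strip()


def truncate(token_ids, max_tokens=MAX_TOKENS):
    """Keep at most max_tokens tokens from an already tokenized input."""
    return token_ids[:max_tokens]


if __name__ == "__main__":
    q = format_query("Retrieve passages that answer the question:", " What is MNTP? ")
    d = format_document("  MNTP stands for masked next token prediction.  ")
    print(q)
    print(d)
    print(len(truncate(list(range(600)))))  # 512
```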

How can I use this model via API?

Once the model is live, send embedding requests to Gigarouter's OpenAI-compatible endpoint with a valid API key.
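The sketch below shows what such a request might look like, assuming the endpoint follows the OpenAI embeddings convention (a POST to `/embeddings` with `model` and `input` fields and a bearer token). The base URL is a placeholder, and the model identifier is assumed to match the repository name; neither is confirmed, and the model is not yet live. The code only builds the request and does not send it.

```python
import json
import urllib.request

# Placeholder base URL: the real endpoint is not documented here.
BASE_URL = "https://gigarouter.example/v1"
# Assumed model identifier, taken from the repository name.
MODEL_ID = "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_embedding_request(["What is MNTP?"], api_key="YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Once a real endpoint exists, the request could be sent with `urllib.request.urlopen(req)`. In an OpenAI-style response, the vectors would be expected under `data[i].embedding`.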

not yet live

We're benchmarking and onboarding LLM2Vec Mistral 7B Instruct v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
