
LLM2Vec Mistral 7B Instruct v2 Supervised

McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised

published Apr 2024 · updated Apr 2024

LLM2Vec Mistral 7B Instruct v2 Supervised is a text embedding model that converts the Mistral-7B-Instruct-v2 decoder-only LLM into a powerful text encoder using bidirectional attention, masked next token prediction, and supervised contrastive learning.

Status: coming soon
API providers: 0
Downloads / mo: 291
License: MIT

specs

Task: Text Embedding
Architecture: Bidirectional decoder-only transformer with LoRA adapters (Mistral-7B base)
Parameters: 7B
License: MIT
Pooling: Mean
Max Sequence Length: 512 tokens

about this model

McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised is an embedding model that turns the Mistral-7B-Instruct-v2 decoder-only language model into a text encoder using the LLM2Vec recipe. The recipe has three steps: enabling bidirectional attention, training with masked next token prediction (MNTP), and unsupervised contrastive learning. The supervised variant instead fine-tunes the MNTP base with supervised contrastive learning on publicly available E5 data. According to the model's authors, this yields state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB) among models trained solely on public data (as of May 24, 2024), and the approach outperforms traditional encoder-only models by a large margin on word-level tasks.

The model ships as LoRA (PEFT) weights, uses mean pooling by default, and accepts a maximum sequence length of 512 tokens. The underlying approach is parameter-efficient and requires neither synthetic GPT-4-generated data nor expensive adaptation. The LLM2Vec paper has been accepted to COLM 2024, and the repository is released under the MIT license. Once hosted on gigarouter, the model will be available as a managed, OpenAI-compatible API, so no installation or local inference code is needed.
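To make "mean pooling" concrete, here is a minimal pure-Python sketch of masked mean pooling: the per-token vectors of one input are averaged into a single embedding, and padding positions are skipped. The function name `mean_pool` and the toy vectors are illustrative assumptions; this is not the LLM2Vec library's implementation, which may treat instruction tokens differently.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (real tokens, not padding)."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have the same length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    # Component-wise mean over the kept tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example (hypothetical values): two real tokens and one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

Because the third position is masked out, only the first two vectors contribute to `pooled`.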


FAQ

What is this model best for?

It is best for generating dense text embeddings for semantic search, retrieval, clustering, and classification. Its authors report state-of-the-art MTEB results among models trained only on publicly available data (as of May 24, 2024).
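As an illustration of the semantic search use case, the sketch below ranks documents by cosine similarity to a query embedding. The helper names and the two-dimensional toy vectors are assumptions for demonstration only; real embeddings from this model would be much higher-dimensional.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def rank_documents(query_vec: List[float], doc_vecs: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical): document 1 points in the same direction as the query.
ranking = rank_documents([1.0, 0.0], [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])
```

Cosine similarity ignores vector length, so document 1 ranks first even though its norm differs from the query's.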

How does its size and speed compare to other embedding models?

At 7 billion parameters, it is much larger than typical BERT-style encoders but smaller than the largest LLMs. Inference is slower than with lightweight encoders, in exchange for richer contextual representations.

What license does this model use?

The model and its associated repository are released under the MIT license, allowing free use, modification, and distribution.

What input format does the model expect?

The model expects text input with an instruction prefix for queries (e.g., "Given a web search query, retrieve relevant passages that answer the query:") and plain text for documents. The maximum input length is 512 tokens.
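A minimal sketch of preparing inputs in that shape: queries get the instruction prepended, documents are passed through as plain text. Joining the instruction and query into one string with a space is an assumption made here for illustration, and the helper names are hypothetical. Checking the 512-token limit requires the model's tokenizer and is not shown.

```python
# Instruction text taken from the example above.
INSTRUCTION = "Given a web search query, retrieve relevant passages that answer the query:"


def format_query(query: str, instruction: str = INSTRUCTION) -> str:
    """Prepend the task instruction to a query (assumed single-string format)."""
    return f"{instruction.strip()} {query.strip()}"


def format_document(text: str) -> str:
    """Documents are embedded as plain text, without an instruction."""
    return text.strip()


query_input = format_query("  how much protein should an adult eat per day?  ")
document_input = format_document("Protein needs vary with body weight and activity level.\n")
```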

How can I call this model via the gigarouter API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with your API key. Pass the input text and specify the model name "LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised". The API returns a vector embedding.
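The sketch below shows what such a call could look like, assuming the endpoint follows the usual OpenAI embeddings shape (a POST to `/embeddings` with `model` and `input`, and a response whose `data` entries carry `index` and `embedding`). The base URL is a placeholder rather than a documented gigarouter address, and the model is not yet live, so the example only builds the request and does not send it.

```python
import json
import urllib.request
from typing import List

# Hypothetical placeholder; replace with the real base URL once it is published.
BASE_URL = "https://api.gigarouter.example/v1"
MODEL = "LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised"


def build_embedding_request(texts: List[str], api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": MODEL, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_json: dict) -> List[List[float]]:
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    rows = sorted(response_json["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


request = build_embedding_request(["hello world"], api_key="YOUR_API_KEY")
# Once the model is live, the request could be sent with:
#   with urllib.request.urlopen(request) as resp:
#       vectors = parse_embeddings(json.load(resp))
```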

not yet live

We're benchmarking and onboarding LLM2Vec Mistral 7B Instruct v2 Supervised as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
