Question 1

What is this model best for?

Accepted Answer

It is best for generating dense text embeddings for semantic search, retrieval, clustering, and classification. The LLM2Vec method behind it was reported to reach state-of-the-art performance on the MTEB benchmark at the time of release, in both unsupervised and supervised settings, using only publicly available training data.
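
As a rough illustration of the semantic search use case, the sketch below ranks documents by cosine similarity to a query. The toy 3-dimensional vectors are made-up stand-ins for real embeddings, which are much higher-dimensional, and the helper names `cosine_similarity` and `rank_documents` are hypothetical, not part of any model library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for embeddings returned by the model (assumption).
query = [1.0, 0.0, 1.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.9],  # nearly parallel to the query
    [0.5, 0.5, 0.5],  # partially aligned
]
ranking = rank_documents(query, docs)
```

In a real pipeline, `query` and `docs` would be the vectors produced by the model for the query and the candidate passages.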

Question 2

How does its size and speed compare to other embedding models?

Accepted Answer

At 7 billion parameters, it is larger than most BERT-style encoders but significantly smaller than full-size LLMs. Inference is slower than with lightweight encoders, but the resulting embeddings benefit from richer contextual representations.

Question 3

What license does this model use?

Accepted Answer

The model and its associated repository are released under the MIT license, allowing free use, modification, and distribution.

Question 4

What input format does the model expect?

Accepted Answer

The model expects text input with an instruction prefix for queries (e.g., "Given a web search query, retrieve relevant passages that answer the query:") and plain text for documents. The maximum input length is 512 tokens.
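
A minimal sketch of preparing inputs in this format follows. It assumes the instruction and the query are joined with a single space, which the answer above does not specify, and the helper names `format_query` and `format_document` are hypothetical. Truncation to 512 tokens depends on the model's tokenizer and is not shown.

```python
# Instruction text quoted in the answer above.
QUERY_INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query:"
)

def format_query(query: str, instruction: str = QUERY_INSTRUCTION) -> str:
    """Prepend the task instruction to a query (assumed separator: one space)."""
    return f"{instruction} {query.strip()}"

def format_document(document: str) -> str:
    """Documents are passed as plain text, with no instruction prefix."""
    return document.strip()

query_text = format_query("how do vaccines work")
doc_text = format_document("  Vaccines train the immune system to recognize pathogens.  ")
```

Only queries carry the instruction; documents are embedded as-is.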

Question 5

How can I call this model via the gigarouter API?

Accepted Answer

Send a request to the gigarouter OpenAI-compatible endpoint, authenticating with your API key. Pass the input text and set the model name to "LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised". The API returns the embedding as a vector.
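
The call might look like the sketch below, which uses only the Python standard library. The base URL is a placeholder, not a real gigarouter address, and the `/embeddings` path, bearer-token header, and `data[0].embedding` response shape are assumed from the OpenAI embeddings convention that an OpenAI-compatible endpoint would normally follow. Check the gigarouter documentation for the actual values.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the base URL from your gigarouter account.
BASE_URL = "https://gigarouter.example/v1"
MODEL = "LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised"

def build_embedding_request(text, api_key, base_url=BASE_URL):
    """Build a POST request in the OpenAI embeddings format (assumed)."""
    body = json.dumps({"model": MODEL, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def parse_embedding(response_json):
    """Extract the first embedding from an OpenAI-style response body."""
    return response_json["data"][0]["embedding"]

def embed(text, api_key):
    """Send the request and return the embedding as a list of floats."""
    request = build_embedding_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        return parse_embedding(json.load(response))
```

Calling `embed("some text", api_key)` with a valid key and the correct base URL would then return the vector. Query inputs should include the instruction prefix described in Question 4.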

Task	Text Embedding
Architecture	Bidirectional decoder-only transformer with LoRA adapters (Mistral-7B base)
Parameters	7B
License	MIT
Pooling	Mean
Max Sequence Length	512 tokens

LLM2Vec Mistral 7B Instruct v2 Supervised
