LLM2Vec Meta Llama 3 8B Instruct Supervised

McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised

published Apr 2024 · updated Apr 2024

LLM2Vec Meta Llama 3 8B Instruct Supervised is a text embedding model that converts a decoder-only LLM into a bidirectional encoder using masked next token prediction and supervised contrastive learning.

status

coming soon

API providers

downloads / mo

112.3K

license

MIT

specs

Task	Text Embedding (dense retrieval)
Architecture	Meta-Llama-3-8B-Instruct with bidirectional attention and LoRA adapters
Parameters	8B
License	MIT
Pooling	Mean (default)
Max Sequence Length	512 tokens

about this model

LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised is a text embedding model built on Meta-Llama-3-8B-Instruct. It uses the LLM2Vec recipe to turn the decoder-only large language model into a bidirectional encoder. The recipe consists of enabling bidirectional attention, masked next token prediction (MNTP), and unsupervised contrastive learning; this checkpoint adds a supervised LoRA adapter, trained with contrastive learning on public E5 data, on top of the MNTP-adapted model.

Key Strengths

The LLM2Vec paper applies the approach to decoder-only LLMs ranging from 1.3B to 8B parameters, turning them into general-purpose text encoders without expensive adaptation or synthetic data. Combined with supervised contrastive learning, as in this model, LLM2Vec achieves state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB) among models trained exclusively on publicly available data (as of May 24, 2024). The paper also reports a new unsupervised state of the art on MTEB for its best unsupervised model, and a large margin over encoder-only models on word-level tasks.

Benchmark Performance

On MTEB, the LLM2Vec paper reports state-of-the-art results in the unsupervised setting and, with supervised contrastive training as used for this model, among models trained only on publicly available data. Word-level benchmarks show substantial gains over traditional encoder-only architectures. All evaluations are documented in the LLM2Vec paper (accepted to COLM 2024).

Technical Details

Pooling: mean pooling, default max sequence length 512 tokens.
Bidirectional attention enabled by default.
License: MIT.
Repository: github.com/McGill-NLP/llm2vec
Paper: arXiv:2404.05961
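
As a rough illustration of what mean pooling and the 512-token limit mean in practice, the sketch below truncates a token sequence and averages per-token vectors into one embedding. It is a generic, dependency-free example and not the LLM2Vec wrapper's own code; the names `truncate` and `mean_pool` are hypothetical helpers introduced here.

```python
from typing import List, Sequence

MAX_LENGTH = 512  # default max sequence length listed above


def truncate(token_ids: Sequence[int], max_length: int = MAX_LENGTH) -> List[int]:
    """Keep at most max_length tokens (hypothetical helper, not a library call)."""
    return list(token_ids[:max_length])


def mean_pool(
    token_vectors: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average the vectors of tokens whose mask value is non-zero.

    Padding positions (mask 0) are excluded so they do not dilute the mean.
    """
    if len(token_vectors) != len(attention_mask):
        raise ValueError("one mask value is needed per token vector")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Three token vectors; the last one is padding and is ignored.
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
```

The result is a single fixed-size vector per input, regardless of how many tokens the input had.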

best for

· Semantic search and retrieval-augmented generation (RAG)
· Document clustering and similarity comparison
· Sentence and passage embedding for classification
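
For the semantic search use case, embeddings are typically compared with cosine similarity and documents are ranked by score. The sketch below shows that ranking step on toy two-dimensional vectors; it assumes the embeddings have already been computed, and `cosine` and `rank_documents` are hypothetical helper names rather than part of any library.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
ranked = rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```

In a RAG pipeline, the top-ranked documents would then be passed to the generator as context.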

FAQ

What is the input format for this model?

Queries are passed as an instruction paired with the query text; documents are passed as plain text with no instruction. Both are encoded as sequences of up to 512 tokens and mean-pooled into a single embedding.
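
A minimal sketch of that input shape, assuming the two-element `[instruction, text]` list convention used in the `llm2vec` repository's examples. The helper `build_query_inputs` and the sample instruction string are illustrative assumptions, not part of the library.

```python
from typing import List, Sequence


def build_query_inputs(instruction: str, queries: Sequence[str]) -> List[List[str]]:
    """Pair every query with the same task instruction (hypothetical helper)."""
    return [[instruction, query] for query in queries]


# Example instruction; the wording is an assumption and should match the task.
instruction = "Given a web search query, retrieve relevant passages that answer the query:"

query_inputs = build_query_inputs(
    instruction,
    ["how do bidirectional encoders differ from decoders"],
)

# Documents carry no instruction and stay as plain strings.
document_inputs = [
    "Decoder-only models use causal attention, so each token sees only earlier tokens.",
    "Bidirectional encoders let every token attend to the full sequence.",
]
```

The query list and the document list would each be encoded separately, and the resulting embeddings compared.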

How do I call this model via the gigarouter API?

The hosted endpoint is not yet live (status: coming soon). Once it is available, use the OpenAI-compatible endpoint with your API key, set the model to "LLM2Vec Meta Llama 3 8B Instruct Supervised", and send a request containing the input text and an optional instruction.
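
Since the endpoint is not live, the following is only a sketch of what an OpenAI-style embeddings request might look like. The base URL is a placeholder, the `instruction` field is an assumption (it is not part of the standard OpenAI embeddings schema), and `build_embedding_request` is a hypothetical helper. The code builds the request without sending it.

```python
import json
import urllib.request
from typing import Optional, Sequence

# Placeholder only: the real base URL is not given in this page.
BASE_URL = "https://api.example.invalid/v1"
MODEL_NAME = "LLM2Vec Meta Llama 3 8B Instruct Supervised"


def build_embedding_request(
    api_key: str,
    texts: Sequence[str],
    instruction: Optional[str] = None,
    model: str = MODEL_NAME,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI embeddings shape."""
    payload = {"model": model, "input": list(texts)}
    if instruction is not None:
        # Assumed field name; check the provider's documentation once it exists.
        payload["instruction"] = instruction
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embedding_request("YOUR_API_KEY", ["hello world"])
```

Sending the request would be a call to `urllib.request.urlopen(request)` against the real base URL once the model is onboarded.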

What is the maximum sequence length?

The maximum sequence length is 512 tokens (default in the LLM2Vec wrapper).

What license is this model released under?

It is released under the MIT License.

How does this model compare to other embedding models?

It achieves state-of-the-art results on the MTEB benchmark among models trained only on publicly available data (as of May 24, 2024). The LLM2Vec paper also reports that the approach outperforms encoder-only models on word-level tasks.

not yet live

We're benchmarking and onboarding LLM2Vec Meta Llama 3 8B Instruct Supervised as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
