LLM2Vec Llama 2 7B Chat Unsupervised SimCSE
McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse
published Apr 2024 · updated Apr 2024
LLM2Vec Llama 2 7B Chat Unsupervised SimCSE is a text embedding model built by converting the decoder-only Llama-2-7b-chat LLM into a text encoder using bidirectional attention, masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE).
specs
| Task | Text Embedding |
| Architecture | LLM2Vec (Bidirectional Llama-2-7b with MNTP + Unsupervised SimCSE) |
| Parameters | 7B |
| License | MIT |
about this model
LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse is an embedding model produced with the LLM2Vec recipe, which converts a decoder-only large language model into a text encoder in three unsupervised steps: enabling bidirectional attention, training with masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE). Built on Llama-2-7b-chat, it produces dense vector representations for tasks such as semantic search, retrieval, and clustering.
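The third step of the recipe, unsupervised SimCSE, can be illustrated with a small sketch of its contrastive objective. In SimCSE the two "views" of a sentence come from encoding it twice with different dropout masks; the loss pulls the two views of the same sentence together and pushes other sentences in the batch away. The function names and the temperature of 0.05 are assumptions made for illustration, not values taken from this model's training configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def simcse_loss(view_a, view_b, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    view_a[i] and view_b[i] are two embeddings of the same sentence;
    every other row of view_b acts as an in-batch negative.
    The temperature is an assumed value for this sketch.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching row (index i) as the positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: rows with the same index are meant to be the same sentence.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
aligned = simcse_loss(view_a, view_b)
shuffled = simcse_loss(view_a, list(reversed(view_b)))
```

With correctly paired views the loss is close to zero, while shuffling the pairs makes it large, which is the signal the contrastive step trains on.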
Key Strengths
According to the LLM2Vec paper, this approach achieves state-of-the-art unsupervised performance on the Massive Text Embeddings Benchmark (MTEB) and outperforms encoder-only models by a large margin on word-level tasks. When further fine-tuned with supervised contrastive learning, LLM2Vec reaches state-of-the-art MTEB results among models trained exclusively on publicly available data (as of May 24, 2024). The transformation is parameter-efficient: it trains LoRA adapters and requires neither expensive adaptation nor synthetic data.
Benchmark Results
The original paper (COLM 2024) reports a new unsupervised state of the art on MTEB. Individual MTEB task scores are available in the paper.
Architecture Overview
The figure illustrates the three-step conversion process applied to a decoder-only LLM to produce a bidirectional text encoder.
Licensing and Availability
Released under the MIT license. The work was published at COLM 2024. The model is hosted on gigarouter as a managed API, eliminating the need for local setup or GPU infrastructure.
best for
- Retrieving relevant passages for web search queries
- Document similarity and semantic search
- Unsupervised sentence embedding for clustering or classification
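For the retrieval and semantic search use cases above, a common approach is to rank documents by cosine similarity between the query embedding and each document embedding. The sketch below uses tiny hand-written vectors in place of real model output, and `rank_documents` is a hypothetical helper name, not part of any library.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scored = [(cosine_similarity(query_vec, doc), idx) for idx, doc in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]


# Stand-in embeddings; real ones would come from the model.
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
ranking = rank_documents(query, documents)
print(ranking)  # [1, 2, 0]
```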
FAQ
The model accepts either a two-part query (an instruction plus the query text) or a plain document text. Embeddings are computed by mean pooling over the token embeddings.
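Mean pooling can be sketched as averaging the per-token vectors while skipping positions that an attention mask marks as excluded (for example, padding). This is a minimal illustration with hand-written numbers. The `mean_pool` name and the mask convention (1 = keep, 0 = skip) are assumptions, and which tokens this model actually includes in the average (for instance, whether instruction tokens count) is not stated here.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for j, value in enumerate(vector):
                totals[j] += value
    if kept == 0:
        raise ValueError("attention_mask selects no tokens")
    return [total / kept for total in totals]


# Three token vectors; the last one is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```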
Maximum input length: 512 tokens.
MIT License.
Call the OpenAI-compatible endpoint with your gigarouter API key and pass the input text as specified in the documentation.
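As a rough sketch, a request to an OpenAI-compatible embeddings endpoint is a POST to `/embeddings` with a bearer token and a JSON body containing `model` and `input`. The base URL below is a placeholder, not a real gigarouter address, and using the Hugging Face model ID as the hosted model name is an assumption; check the provider documentation for the actual values.

```python
import json
import urllib.request

# Placeholder values (assumptions): replace with the provider's documented ones.
BASE_URL = "https://api.example.com/v1"
MODEL_ID = "McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(texts, api_key):
    """Send the request and return one embedding per input text."""
    request = build_embedding_request(texts, api_key)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses list results under "data", each with an "embedding".
    return [item["embedding"] for item in body["data"]]


# Building the request needs no network access, so it is safe to inspect.
example_request = build_embedding_request(["summit define"], api_key="YOUR_API_KEY")
```

Calling `embed([...], api_key)` would perform the actual network call once the placeholder URL and model name are replaced.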
It achieves unsupervised state-of-the-art performance on the MTEB benchmark, and it outperforms encoder-only models on word-level tasks.
We're benchmarking and onboarding LLM2Vec Llama 2 7B Chat Unsupervised SimCSE as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.