
LLM2Vec Llama 2 7B Chat Unsupervised SimCSE

McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse

published Apr 2024 · updated Apr 2024

LLM2Vec Llama 2 7B Chat Unsupervised SimCSE is an embedding model produced by converting the decoder-only Llama-2-7b-chat LLM into a text encoder using bidirectional attention, masked next token prediction, and unsupervised contrastive learning.

status: coming soon
API providers: 0
downloads / mo: 13
license: MIT

specs

Task: Text Embedding
Architecture: LLM2Vec (Bidirectional Llama-2-7b with MNTP + Unsupervised SimCSE)
Parameters: 7B
License: MIT

about this model

LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse is an embedding model that converts a decoder-only large language model into a text encoder using a three-step unsupervised recipe: enabling bidirectional attention, masked next token prediction, and unsupervised contrastive learning (SimCSE). Built on Llama-2-7b-chat, it produces dense vector representations for tasks such as semantic search, retrieval, and clustering.
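To make the third step of the recipe more concrete, here is a minimal sketch of an unsupervised SimCSE-style contrastive loss in plain Python. It assumes the usual SimCSE setup, in which each sentence is encoded twice with different dropout masks and the two resulting vectors form a positive pair while other sentences in the batch act as negatives. The function names, the toy vectors, and the temperature of 0.05 are illustrative assumptions, not values taken from this model's training configuration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def simcse_loss(view_a, view_b, temperature=0.05):
    """Average InfoNCE loss over a batch.

    view_a[i] and view_b[i] are two encodings of the same sentence
    (the positive pair); every other view_b[j] is an in-batch negative.
    The temperature is an assumed value for illustration.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Cross-entropy with the matching index as the target.
        total += log_sum_exp - logits[i]
    return total / len(view_a)


# Toy "embeddings": two sentences, each encoded twice.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = simcse_loss(view_a, view_b)
shuffled = simcse_loss(view_a, list(reversed(view_b)))
print(aligned < shuffled)
```

The loss is low when each vector is closest to its own second view and high when the pairs are mismatched, which is the signal that pulls paraphrase-like encodings together during training.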

Key Strengths

The LLM2Vec paper reports state-of-the-art unsupervised performance on the Massive Text Embedding Benchmark (MTEB) for this approach, along with a large margin over encoder-only models on word-level tasks. When further fine-tuned with supervised contrastive learning, LLM2Vec reaches state-of-the-art MTEB results among models trained exclusively on publicly available data (as of May 24, 2024). The conversion is parameter-efficient: it uses LoRA adapters and requires neither expensive adaptation nor synthetic data.

Benchmark Results

The original paper (COLM 2024) describes the unsupervised MTEB results as a new state of the art at the time of publication. Scores for individual MTEB tasks are available in that paper.

Architecture Overview

The conversion applies three steps to a decoder-only LLM to produce a bidirectional text encoder: enabling bidirectional attention, masked next token prediction (MNTP), and unsupervised contrastive learning (SimCSE).
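The first step of the conversion, enabling bidirectional attention, can be pictured as swapping the attention mask. The sketch below is a simplified illustration in plain Python, with hypothetical function names, and is not the model's actual implementation: a decoder-only LLM normally uses a lower-triangular (causal) mask, while the converted encoder lets every token attend to every other token.

```python
def causal_mask(num_tokens):
    """Decoder-style mask: token i may attend only to tokens 0..i."""
    return [
        [1 if j <= i else 0 for j in range(num_tokens)]
        for i in range(num_tokens)
    ]


def bidirectional_mask(num_tokens):
    """Encoder-style mask: every token may attend to every token."""
    return [[1] * num_tokens for _ in range(num_tokens)]


for row in causal_mask(3):
    print(row)
for row in bidirectional_mask(3):
    print(row)
```

With the causal mask, the first token never sees later context; with the bidirectional mask, each token's representation can draw on the whole sequence, which is what an embedding model needs.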

Licensing and Availability

Released under the MIT license. The work was published at COLM 2024. gigarouter plans to host the model as a managed API, which would remove the need for local setup or GPU infrastructure; the hosted endpoint is not yet live.

FAQ

What input format does the model expect?

A query is passed as a two-part input: an instruction followed by the query text. A document is passed as plain text. The model produces a single vector per input by mean pooling over the token embeddings.
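As a rough illustration of the input shapes and the pooling step, the sketch below shows one way the two-part query and plain document inputs might be laid out, and how mean pooling reduces per-token vectors to one embedding. The example strings, the `mean_pool` helper, and the choice to mask only padding positions are assumptions for illustration; the model's own pooling code may treat some tokens differently.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1.

    token_embeddings: list of equal-length float vectors, one per token.
    attention_mask:   list of 0/1 flags; 0 marks positions to ignore
                      (for example padding).
    """
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for k, value in enumerate(vector):
                sums[k] += value
    if kept == 0:
        raise ValueError("attention_mask excludes every token")
    return [s / kept for s in sums]


# Hypothetical inputs: a query is [instruction, text]; a document is text.
queries = [
    ["Retrieve passages that answer the question:", "what is mean pooling"],
]
documents = ["Mean pooling averages token vectors into one vector."]

# Toy token embeddings: the third position is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```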

What is the maximum sequence length?

512 tokens.

What license is this model released under?

MIT License.

How do I use this model via the gigarouter API?

Once the model is live, call the OpenAI-compatible endpoint with your gigarouter API key and pass the input text as specified in the documentation.
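Because the endpoint is not yet live, the following is only a sketch of what a request might look like, assuming gigarouter follows the OpenAI embeddings convention of a POST to an `/embeddings` path with `model` and `input` fields and a bearer token. The base URL is a placeholder, and the hosted model name may differ from the identifier shown on this page. The code builds the request without sending it.

```python
import json
import urllib.request

# Placeholder base URL: the real gigarouter endpoint is not given here.
BASE_URL = "https://api.gigarouter.example/v1"
# Identifier from this page; the hosted model name may differ.
MODEL_ID = "McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse"


def build_embedding_request(texts, api_key):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embedding_request(["hello world"], api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` once a real base URL and key are available; consult the gigarouter documentation for the actual values.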

How does this model compare to other embedding models?

The LLM2Vec paper reports unsupervised state-of-the-art performance on MTEB for this approach. Separately, it reports that the approach outperforms encoder-only models on word-level tasks.

not yet live

We're benchmarking and onboarding LLM2Vec Llama 2 7B Chat Unsupervised SimCSE as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
