GTE Qwen2 1.5B Instruct
Alibaba-NLP/gte-Qwen2-1.5B-instruct
published Jun 2024 · updated May 2025
GTE Qwen2 1.5B Instruct is a multilingual text embedding model built on Qwen2-1.5B with instruction tuning, producing 1536-dimensional embeddings for retrieval, classification, clustering, and more.
specs
| Spec | Value |
|---|---|
| Task | Text Embedding / Sentence Similarity |
| Architecture | Qwen2-1.5B decoder with bidirectional attention |
| Parameters | 1.5B |
| Embedding Dimension | 1536 |
about this model
gte-Qwen2-1.5B-instruct is an embedding model that converts text into dense vectors for retrieval, clustering, classification, and semantic similarity tasks. It is built on the Qwen2-1.5B language model with two changes: bidirectional attention, which lets each token's representation draw on context from both directions, and instruction tuning applied only on the query side, so documents are embedded without a task prompt. The model is trained on a large multilingual corpus spanning diverse domains, using both weakly supervised and supervised data. It accepts up to 32,000 input tokens and produces 1,536-dimensional embeddings.
Overview
This model belongs to the gte (General Text Embedding) family and uses multi-stage contrastive learning to map a wide range of NLP tasks into a single embedding space. It is designed for production use cases that require high-quality embeddings across languages and content types, including code retrieval.
Benchmark Performance
The model scores 67.16 on MTEB (English, 56 tasks) and 67.65 on C-MTEB (Chinese, 35 tasks), and is also evaluated on MTEB-fr (French, 26 tasks) and MTEB-pl (Polish, 26 tasks). The following table compares its performance with other embedding models:
| Model Name | MTEB(56) | C-MTEB(35) | MTEB-fr(26) | MTEB-pl(26) |
|---|---|---|---|---|
| bge-base-en-1.5 | 63.55 | - | - | - |
| bge-large-en-1.5 | 64.23 | - | - | - |
| gte-large-en-v1.5 | 65.39 | - | - | - |
| gte-base-en-v1.5 | 64.11 | - | - | - |
| mxbai-embed-large-v1 | 64.68 | - | - | - |
| acge_text_embedding | - | 69.07 | - | - |
| stella-mrl-large-zh-v3.5-1792d | - | 68.55 | - | - |
| gte-large-zh | - | 66.72 | - | - |
| multilingual-e5-base | 59.45 | 56.21 | - | - |
| multilingual-e5-large | 61.50 | 58.81 | - | - |
| e5-mistral-7b-instruct | 66.63 | 60.81 | - | - |
| gte-Qwen1.5-7B-instruct | 67.34 | 69.52 | - | - |
| NV-Embed-v1 | 69.32 | - | - | - |
| gte-Qwen2-7B-instruct | 70.24 | 72.05 | 68.25 | 67.86 |
| gte-Qwen2-1.5B-instruct | 67.16 | 67.65 | 66.60 | 64.04 |
At 1.5B parameters and 6.62 GB of memory in fp32, the model scores within roughly 1.7 to 4.4 points of gte-Qwen2-7B-instruct on all four benchmarks, which makes it a reasonable choice for resource-constrained deployments.
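As a rough illustration of how the fp32 figure scales with numeric precision, the sketch below rescales the quoted 6.62 GB by bytes per parameter. It is back-of-the-envelope arithmetic only: it covers weights, not activations or batch buffers, and it does not imply that lower-precision variants of this model are published or preserve the benchmark scores. The helper name `scale_weight_memory_gb` is hypothetical.

```python
# Bytes used to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}

# fp32 memory figure quoted in the text above.
FP32_GB = 6.62


def scale_weight_memory_gb(fp32_gb: float, dtype: str) -> float:
    """Rescale a known fp32 weight footprint to another precision.

    Assumes memory is dominated by the weights, so the footprint is
    proportional to bytes per parameter.
    """
    return fp32_gb * BYTES_PER_PARAM[dtype] / BYTES_PER_PARAM["fp32"]


if __name__ == "__main__":
    for dtype in ("fp32", "fp16", "int8"):
        print(f"{dtype}: ~{scale_weight_memory_gb(FP32_GB, dtype):.2f} GB")
```

Under these assumptions, half precision would need about half the fp32 footprint. Actual usage depends on the runtime, sequence length, and batch size.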
best for
- Multilingual semantic search and retrieval-augmented generation (RAG)
- Text classification and clustering across diverse domains
- Code retrieval by treating code as natural language
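For the search and retrieval use cases above, embeddings are typically compared with a similarity measure and documents are ranked by score. The sketch below assumes cosine similarity, a common choice that this page does not explicitly prescribe. It uses toy 3-dimensional vectors in place of real 1536-dimensional embeddings, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for model embeddings (assumption: real vectors
# would come from the embedding model and have 1536 dimensions).
query_vec = [0.9, 0.1, 0.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # mostly unrelated to the query
    [1.0, 0.0, 0.0],  # closest in direction to the query
    [0.5, 0.5, 0.0],  # partially related
]

ranking = rank_documents(query_vec, doc_vecs)
top_index = ranking[0][0]
```

In a RAG pipeline, the top-ranked documents would then be passed to a generator as context. At larger scale, a vector index usually replaces the linear scan shown here.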
FAQ
The model produces 1536-dimensional embeddings.
The model supports up to 32,000 input tokens.
Instruction tuning is applied only on the query side using a prompt like "Instruct: {task_description}\nQuery: {query}". Pre-built prompt names are available in the config.
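The query-side prompt described above can be sketched as a small helper. This is a minimal illustration of the string template only. The function names and the example task description are assumptions, and documents are deliberately left unprefixed because the instruction applies only to queries.

```python
def format_query(task_description: str, query: str) -> str:
    """Wrap a query in the instruction template quoted above."""
    return f"Instruct: {task_description}\nQuery: {query}"


def prepare_inputs(task_description, queries, documents):
    """Prefix queries with the instruction; pass documents through unchanged."""
    formatted_queries = [format_query(task_description, q) for q in queries]
    return formatted_queries, list(documents)


# Illustrative task description (an assumption, not a required value).
task = "Given a web search query, retrieve relevant passages that answer the query"

queries, documents = prepare_inputs(
    task,
    ["how much protein should a female eat"],
    ["Protein needs vary with age, body weight, and activity level."],
)
```

Both lists would then be sent to the embedding model. Only the query strings carry the `Instruct:` prefix.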
The model is multilingual: it was trained on a large corpus covering diverse languages and domains.
Use the OpenAI-compatible endpoint with an API key. Send a POST request with input texts and the model name "gte-qwen2-1.5b-instruct".
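A request of this shape might be built as follows. This is a sketch under stated assumptions, not a description of the service's actual API. The base URL `https://api.example.com/v1` is a placeholder, and the `/embeddings` path, bearer-token header, and response layout follow the usual OpenAI embeddings convention rather than anything stated on this page. The code builds the request but does not send it.

```python
import json
import urllib.request

API_BASE = "https://api.example.com/v1"  # placeholder, not a real endpoint
MODEL_NAME = "gte-qwen2-1.5b-instruct"   # model name given in the text above


def build_embedding_request(texts, api_key, api_base=API_BASE):
    """Build (but do not send) a POST request for an OpenAI-style embeddings API."""
    payload = {"model": MODEL_NAME, "input": list(texts)}
    return urllib.request.Request(
        url=f"{api_base}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body: bytes):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    body = json.loads(response_body)
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embedding_request(["hello world"], api_key="YOUR_API_KEY")
# To send it: urllib.request.urlopen(request).read(), then parse_embeddings(...)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any network call is made.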
GTE Qwen2 1.5B Instruct is currently being benchmarked and onboarded as a hosted, OpenAI-compatible API.