GTE Qwen2 1.5B Instruct

Alibaba-NLP/gte-Qwen2-1.5B-instruct

published Jun 2024 · updated May 2025

GTE Qwen2 1.5B Instruct is a multilingual, instruction-tuned text embedding model built on Qwen2-1.5B. It produces 1536-dimensional embeddings for retrieval, classification, clustering, and semantic similarity.

est. price

~$0.008

/ 1M tokens · estimated, set at launch

API providers

downloads / mo

772.7K

license

apache-2.0

specs

Task	Text Embedding / Sentence Similarity
Architecture	Qwen2-1.5B decoder with bidirectional attention
Parameters	1.5B
Embedding Dimension	1536

about this model

gte-Qwen2-1.5B-instruct is an embedding model that converts text into dense vector representations for retrieval, clustering, classification, and semantic similarity tasks. Built on the Qwen2-1.5B language model, it adds bidirectional attention and instruction tuning. Instructions are applied only on the query side, which keeps document embedding efficient. The model is trained on a large multilingual corpus spanning diverse domains, using both weakly supervised and supervised data. It supports up to 32,000 input tokens and produces 1536-dimensional embeddings.

Overview

This model belongs to the gte (General Text Embedding) family and uses multi-stage contrastive learning to map a wide range of NLP tasks into a single embedding space. It is designed for production use cases that require high-quality embeddings across languages and domains, including code retrieval.

Benchmark Performance

The model achieves competitive results on the MTEB (English, 56 tasks) and C-MTEB (Chinese, 35 tasks) benchmarks, as well as on MTEB-fr (French, 26 tasks) and MTEB-pl (Polish, 26 tasks). The following table compares its scores with those of other embedding models; a dash means no score is listed for that benchmark:

Model Name	MTEB(56)	C-MTEB(35)	MTEB-fr(26)	MTEB-pl(26)
bge-base-en-1.5	64.23	-	-	-
bge-large-en-1.5	63.55	-	-	-
gte-large-en-v1.5	65.39	-	-	-
gte-base-en-v1.5	64.11	-	-	-
mxbai-embed-large-v1	64.68	-	-	-
acge_text_embedding	-	69.07	-	-
stella-mrl-large-zh-v3.5-1792d	-	68.55	-	-
gte-large-zh	-	66.72	-	-
multilingual-e5-base	59.45	56.21	-	-
multilingual-e5-large	61.50	58.81	-	-
e5-mistral-7b-instruct	66.63	60.81	-	-
gte-Qwen1.5-7B-instruct	67.34	69.52	-	-
NV-Embed-v1	69.32	-	-	-
gte-Qwen2-7B-instruct	70.24	72.05	68.25	67.86
gte-Qwen2-1.5B-instruct	67.16	67.65	66.60	64.04

With a 1.5B parameter footprint and 6.62 GB memory usage (fp32), this model delivers strong multilingual embedding quality suitable for resource-constrained deployments.
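As a rough illustration of what the fp32 figure implies for other precisions, the sketch below scales it by bytes per value. It assumes the quoted 6.62 GB is dominated by model weights and that memory scales linearly with precision. Both are simplifications, and real usage also depends on batch size, sequence length, and runtime overhead. The function name and dtype table are hypothetical helpers, not part of any library.

```python
# Rough weight-memory estimate at lower precisions.
# Assumption: the fp32 figure from the text scales linearly with bytes per value.
FP32_GB = 6.62  # fp32 memory usage quoted in the text

BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "int8": 1}


def estimated_weight_memory_gb(dtype: str, fp32_gb: float = FP32_GB) -> float:
    """Scale the fp32 figure by the ratio of bytes per value."""
    return fp32_gb * BYTES_PER_VALUE[dtype] / BYTES_PER_VALUE["fp32"]


for dtype in BYTES_PER_VALUE:
    print(f"{dtype}: ~{estimated_weight_memory_gb(dtype):.2f} GB")
```

Under these assumptions, half precision would need roughly half the fp32 figure, which is the usual first step when fitting the model onto a smaller GPU.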

best for

· Multilingual semantic search and retrieval-augmented generation (RAG)
· Text classification and clustering across diverse domains
· Code retrieval by treating code as natural language
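The search and retrieval use cases above usually come down to ranking documents by the similarity of their embeddings to a query embedding. The following is a minimal sketch of that ranking step using cosine similarity. The 3-dimensional vectors are toy stand-ins for real 1536-dimensional embeddings, and `rank_documents` is a hypothetical helper rather than part of the model's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings returned by the model.
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [0.9, 0.1, 0.8],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.5, 0.5, 0.5],
}

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a real pipeline the vectors would come from the model, and a vector index would typically replace the linear scan once the collection grows.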

FAQ

What is the embedding dimension of GTE Qwen2 1.5B Instruct?

It produces 1536-dimensional embeddings.

What is the maximum input token length?

It supports up to 32,000 input tokens.

How do I use instruction tuning with this model?

Instructions are applied only on the query side, using a prompt of the form "Instruct: {task_description}\nQuery: {query}". Documents are embedded as-is, without an instruction. Pre-built prompt names are available in the config.
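A minimal sketch of building that query-side prompt is shown below. The helper name `format_query`, the task description, and the sample texts are illustrative assumptions; only the "Instruct: ... Query: ..." template comes from the answer above.

```python
def format_query(task_description: str, query: str) -> str:
    """Wrap a query in the instruction template; documents are left unchanged."""
    return f"Instruct: {task_description}\nQuery: {query}"


# Illustrative task description and texts (assumptions, not fixed values).
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = [format_query(task, "what is the capital of France")]
documents = ["Paris is the capital and largest city of France."]

print(queries[0])
```

The formatted queries and the raw documents would then be passed to the model separately, so that only queries carry the task instruction.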

What languages does the model support?

It is multilingual: it was trained on a large corpus covering diverse languages and domains. Benchmark scores are listed above for English, Chinese, French, and Polish.

How can I call this model via the gigarouter API?

Once the model is live, use the OpenAI-compatible endpoint with an API key: send a POST request containing the input texts and the model name "gte-qwen2-1.5b-instruct".
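The sketch below shows what such a request might look like, assuming the endpoint follows the usual OpenAI embeddings convention (a JSON body with `model` and `input`, posted to an `/embeddings` path with a bearer token). The base URL is a hypothetical placeholder, since the real endpoint is not given here, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical placeholder; replace with the provider's actual base URL.
BASE_URL = "https://api.example.com/v1"
MODEL_NAME = "gte-qwen2-1.5b-instruct"  # model name given in the answer above


def build_embeddings_request(texts, api_key, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style embeddings POST request."""
    payload = {"model": MODEL_NAME, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(["hello world"], api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
# Sending would look like: urllib.request.urlopen(request)
# (assumes a live endpoint and a valid key).
```

In OpenAI-style responses the vectors are usually found under `data[i].embedding`, but that should be checked against the provider's documentation once the model is available.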

not yet live

We're benchmarking and onboarding GTE Qwen2 1.5B Instruct as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
