GTE Qwen2 1.5B Instruct

Alibaba-NLP/gte-Qwen2-1.5B-instruct

published Jun 2024 · updated May 2025

GTE Qwen2 1.5B Instruct is a multilingual text embedding model built on Qwen2-1.5B with instruction tuning, producing 1536-dimensional embeddings for retrieval, classification, clustering, and more.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 772.7K
license: apache-2.0

specs

Task: Text Embedding / Sentence Similarity
Architecture: Qwen2-1.5B decoder with bidirectional attention
Parameters: 1.5B
Embedding Dimension: 1536

about this model

gte-Qwen2-1.5B-instruct is an embedding model that converts text into dense vector representations for retrieval, clustering, classification, and semantic similarity tasks. Built on the Qwen2-1.5B language model, it uses bidirectional attention, so every token can attend to the full input, and applies instruction tuning on the query side only, so documents are embedded without a task prompt. The model is trained on a large multilingual corpus spanning diverse domains, using both weakly supervised and supervised data. It supports up to 32,000 input tokens and produces 1,536-dimensional embeddings.
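To make the retrieval and semantic similarity use cases concrete, here is a minimal sketch of how embedding vectors are typically compared. It assumes cosine similarity as the scoring function, which is a common choice but is not stated on this page, and it uses toy 3-dimensional vectors in place of real 1,536-dimensional model output. The function names are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy stand-ins (assumption): real vectors from this model would have 1,536 entries.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs)
```

With real embeddings the same two functions apply unchanged; only the vector length differs.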

Overview

This model belongs to the gte (General Text Embedding) family and employs multi-stage contrastive learning to unify a wide range of NLP tasks into a single embedding space. It targets production use cases that need one embedding model across languages and text types, including code retrieval.

Benchmark Performance

The model is evaluated on MTEB (English, 56 tasks), C-MTEB (Chinese, 35 tasks), MTEB-fr (French, 26 tasks), and MTEB-pl (Polish, 26 tasks). The following table compares its scores with other embedding models; a dash marks a benchmark with no score listed:

Model Name                        MTEB(56)  C-MTEB(35)  MTEB-fr(26)  MTEB-pl(26)
bge-base-en-1.5                   64.23     -           -            -
bge-large-en-1.5                  63.55     -           -            -
gte-large-en-v1.5                 65.39     -           -            -
gte-base-en-v1.5                  64.11     -           -            -
mxbai-embed-large-v1              64.68     -           -            -
acge_text_embedding               -         69.07       -            -
stella-mrl-large-zh-v3.5-1792d    -         68.55       -            -
gte-large-zh                      -         66.72       -            -
multilingual-e5-base              59.45     56.21       -            -
multilingual-e5-large             61.50     58.81       -            -
e5-mistral-7b-instruct            66.63     60.81       -            -
gte-Qwen1.5-7B-instruct           67.34     69.52       -            -
NV-Embed-v1                       69.32     -           -            -
gte-Qwen2-7B-instruct             70.24     72.05       68.25        67.86
gte-Qwen2-1.5B-instruct           67.16     67.65       66.60        64.04

With a 1.5B parameter footprint and 6.62 GB memory usage (fp32), this model delivers strong multilingual embedding quality suitable for resource-constrained deployments.
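As a rough sanity check on the memory figure, the sketch below estimates weight storage from a parameter count and a bytes-per-parameter value. It assumes that memory is dominated by the weights and that "GB" means 10^9 bytes; neither is stated on this page, and the helper names are hypothetical.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}

def weight_memory_gb(num_params, dtype="fp32"):
    """Weights-only memory estimate in decimal gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

def implied_params(memory_gb, dtype="fp32"):
    """Parameter count that a given weights-only memory figure would imply."""
    return memory_gb * 1e9 / BYTES_PER_PARAM[dtype]

nominal_fp32 = weight_memory_gb(1.5e9, "fp32")   # nominal 1.5B parameters
nominal_fp16 = weight_memory_gb(1.5e9, "fp16")   # half precision halves the estimate
implied = implied_params(6.62, "fp32")           # count implied by the listed 6.62 GB
```

Under these assumptions the nominal 1.5B count gives about 6.0 GB in fp32, somewhat below the listed 6.62 GB. The gap would be consistent with "1.5B" being a rounded label or with the figure including more than weights, but the page does not say how the number was measured.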

FAQ

What is the embedding dimension of GTE Qwen2 1.5B Instruct?

It produces 1536-dimensional embeddings.

What is the maximum input token length?

It supports up to 32,000 input tokens.

How do I use instruction tuning with this model?

Instruction tuning is applied on the query side only: prefix each query with a prompt of the form "Instruct: {task_description}\nQuery: {query}" and embed documents as-is. Pre-built prompt names are available in the config.
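The query-side prompt format can be sketched as a small helper. The template string comes from the answer above; the function names and the example task description are illustrative assumptions, not part of any official API.

```python
def build_query_text(task_description: str, query: str) -> str:
    """Wrap a query in the instruction template described above."""
    return f"Instruct: {task_description}\nQuery: {query}"

def prepare_inputs(task_description, queries, documents):
    """Apply the instruction to queries only; documents pass through unchanged."""
    query_texts = [build_query_text(task_description, q) for q in queries]
    return query_texts, list(documents)

# Example task description (assumption): any one-sentence task statement works here.
task = "Given a web search query, retrieve relevant passages that answer the query"
query_texts, doc_texts = prepare_inputs(
    task,
    ["how much protein should a female eat"],
    ["Protein requirements vary with age, weight, and activity level."],
)
```

Keeping documents free of the instruction means a document index can be built once and reused across tasks, with only the query prefix changing.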

What languages does the model support?

It is multilingual: the training corpus covers many languages and domains, and benchmark results are reported for English, Chinese, French, and Polish.

How can I call this model via the gigarouter API?

The model is not yet live. Once it is, use the OpenAI-compatible endpoint with an API key: send a POST request containing the input texts and the model name "gte-qwen2-1.5b-instruct".
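A minimal sketch of what such a request could look like, using only the Python standard library. It assumes the endpoint follows the OpenAI embeddings convention (POST to an `/embeddings` path, bearer-token auth, a JSON body with `model` and `input`, and a response with a `data` list of indexed embeddings). The base URL below is a hypothetical placeholder, since the real one is not given here, and the code builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # hypothetical placeholder, not the real endpoint
MODEL_NAME = "gte-qwen2-1.5b-instruct"   # model name as given in the FAQ answer

def build_embeddings_request(texts, api_key, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style embeddings POST request."""
    payload = {"model": MODEL_NAME, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def parse_embeddings_response(body):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    parsed = json.loads(body)
    items = sorted(parsed["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

request = build_embeddings_request(["hello world"], api_key="YOUR_API_KEY")
# When the endpoint exists, the request could be sent with:
#   with urllib.request.urlopen(request) as resp:
#       vectors = parse_embeddings_response(resp.read())
```

Separating request construction from sending makes the payload easy to inspect before the endpoint is available.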

not yet live

We're benchmarking and onboarding GTE Qwen2 1.5B Instruct as a hosted, OpenAI-compatible API.
