INF Retriever V1 1.5B

infly/inf-retriever-v1-1.5b

published Feb 2025 · updated Jan 2026

INF Retriever V1 1.5B is a lightweight dense retrieval model optimized for Chinese and English, built on GTE-Qwen2-1.5B-instruct and fine-tuned for high-performance retrieval tasks.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 4.3K
license: apache-2.0

specs

Task: Embedding / Dense Retrieval
Architecture: Transformer (based on GTE-Qwen2-1.5B-instruct)
Parameters: 1.5B
License: Apache-2.0
Embedding Dimension: 1536
Max Input Tokens: 32768

FAQ

What is the embedding dimension of INF Retriever V1 1.5B?

The embedding dimension is 1536.
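
The dimension mostly matters for index sizing. As a rough sketch, assuming vectors are stored as uncompressed 32-bit floats with no index overhead (both are assumptions, and `index_size_bytes` is a hypothetical helper, not part of any API), the raw footprint can be estimated like this:

```python
EMBEDDING_DIM = 1536  # from the specs above


def index_size_bytes(num_vectors, dim=EMBEDDING_DIM, bytes_per_value=4):
    """Raw storage for num_vectors embeddings.

    Assumes float32 values (4 bytes each) and ignores any overhead
    added by a vector database or ANN index.
    """
    return num_vectors * dim * bytes_per_value


print(index_size_bytes(1))                # 6144 bytes per vector
print(index_size_bytes(1_000_000) / 1e9)  # 6.144 (GB for one million vectors)
```

Storing values at lower precision would shrink these numbers proportionally, at some possible cost in retrieval quality.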

What is the maximum input token length?

The model supports a maximum of 32768 tokens.

What license is this model released under?

It is released under Apache-2.0.

How does this 1.5B model compare to the larger INF-Retriever-v1 (7B)?

It is a lighter variant of the 7B model. It ranks No. 1 on the AIR-Bench bilingual sub-leaderboard among models with fewer than 7B parameters, which makes it a strong choice when efficiency matters for Chinese and English retrieval.

How can I use this model via the gigarouter API?

Once the model is live, send requests to the OpenAI-compatible embeddings endpoint with your API key, and use the model name shown for the deployment.
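
Since the model is not live yet, the following is only a sketch of what such a request might look like, following the standard OpenAI embeddings request and response shape. The base URL is a placeholder, and the model name is assumed to match the repository id shown on this page; the real values will come from the deployment. The helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder base URL (assumption): replace with the real endpoint at launch.
BASE_URL = "https://api.gigarouter.example/v1"
# Assumed to match the repository id; the deployed model name may differ.
MODEL = "infly/inf-retriever-v1-1.5b"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    data = json.loads(response_body)["data"]
    return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]
```

To send the request, pass the result of `build_embedding_request` to `urllib.request.urlopen` and hand the response body to `parse_embeddings`. Each returned vector should have 1536 values, per the specs above.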

not yet live

We're benchmarking and onboarding INF Retriever V1 1.5B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
