INF Retriever V1 1.5B
infly/inf-retriever-v1-1.5b
published Feb 2025 · updated Jan 2026
INF Retriever V1 1.5B is a lightweight dense retrieval model optimized for Chinese and English, built on GTE-Qwen2-1.5B-instruct and fine-tuned for high-performance retrieval tasks.
specs
| Spec | Value |
| --- | --- |
| Task | Embedding / Dense Retrieval |
| Architecture | Transformer (based on GTE-Qwen2-1.5B-instruct) |
| Parameters | 1.5B |
| License | Apache-2.0 |
| Embedding Dimension | 1536 |
| Max Input Tokens | 32768 |
best for
- Chinese and English document retrieval and search
- Heterogeneous information retrieval across domains (web, healthcare, law, finance, news)
- Long-context retrieval tasks requiring up to 32k tokens
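To make the retrieval use cases above concrete, here is a minimal sketch of how documents can be ranked against a query once both have been embedded. It assumes cosine similarity as the scoring function, which is common for dense retrievers but is not stated on this page. The toy 3-dimensional vectors and the document names are hypothetical stand-ins for real 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors; real embeddings would have 1536 dimensions.
query = [1.0, 0.0, 0.0]
docs = {
    "contract_law": [0.9, 0.1, 0.0],
    "stock_report": [0.0, 1.0, 0.0],
    "clinical_note": [0.5, 0.5, 0.0],
}

ranking = rank_documents(query, docs)
print([doc_id for doc_id, _ in ranking])
# → ['contract_law', 'clinical_note', 'stock_report']
```

For collections of any real size, the same scoring would normally be done with a vector library or an approximate nearest-neighbor index rather than pure Python loops.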
FAQ
The embedding dimension is 1536.
The model supports a maximum of 32768 tokens.
The model is released under the Apache-2.0 license.
INF Retriever V1 1.5B is a lighter version of INF Retriever V1 that still ranks No.1 on the AIR-Bench bilingual sub-leaderboard among models with fewer than 7B parameters, which makes it a strong option for efficient bilingual retrieval.
Send requests to the OpenAI-compatible endpoint, authenticating with your API key, and set the model field to the model name shown for the deployment.
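As a rough illustration of what such a request could look like, the sketch below builds (but does not send) a POST request in the shape used by OpenAI-style `/embeddings` routes. The base URL `https://api.example.com/v1` is a placeholder, the model name is assumed to match the identifier on this page, and `build_embedding_request` and `parse_embeddings` are hypothetical helper names; check the actual deployment for the real values.

```python
import json
import urllib.request

# Hypothetical values: this page gives no base URL, and the model name
# must match whatever the deployment actually exposes.
BASE_URL = "https://api.example.com/v1"
MODEL_NAME = "infly/inf-retriever-v1-1.5b"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_NAME):
    """Build (but do not send) a POST request for an OpenAI-style /embeddings route."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response body,
    i.e. {"data": [{"index": i, "embedding": [...]}, ...]}, in input order."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embedding_request(
    ["什么是密集检索?", "What is dense retrieval?"],
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` and decoding the JSON body would complete the round trip; `parse_embeddings` then returns one vector per input text, which should have 1536 entries if the deployment serves this model unchanged.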
We're benchmarking and onboarding INF Retriever V1 1.5B as a hosted, OpenAI-compatible API.