gigarouter

Jina Embeddings V5 Text Nano

jinaai/jina-embeddings-v5-text-nano

published Jan 2026 · updated Apr 2026

Jina Embeddings V5 Text Nano is an embedding model that generates multilingual text embeddings for retrieval, text-matching, clustering, and classification tasks.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 1.1M
license: cc-by-nc-4.0

specs

Task: Text Embedding (retrieval, text-matching, clustering, classification)
Architecture: EuroBERT-210M (base)
Parameters: 239M
Max Sequence Length: 8192
Embedding Dimension: 768

about this model

jina-embeddings-v5-text-nano is a text embedding model that generates multilingual dense vector representations for retrieval, text-matching, clustering, and classification tasks. It is being onboarded as a managed API on gigarouter and is not yet live.

Built on EuroBERT-210M with 239M parameters, it achieves a 71.0 average on MTEB English v2 and 65.5 on MMTEB, matching or exceeding all other sub-500M embedding models including KaLM-mini-v2.5 (494M) and Gemma-300M (308M). The model is trained by combining embedding distillation from Qwen3-Embedding-4B with task-specific contrastive losses, producing compact yet high-performance embeddings.

It supports multilingual text up to 8192 tokens and produces embeddings that remain robust under truncation and binary quantization. The full embedding dimension is 768, and Matryoshka truncation is supported at 32, 64, 128, 256, 512, and 768 dimensions. The model uses last-token pooling.
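A minimal sketch of what truncation and binary quantization of an embedding can look like on the client side. The helper names are hypothetical, the input vector is synthetic rather than real model output, and both the re-normalization after truncation and the sign-threshold binarization are common practice assumed here, not something this page specifies.

```python
import math

# Matryoshka dimensions listed on this page.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768)


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` values and re-normalize (assumed convention)."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    if dim > len(vec):
        raise ValueError("dim is larger than the embedding itself")
    return l2_normalize(vec[:dim])


def binarize(vec):
    """One bit per dimension: 1 where the value is positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def hamming_distance(a, b):
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic stand-in for a 768-dimensional embedding (not model output).
full = [math.sin(i + 1) for i in range(768)]
small = truncate_embedding(full, 128)   # 128-dimensional, unit length
bits = binarize(full)                   # 768 bits
```

Truncated vectors are typically compared with cosine similarity and binarized vectors with Hamming distance; how much retrieval quality is retained at each setting is a question for the technical report.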

Performance highlights

  • MTEB English v2: 71.0
  • MMTEB: 65.5
  • Parameters: 239M – smallest model to match top sub-500M results

For full training details and benchmark methodology, see the technical report.

FAQ

What is the maximum input length?

The maximum sequence length is 8192 tokens.

What embedding dimensions are available?

The model outputs 768-dimensional embeddings, and supports Matryoshka dimensions of 32, 64, 128, 256, 512, and 768.
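To make the trade-off between dimensions concrete, here is a rough storage calculation per vector. It assumes 32-bit floats for dense vectors and 1 bit per dimension for binary-quantized ones; the page does not state the output dtype, so treat these figures as an illustration only. `bytes_per_vector` is a hypothetical helper.

```python
def bytes_per_vector(dim, bits_per_value=32):
    """Storage for one embedding, rounded up to whole bytes.

    bits_per_value=32 assumes float32 values (an assumption);
    bits_per_value=1 models binary quantization.
    """
    return (dim * bits_per_value + 7) // 8


for dim in (32, 64, 128, 256, 512, 768):
    print(f"{dim:>4} dims: {bytes_per_vector(dim):>5} B float32, "
          f"{bytes_per_vector(dim, 1):>3} B binary")

# One million full-size float32 vectors, in bytes.
corpus_bytes = 1_000_000 * bytes_per_vector(768)
```

Under these assumptions a full 768-dimensional vector takes 3072 bytes, the 32-dimensional truncation takes 128 bytes, and a binary-quantized 768-dimensional vector takes 96 bytes.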

How does this model compare to larger embedding models?

With only 239M parameters, it matches or exceeds all other sub-500M embedding models on MTEB/MMTEB, including larger ones such as KaLM-mini-v2.5 (494M) and Gemma-300M (308M).

What is the input format for the gigarouter API?

Send text, optionally with a task parameter, to the OpenAI-compatible endpoint using your API key. Supported tasks are retrieval, text-matching, clustering, and classification.
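As a sketch of what such a request might look like once the model is live: the body follows the standard OpenAI embeddings shape (`model`, `input`, bearer-token auth). The base URL below is a placeholder, and the `task` field name is an assumption drawn from the wording above, not a documented parameter. `build_embedding_request` and `embed` are hypothetical helpers.

```python
import json
import urllib.request

# Placeholder base URL: the real endpoint is not given on this page.
BASE_URL = "https://api.gigarouter.example/v1"
MODEL_ID = "jinaai/jina-embeddings-v5-text-nano"
TASKS = ("retrieval", "text-matching", "clustering", "classification")


def build_embedding_request(texts, api_key, task=None):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": MODEL_ID, "input": list(texts)}
    if task is not None:
        if task not in TASKS:
            raise ValueError(f"task must be one of {TASKS}, got {task!r}")
        payload["task"] = task  # assumed field name, not confirmed
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(texts, api_key, task=None):
    """Send the request and return one embedding per input text."""
    req = build_embedding_request(texts, api_key, task)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses list results under "data".
    return [item["embedding"] for item in body["data"]]


req = build_embedding_request(["hello world"], "YOUR_API_KEY", task="retrieval")
```

Only `build_embedding_request` runs here; `embed` would perform the network call and needs a real endpoint and key.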

Is the model available under an open license?

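The model is listed under cc-by-nc-4.0 (Creative Commons Attribution-NonCommercial 4.0), which permits sharing and adaptation with attribution for non-commercial purposes only.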

not yet live

We're benchmarking and onboarding Jina Embeddings V5 Text Nano as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
