
Jina Embedding S v1

jinaai/jina-embedding-s-en-v1

published Jul 2023 · updated Jan 2025

Jina Embedding S v1 is a text embedding model that converts sentences into 512-dimensional vectors for semantic search, information retrieval, and similarity tasks.

status: coming soon
API providers: 0
downloads / mo: 691
license: apache-2.0

specs

Task: Embedding (Sentence Similarity)
Parameters: 35 million
Dimension: 512
License: Apache-2.0

about this model

jina-embedding-s-en-v1 is a text embedding model that maps text to 512-dimensional vectors, trained by Jina AI on the Linnaeus-Clean dataset of 380 million sentence pairs drawn from diverse domains. With 35 million parameters, it balances inference speed and semantic quality for tasks such as information retrieval, semantic textual similarity, and reranking.
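Similarity and retrieval tasks on embeddings like these usually come down to comparing vectors with cosine similarity. The following is a minimal sketch of that comparison, not the model's own code: the function names are hypothetical, and the 3-dimensional toy vectors stand in for real 512-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): stand-ins for 512-dimensional model outputs.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees away from the query
]
ranking = rank_by_similarity(query, docs)
```

Because cosine similarity ignores vector length, the second document ranks first even though its magnitude differs from the query's.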

[Figure: Illustration of the Jina AI text embedding set]

Benchmark performance

The model is evaluated on standard STS and retrieval benchmarks. The tables below compare it with all-minilm-l6-v2, all-mpnet-base-v2, and OpenAI's text-embedding-ada-002 (listed here as ada-embedding-002 and ada-002):

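| Model | Parameters | Dimension |
| --- | --- | --- |
| all-minilm-l6-v2 | 23M | 384 |
| all-mpnet-base-v2 | 110M | 768 |
| ada-embedding-002 | Unknown (API) | 1536 |
| jina-embedding-s-en-v1 | 35M | 512 |

| Benchmark | all-minilm-l6-v2 | all-mpnet-base-v2 | ada-002 | jina-s-en-v1 |
| --- | --- | --- | --- | --- |
| STS12 | 0.724 | 0.726 | 0.698 | 0.743 |
| STS13 | 0.806 | 0.835 | 0.833 | 0.786 |
| STS14 | 0.756 | 0.78 | 0.761 | 0.738 |
| STS15 | 0.854 | 0.857 | 0.861 | 0.837 |
| STS16 | 0.79 | 0.80 | 0.86 | 0.80 |
| STS17 | 0.876 | 0.906 | 0.903 | 0.875 |
| TRECOVID | 0.473 | 0.513 | 0.685 | 0.523 |
| Quora | 0.876 | 0.875 | 0.876 | 0.857 |
| SciFact | 0.645 | 0.656 | 0.726 | 0.524 |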

Additional evaluation

On the full MTEB benchmark, the model scores, for example, 43.57 NDCG@10 on ArguAna retrieval, 82.96 cosine Spearman on BIOSSES STS, and 74.64 accuracy on Banking77 classification. Training also incorporates the jinaai/negation-dataset to improve handling of negated statements. The model is licensed under Apache-2.0.
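For readers unfamiliar with the retrieval metric, NDCG@10 rewards rankings that place relevant documents near the top of the first ten results. The sketch below is one common formulation with linear gain, not necessarily the exact variant the benchmark uses; the function names are hypothetical, and for simplicity the ideal ranking is computed only from the relevance labels passed in. A reported figure such as 43.57 presumably corresponds to this 0-to-1 value scaled to 0-to-100.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the best possible ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents at all
    return dcg_at_k(relevances, k) / ideal_dcg


# Relevance labels in ranked order: relevant, irrelevant, relevant.
score = ndcg_at_k([1, 0, 1])
```

A ranking that already lists every relevant document first scores exactly 1.0; pushing a relevant document further down lowers the score.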

FAQ

What is the primary use case for Jina Embedding S v1?

It is designed for semantic search, information retrieval, and semantic textual similarity, converting text into dense 512-dimensional embeddings.

How does Jina Embedding S v1 compare to other Jina embedding models?

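It is the 35M-parameter variant in the Jina Embeddings v1 family, offering a balance between speed and accuracy. The smaller variant is jina-embedding-t-en-v1 (14M parameters, 312 dimensions); the larger ones are jina-embedding-b-en-v1 (110M parameters, 768 dimensions) and jina-embedding-l-en-v1 (330M parameters, 1024 dimensions).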

How can I call this model via the gigarouter API?

Once the model is live, call it through the OpenAI-compatible embeddings endpoint: send a POST request containing the input text and the model name, authenticated with your API key.
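As a rough sketch of what such a request could look like, the snippet below builds an OpenAI-style embeddings call without sending it. Several things here are assumptions rather than documented facts: the base URL is a placeholder (this page does not give the real endpoint), the model name is assumed to match the identifier shown at the top of the page, and the helper names are hypothetical. The request body and response shape follow the OpenAI embeddings format.

```python
import json
import urllib.request

# ASSUMPTION: placeholder base URL; the real gigarouter endpoint is not stated here.
BASE_URL = "https://api.gigarouter.example/v1"
# ASSUMPTION: the API model name matches the identifier shown on this page.
MODEL_ID = "jinaai/jina-embedding-s-en-v1"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style POST request for embeddings."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors, in input order, from an OpenAI-style embeddings response."""
    parsed = json.loads(response_body)
    items = sorted(parsed["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embedding_request(["How is the weather today?"], api_key="YOUR_API_KEY")

# Sending is left commented out because the model is not yet live:
# with urllib.request.urlopen(request) as response:
#     vectors = parse_embeddings(response.read())
```

Keeping request construction separate from sending makes it easy to swap in the real base URL and model name once they are published.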

What license is this model released under?

Apache-2.0, allowing free use, modification, and distribution.

not yet live

We're benchmarking and onboarding Jina Embedding S v1 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
