Nomic Embed Text V1

nomic-ai/nomic-embed-text-v1

published Jan 2024 · updated Apr 2026

Nomic Embed Text V1 is a text embedding model with an 8192-token context length that surpasses OpenAI text-embedding-ada-002 and text-embedding-3-small on both short-context and long-context tasks.

est. price

~$0.008 / 1M tokens · estimated, set at launch

downloads / mo

4.2M

license

apache-2.0

specs

Task	Text Embeddings
Architecture	BERT-based
Context Length	8192 tokens
License	Apache 2.0

about this model

nomic-embed-text-v1 is a text embedding model that encodes English texts into dense vector representations with a context length of 8,192 tokens. It is a fully open and reproducible model, released under the Apache 2.0 license, with public model weights, training code, training data, and data curation scripts.

The model outperforms OpenAI text-embedding-ada-002 and text-embedding-3-small on both short-context and long-context benchmarks. It is trained with a multi-stage pipeline and uses Flash Attention. Matryoshka Representation Learning, which allows flexible embedding sizes, is offered by the successor nomic-embed-text-v1.5 rather than by this version. An aligned vision embedding model (nomic-embed-vision-v1) enables multimodal retrieval: its vision embeddings share a latent space with this model's text embeddings, so the two can be compared directly. The accompanying paper has been accepted to TMLR (Transactions on Machine Learning Research).

Benchmark Results

Name	SeqLen	MTEB	LoCo

best for

·Retrieval-augmented generation (RAG) with long documents
·Semantic clustering and topic discovery
·Text classification feature extraction
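For the retrieval use case above, ranking usually comes down to comparing a query embedding against document embeddings by cosine similarity. The following is a minimal sketch using made-up 3-dimensional vectors in place of real model output; `cosine_similarity` and `rank_documents` are hypothetical helper names, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Made-up vectors standing in for real embeddings (assumption for illustration).
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]
ranking = rank_documents(query, docs)
print(ranking)  # [1, 2, 0]
```

With real embeddings the same ranking step applies unchanged; only the vectors get longer.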

FAQ

What task prefix should I use for RAG queries?

Use "search_query: " for queries and "search_document: " for documents.
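As a rough illustration of the prefixing step, the sketch below prepends a task prefix to each input string before it is sent for embedding. `add_task_prefix` and `TASK_PREFIXES` are hypothetical names introduced here; the clustering and classification entries come from the upstream model card rather than from this page.

```python
# Task prefixes for the nomic-embed-text family. Only the two search prefixes
# are stated on this page; the other two are taken from the upstream model card.
TASK_PREFIXES = {
    "search_query": "search_query: ",
    "search_document": "search_document: ",
    "clustering": "clustering: ",
    "classification": "classification: ",
}


def add_task_prefix(texts, task):
    """Return a new list with the task prefix prepended to every text.

    Hypothetical helper for illustration; not part of any library.
    """
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_PREFIXES)}")
    prefix = TASK_PREFIXES[task]
    return [prefix + text.strip() for text in texts]


queries = add_task_prefix(["What is TSNE?"], "search_query")
documents = add_task_prefix(["TSNE is a dimensionality reduction algorithm."], "search_document")
print(queries[0])    # search_query: What is TSNE?
print(documents[0])  # search_document: TSNE is a dimensionality reduction algorithm.
```

Queries and documents get different prefixes on purpose, so a RAG index should be built with the document prefix and searched with the query prefix.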

What is the output embedding dimension?

Embeddings are 768-dimensional. The model is BERT-based and produces each embedding by mean pooling over its token outputs.
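Mean pooling, mentioned in the answer above, averages the per-token output vectors into a single fixed-size embedding while skipping padding positions. The sketch below shows the idea on toy data in plain Python; `mean_pool` is a hypothetical helper, and a real pipeline would typically do this with tensor operations on the model's output.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1; padding (0) is ignored."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input (assumption for illustration): three token vectors of width 4,
# where the last position is padding and must not affect the result.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 2.0, 2.0, 2.0]
```

The output width equals the width of the token vectors, which is why the embedding dimension follows from the encoder's hidden size.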

Is the model open-source?

Yes, fully reproducible with open weights, open training code, and open data under Apache 2.0.

How can I call this model via the API?

Use the gigarouter OpenAI-compatible endpoint with your API key and include the appropriate task prefix.
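A minimal sketch of what such a call could look like, assuming the endpoint follows the standard OpenAI embeddings request shape (POST to `/embeddings` with `model` and `input`, bearer-token auth). The base URL, the API key, and the model identifier as exposed by the service are placeholders and assumptions, not confirmed values; the request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the provider's real base URL and your key.
BASE_URL = "https://api.example.com/v1"
API_KEY = "YOUR_API_KEY"
# Assumption: the service exposes the model under its upstream identifier.
MODEL_ID = "nomic-ai/nomic-embed-text-v1"


def build_embeddings_request(texts, task_prefix):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = {
        "model": MODEL_ID,
        "input": [f"{task_prefix}: {text}" for text in texts],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_embeddings_request(["What is TSNE?"], "search_query")
body = json.loads(request.data)
print(request.full_url)
print(body["input"][0])

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     vectors = [item["embedding"] for item in json.load(response)["data"]]
```

In the OpenAI response format each entry of `data` carries one `embedding` list, returned in the same order as the inputs.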

not yet live

We're benchmarking and onboarding Nomic Embed Text V1 as a hosted, OpenAI-compatible API.

related embedding models

nomic-embed-text-v1.5

granite-embedding-small-english-r2

2.2M dl/mo