Nomic Embed Text V1 Ablated

nomic-ai/nomic-embed-text-v1-ablated

published Jan 2024 · updated Aug 2024

Nomic Embed Text V1 Ablated is a text embedding model with an 8192-token context length, trained on a modified dataset to study how training data affects model outcomes.

status: coming soon
API providers: 0
downloads / mo: 248

specs

Task: Text Embedding
Context Length: 8192 tokens
License: Apache 2.0

about this model

nomic-embed-text-v1-ablated is an English text encoder with an 8192-token context length, designed for reproducibility research. This checkpoint was trained on a modified dataset so that the effect of data subsets on model outcomes can be analyzed. It is released under the Apache 2.0 license as part of the Nomic Embed project, with full training code and curated data available for replication (see arXiv:2402.01613).

As a variant of the nomic-embed-text-v1 family, this model is suited to studying how training data composition affects embedding quality, not to production embedding extraction. For production use, the project authors recommend the final nomic-embed-text-v1 model.

Key attributes:

  • Context length: 8192 tokens.
  • Open-source: weights, training code, and data are publicly available under Apache 2.0.
  • Purpose: facilitates transparency and reproducibility in long-context text embedding research.

No benchmark scores are reported for this ablated checkpoint; its primary value lies in enabling controlled comparisons of training data effects.
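
One way such a controlled comparison could be set up is sketched below. It assumes you have already obtained embeddings for the same query and the same documents from both this checkpoint and nomic-embed-text-v1. Two separately trained models are not guaranteed to share a vector space, so the sketch compares the retrieval rankings each model produces instead of comparing vectors across models. The function names are hypothetical and are not part of the Nomic Embed codebase.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def top_k_overlap(ranking_a, ranking_b, k):
    """Fraction of the top-k documents that two rankings have in common."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k
```

Running `rank_documents` once per model on the same query and documents, then passing both rankings to `top_k_overlap`, gives a rough measure of how much the change in training data shifts retrieval results. It is an illustration only, not a benchmark.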

FAQ

What is the context length of this model?

8192 tokens.

What license is it released under?

Apache 2.0.

How does it differ from nomic-embed-text-v1?

It was trained on a modified dataset so that researchers can study how training data affects model outcomes.

Can I use this model for production embeddings?

No; the model card recommends using nomic-embed-text-v1 for extracting embeddings in production.

How do I call this model via the API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with an API key. The model is not yet available through the API.
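
As a rough sketch of what such a call might look like, the snippet below builds an OpenAI-style embeddings request and parses an OpenAI-style response. The base URL is a placeholder, since this page does not state the real endpoint, and the helper names are hypothetical. The model is not yet live, so the request is only constructed here and never sent.

```python
import json
import urllib.request

# Placeholder base URL (assumption): substitute the real endpoint once published.
BASE_URL = "https://api.gigarouter.example/v1"
MODEL_ID = "nomic-ai/nomic-embed-text-v1-ablated"


def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build, but do not send, an OpenAI-style POST request to /embeddings."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embedding_response(body):
    """Extract the vectors from an OpenAI-style response body, ordered by index."""
    items = sorted(json.loads(body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

When the endpoint exists, the request could be sent with `urllib.request.urlopen(request)` and the response body passed to `parse_embedding_response`.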

not yet live

We're benchmarking and onboarding Nomic Embed Text V1 Ablated as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
