Lodestone Base 4096 V1

Hum-Works/lodestone-base-4096-v1

published Aug 2023 · updated Oct 2023

Lodestone Base 4096 V1 is a sentence embedding model that maps long sentences and paragraphs to a 768-dimensional dense vector space for clustering and semantic search.

status

coming soon

downloads / mo

235

license

apache-2.0

specs

Task	Text Embedding
Architecture	Mosaic BERT with FlashAttention, ALiBi, GLU
Max Sequence Length	4096 tokens
Output Dimensions	768
Pooling	Mean pooling

about this model

lodestone-base-4096-v1 is an embedding model that maps long sentences and paragraphs to a 768-dimensional dense vector space for tasks such as semantic search, clustering, and similarity comparison.

Architecture and Capabilities

The model is fine-tuned from the mosaic-bert-base-seqlen-2048 backbone and incorporates FlashAttention, Attention with Linear Biases (ALiBi), and Gated Linear Units (GLU). These architectural choices allow the model to accept input sequences of up to 4096 tokens, eight times the 512-token limit of most comparable sentence embedding models, while keeping it small enough for both GPU and CPU inference. ALiBi enables extrapolation beyond the training length: the model was fine-tuned with a maximum sequence length of 2048 tokens but supports inference at 4096 tokens.
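Why ALiBi permits longer inputs at inference can be seen in a small sketch. The bias added to the attention logits depends only on the distance between two positions, so the same rule defines a valid bias at any sequence length. The code below is an illustration, not the model's implementation: it assumes the symmetric (bidirectional) distance penalty commonly used when ALiBi is adapted to encoders, and the head count of 8 is chosen for simplicity rather than taken from the model configuration.

```python
import numpy as np

def alibi_slopes(n_heads: int) -> np.ndarray:
    """Geometric per-head slopes as in the ALiBi paper (power-of-two head counts only)."""
    assert n_heads > 0 and (n_heads & (n_heads - 1)) == 0, "sketch handles powers of two only"
    start = 2.0 ** (-8.0 / n_heads)
    return start ** np.arange(1, n_heads + 1)

def symmetric_alibi_bias(seq_len: int, n_heads: int) -> np.ndarray:
    """Bias of shape (n_heads, seq_len, seq_len) to add to attention logits."""
    pos = np.arange(seq_len)
    distance = np.abs(pos[:, None] - pos[None, :])  # |i - j|
    return -alibi_slopes(n_heads)[:, None, None] * distance[None, :, :]

# Illustrative sizes; the real model works with 2048 (training) and 4096 (inference).
bias_short = symmetric_alibi_bias(seq_len=8, n_heads=8)
bias_long = symmetric_alibi_bias(seq_len=16, n_heads=8)

# The longer bias contains the shorter one unchanged in its top-left corner,
# which is why no position table has to be resized or retrained.
print(np.array_equal(bias_long[:, :8, :8], bias_short))
```

Because there is no learned position embedding with a fixed size, moving from 2048 to 4096 tokens only extends the distance matrix.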

Training Data and Procedure

The model was fine-tuned on nearly 1.5 billion sentence pairs using a contrastive learning objective. The training data is a weighted sample from multiple public datasets, including Reddit comments (726M pairs), S2ORC citation pairs with abstracts (252M pairs), Reddit title-body pairs (127M pairs), Amazon reviews (87M pairs), WikiAnswers, PAQ, Stack Exchange, MS MARCO, and others. Full dataset details and weights are available in the original model repository.
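As a rough illustration of what a contrastive objective over sentence pairs looks like, the sketch below computes an in-batch-negatives cross-entropy loss: each anchor should score highest against its own paired sentence and lower against every other sentence in the batch. This is one common formulation and is not confirmed here as the exact loss used for this model; the function name and the `scale` value of 20 are assumptions.

```python
import numpy as np

def in_batch_contrastive_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities; row i's correct match is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Random stand-ins for encoder outputs (batch of 4 pairs, 768 dimensions).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 768))
positives = anchors + 0.01 * rng.normal(size=(4, 768))

loss_matched = in_batch_contrastive_loss(anchors, positives)
loss_shuffled = in_batch_contrastive_loss(anchors, positives[::-1])  # every pair misaligned
print(loss_matched < loss_shuffled)
```

Correctly aligned pairs give a loss near zero, while misaligned pairs are penalized heavily, which is the signal that pulls paired sentences together in the embedding space.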

Performance and Usage Notes

On standard text embedding evaluation benchmarks, lodestone-base-4096-v1 achieves performance comparable to widely used models such as all-mpnet-base-v2 while supporting a longer input context. The model outputs normalized embeddings via mean pooling over token representations. Once the hosted gigarouter API is live, no local installation or dependency management will be required; the model will be exposed as an OpenAI-compatible endpoint.
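The pooling step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the model's own code: `token_embeddings` and `attention_mask` are stand-ins for an encoder's outputs, and the function name is hypothetical.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average non-padding token vectors, then scale each result to unit L2 norm."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # padding contributes zero
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Random stand-ins: 2 sequences, 4 token positions, 768 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]])  # second sequence has two padding positions

embeddings = mean_pool_and_normalize(tokens, mask)
print(embeddings.shape)
```

Masking before averaging matters for batched inputs: without it, padding positions would dilute the embedding of shorter texts.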

best for

· Semantic search and information retrieval on long documents
· Clustering of text data (e.g., paragraphs)
· Sentence similarity and paraphrase detection
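For the semantic search use case, ranking reduces to a dot product when embeddings are unit-normalized, since the dot product then equals cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors as stand-ins for real 768-dimensional embeddings; the `top_k` helper is hypothetical.

```python
import numpy as np

def top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return (document index, similarity) pairs, best match first."""
    scores = doc_vecs @ query_vec          # cosine similarity for unit-norm inputs
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

def unit(v: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to unit L2 norm."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Toy stand-ins for document and query embeddings.
docs = unit(np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.7, 0.7, 0.0]]))
query = unit(np.array([0.9, 0.1, 0.0]))

results = top_k(query, docs, k=2)
print([i for i, _ in results])  # → [0, 2]
```

The same similarity matrix (`docs @ docs.T`) can feed a clustering or paraphrase-detection step.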

FAQ

What is the maximum input length for Lodestone Base 4096 V1?

The model supports input sequences up to 4096 tokens.

How does this model compare to typical sentence transformers like all-MiniLM-L6-v2?

Lodestone Base 4096 V1 accepts sequences of up to 4096 tokens, eight times the 512-token limit typical of such models, while achieving comparable performance on standard text embedding benchmarks.

What format does the embedding output have?

The output is a single 768-dimensional dense vector that is L2-normalized.

How can I call this model via the gigarouter API?

Once the model is live, use the OpenAI-compatible endpoint with your API key and set the model name to "lodestone-base-4096-v1". Send a POST request containing the input text; the response contains the embeddings.
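A request of this kind might be built as follows. This is a sketch under several assumptions: the base URL is a placeholder (the real gigarouter address is not given on this page), and the `/embeddings` path, bearer-token header, and `data[i].embedding` response shape follow the general OpenAI embeddings convention rather than anything confirmed for this service. The helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder only: substitute the provider's actual base URL.
BASE_URL = "https://api.example.com/v1"

def build_embedding_request(texts: list[str], api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = {"model": "lodestone-base-4096-v1", "input": texts}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def parse_embeddings(response_body: bytes) -> list[list[float]]:
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    body = json.loads(response_body)
    return [item["embedding"] for item in sorted(body["data"], key=lambda d: d["index"])]

request = build_embedding_request(["a long paragraph to embed"], api_key="YOUR_API_KEY")

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as resp:
#     vectors = parse_embeddings(resp.read())
```

Each returned vector would be expected to have 768 dimensions, matching the model's output size.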

What training data was used for this model?

The model was fine-tuned on nearly 1.5 billion sentence pairs from datasets including Reddit comments, S2ORC citations, Amazon reviews, and many others.

not yet live

We're benchmarking and onboarding Lodestone Base 4096 V1 as a hosted, OpenAI-compatible API.
