Lodestone Base 4096 V1
Hum-Works/lodestone-base-4096-v1
published Aug 2023 · updated Oct 2023
Lodestone Base 4096 V1 is a sentence embedding model that maps long sentences and paragraphs to a 768-dimensional dense vector space for clustering and semantic search.
specs
| Task | Text Embedding |
| Architecture | Mosaic BERT with FlashAttention, ALiBi, GLU |
| Max Sequence Length | 4096 tokens |
| Output Dimensions | 768 |
| Pooling | Mean pooling |
about this model
lodestone-base-4096-v1 is an embedding model that maps long sentences and paragraphs to a 768-dimensional dense vector space for tasks such as semantic search, clustering, and similarity comparison.
Architecture and Capabilities
The model is based on a fine-tuned mosaic-bert-base-seqlen-2048 backbone and incorporates FlashAttention, Attention with Linear Biases (ALiBi), and Gated Linear Units (GLU). These architectural features allow the model to accept input sequences of up to 4096 tokens, eight times the 512-token limit typical of comparable sentence embedding models, while remaining small enough for both GPU and CPU inference. ALiBi enables extrapolation beyond the training length: the model was fine-tuned with a maximum sequence length of 2048 tokens but supports inference at 4096 tokens.
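To illustrate why ALiBi permits longer inputs at inference time, here is a minimal sketch of the bias it adds to attention scores. The slope formula is the one from the ALiBi paper for a power-of-two number of heads. The symmetric `abs(i - j)` distance is an assumption suited to a bidirectional encoder, and the function names are illustrative, not taken from the model's code.

```python
def alibi_slopes(num_heads):
    """Per-head slopes for a power-of-two head count: 2^(-8/n), 2^(-16/n), ..."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias added to one head's attention scores: a linear penalty on distance.

    Assumption: symmetric distance, as is natural for a bidirectional encoder.
    """
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


slopes = alibi_slopes(8)
bias_short = alibi_bias(4, slopes[0])
bias_long = alibi_bias(8, slopes[0])  # same rule, longer sequence
```

Because the bias is a function of token distance only, it can be computed for any sequence length, with no learned position table that would cap the input size.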
Training Data and Procedure
The model was fine-tuned on nearly 1.5 billion sentence pairs using a contrastive learning objective. The training data is a weighted sample from multiple public datasets, including Reddit comments (726M pairs), S2ORC citation pairs with abstracts (252M pairs), Reddit title-body pairs (127M pairs), Amazon reviews (87M pairs), WikiAnswers, PAQ, Stack Exchange, MS MARCO, and others. Full dataset details and weights are available in the original model repository.
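The page says only that a contrastive objective over sentence pairs was used. One common form, shown here purely as an illustration, treats the other positives in a batch as negatives and applies cross-entropy over scaled similarities. The loss form, the `scale` value, and the function names are all assumptions rather than the documented training recipe.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where the correct match for anchors[i] is positives[i].

    Every other positive in the batch serves as a negative for that anchor.
    Inputs are assumed to be unit-length vectors, so dot() is cosine similarity.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * dot(anchor, p) for p in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = in_batch_contrastive_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = in_batch_contrastive_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each anchor is closest to its own pair and large when it is closer to another example's pair, which is what pulls paired texts together in the embedding space.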
Performance and Usage Notes
On standard text embedding evaluation benchmarks, lodestone-base-4096-v1 achieves performance comparable to widely used models such as all-mpnet-base-v2 while supporting a longer input context. The model outputs normalized embeddings via mean pooling over token representations. When used through the gigarouter API, no local installation or dependency management is required; the model is available as an OpenAI-compatible endpoint.
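The pooling and normalization step described above can be sketched in plain Python. This is a minimal illustration that assumes per-token vectors and a 0/1 attention mask are already available. The function names are hypothetical, and the hosted endpoint performs this step internally.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the vectors of real tokens; positions with mask 0 (padding) are skipped."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


# Toy example with 2-dimensional "token" vectors; the last position is padding.
tokens = [[3.0, 0.0], [3.0, 8.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
```

For the real model the same operations would run over 768-dimensional token representations, giving one unit-length vector per input text.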
best for
- Semantic search and information retrieval on long documents
- Clustering of text data (e.g., paragraphs)
- Sentence similarity and paraphrase detection
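For the search and similarity use cases listed above, normalized embeddings let a plain dot product stand in for cosine similarity. The sketch below ranks documents against a query. It assumes the vectors are already unit-length, the function name is illustrative, and the toy 2-dimensional vectors stand in for real 768-dimensional embeddings.

```python
def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar.

    Assumes all vectors are L2-normalized, so the dot product equals cosine similarity.
    """
    scores = [
        (i, sum(q * d for q, d in zip(query_vec, doc)))
        for i, doc in enumerate(doc_vecs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.0], [0.6, 0.8]]
ranking = rank_by_similarity(query, documents)
```

A linear scan like this is fine for small collections. Larger corpora would typically use an approximate nearest-neighbor index instead.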
FAQ
The model supports input sequences up to 4096 tokens.
Lodestone Base 4096 V1 supports 8x longer sequences than typical sentence embedding models (4096 vs 512 tokens) while achieving comparable performance on standard benchmarks.
The output is a single 768-dimensional dense vector that is L2-normalized.
Use the OpenAI-compatible endpoint with your API key, specifying the model name as "lodestone-base-4096-v1". Send a POST request with input text and receive embeddings.
The model was fine-tuned on nearly 1.5 billion sentence pairs from datasets including Reddit comments, S2ORC citations, Amazon reviews, and many others.
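The request described in the FAQ above might look roughly like the following. The JSON body and `Authorization` header follow the OpenAI embeddings convention. The base URL is a hypothetical placeholder because this page does not state the real one, and the helper names are illustrative.

```python
import json
import urllib.request

# Hypothetical placeholder: the actual endpoint URL is not given on this page.
BASE_URL = "https://api.example.com/v1"


def build_embedding_request(texts, api_key, model="lodestone-base-4096-v1"):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by input index."""
    body = json.loads(response_body)
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and handing the response bytes to `parse_embeddings`. Each returned vector should have 768 entries.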
We're benchmarking and onboarding Lodestone Base 4096 V1 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.