SLX-v0.1
brahmairesearch/slx-v0.1
published Aug 2024 · updated Aug 2024
SLX-v0.1 is an embedding model that maps sentences and paragraphs into a 384-dimensional dense vector space for semantic search, clustering, and similarity tasks.
specs
| Task | Embedding |
| Architecture | MiniLM-L6-H384-uncased (6-layer MiniLM, 384 hidden) |
| Output Dimension | 384 |
| Max Sequence Length | 256 tokens (longer input is truncated) |
| License | Not specified |
about this model
SLX-v0.1 is an embedding model that maps sentences and short paragraphs into a 384-dimensional dense vector space for semantic search, clustering, and similarity tasks. It is built on the nreimers/MiniLM-L6-H384-uncased backbone and fine-tuned with a contrastive learning objective followed by transfer learning from dunzhang/stella_en_400M_v5 using an internally curated English dataset. Input text longer than 256 word pieces is truncated by default.
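Because anything past the 256 word-piece limit is dropped, long documents may need to be split before embedding. The following is a minimal sketch of one way to do that, not the model's own preprocessing. It counts whitespace-separated words as a rough stand-in for word pieces, and the `chunk_words` helper and its 128-word default are assumptions chosen to leave headroom, since a single word can expand into several word pieces. For an exact count, the model's own tokenizer would be needed.

```python
def chunk_words(text, max_words=128):
    """Split text into consecutive chunks of at most max_words words.

    Whitespace-separated words are only a proxy for word pieces, so
    max_words is a conservative budget (an assumption), not a guarantee
    that every chunk fits within the 256 word-piece limit.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

Each chunk can then be embedded separately, which keeps the tail of a long document from being silently discarded.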
The model produces normalized sentence embeddings optimized for cosine similarity comparison. It is hosted by GigaRouter as a managed, OpenAI-compatible API, requiring no local installation or dependency management.
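To illustrate why normalized embeddings suit cosine similarity: for unit-length vectors, cosine similarity reduces to a plain dot product. The sketch below uses short toy vectors in place of real 384-dimensional embeddings, and the helper names are illustrative only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity for arbitrary (non-zero) vectors."""
    return dot(l2_normalize(a), l2_normalize(b))


# Toy vectors standing in for embeddings.
u = l2_normalize([3.0, 4.0])
v = l2_normalize([4.0, 3.0])

# For already-normalized vectors the dot product is the cosine similarity,
# so no extra normalization step is needed at comparison time.
similarity = dot(u, v)
```

In practice this means a similarity comparison costs one dot product per pair, assuming the returned embeddings are unit length as described above.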
best for
- Semantic search and information retrieval over sentences/paragraphs
- Clustering documents by semantic similarity
- Sentence similarity comparison
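As a rough illustration of the semantic search use case, the sketch below ranks a small corpus against a query by dot product. It assumes all vectors are already unit length, and it uses 3-dimensional toy vectors rather than real 384-dimensional embeddings. The `top_k` helper is hypothetical and not part of any API.

```python
def top_k(query_vec, corpus_vecs, k=3):
    """Return (index, score) pairs for the k corpus vectors most similar to the query.

    Assumes every vector is already L2-normalized, so the dot product
    is the cosine similarity.
    """
    scored = [
        (idx, sum(q * d for q, d in zip(query_vec, doc)))
        for idx, doc in enumerate(corpus_vecs)
    ]
    # Highest similarity first; ties keep their original corpus order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy unit-length vectors standing in for document embeddings.
corpus = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
]
query = [1.0, 0.0, 0.0]

results = top_k(query, corpus, k=2)
```

A linear scan like this is adequate for small collections; larger corpora would typically use a vector index instead.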
FAQ
Input text longer than 256 word pieces (tokens) will be truncated by default.
It produces a 384-dimensional dense vector for each input sentence or paragraph.
It was fine-tuned using contrastive learning with a cross-entropy loss, followed by transfer learning from the stella_en_400M_v5 model, on an internally curated English dataset.
Use the GigaRouter OpenAI-compatible endpoint with your API key: send a request with the model name 'brahmairesearch/slx-v0.1' and the input text.
It is based on the pre-trained model nreimers/MiniLM-L6-H384-uncased.
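The request described in the FAQ can be sketched as follows, using only the Python standard library. The request and response shapes follow the usual OpenAI embeddings convention (`POST {base_url}/embeddings` with `model` and `input` fields). The base URL is not stated on this page, so it is left as a parameter, and the helper names are illustrative. Treat this as a sketch under those assumptions rather than the provider's documented client.

```python
import json
import urllib.request

MODEL_ID = "brahmairesearch/slx-v0.1"


def build_embedding_request(base_url, api_key, texts):
    """Build (but do not send) an OpenAI-style embeddings request.

    base_url is an assumption supplied by the caller, e.g. the
    provider's documented API root ending in /v1.
    """
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract embedding vectors, ordered by input index, from an OpenAI-style response."""
    data = json.loads(response_body)["data"]
    return [item["embedding"] for item in sorted(data, key=lambda item: item["index"])]
```

To send the request, pass the result to `urllib.request.urlopen` and feed the response body to `parse_embeddings`. Each returned vector should have 384 entries if the endpoint behaves as described above.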
We're benchmarking and onboarding SLX-v0.1 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.