
Udever BLOOM 560M

izhx/udever-bloom-560m

published Oct 2023 · updated Nov 2023

Udever BLOOM 560M is a universal embedding model finetuned from BLOOM-560M via BitFit on MS MARCO, SNLI, and MultiNLI data for cross-task and cross-language embedding.

status: coming soon
API providers: 0
downloads / mo: 334
license: bigscience-bloom-rail-1.0

specs

Task: Embedding (universal text and code embedding)
Architecture: Decoder-only Transformer (BLOOM-560M)
Parameters: 560 million
License: bigscience-bloom-rail-1.0

about this model

udever-bloom-560m is an embedding model fine-tuned from the BLOOM-560M decoder-only language model via BitFit on MS MARCO Passage Ranking, SNLI, and MultiNLI data. It is designed as a universal embedder across tasks, natural languages, and programming languages. The model was developed by Alibaba Group and is described in the paper “Language Models are Universal Embedders” (accepted at the XLLM Workshop, ACL 2025). It is trained with a contrastive loss with hard negatives and encodes queries and documents separately, marking each with its own pair of special tokens: [BOQ] and [EOQ] for queries, [BOD] and [EOD] for documents.
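As a rough illustration of what a contrastive loss with hard negatives computes, here is a minimal pure-Python sketch of an InfoNCE-style loss for a single query. The function names, the use of cosine similarity, and the temperature of 0.05 are assumptions made for this example; this page does not state the exact formulation used to train the model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Cross-entropy over similarities, with the positive at index 0.

    `negatives` is a list of hard-negative embeddings. The temperature
    value is an assumption for illustration, not a documented setting.
    """
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]
```

The loss is small when the query is closer to its positive than to every negative, and grows as a negative becomes more similar to the query than the positive is.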

Training

Training used a batch size of 1024 for 3 epochs with the AdamW optimizer, a learning rate of 1e-4, a constant schedule with a 0.25-epoch warmup, and TF32 precision on NVIDIA A100 80GB GPUs.
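The schedule described above can be sketched as a function from step index to learning rate. This is a minimal sketch under two assumptions: the warmup is linear (the page only says "warmup"), and `steps_per_epoch` is a hypothetical parameter that depends on the dataset size, which is not given here.

```python
def learning_rate(step, steps_per_epoch, base_lr=1e-4, warmup_epochs=0.25):
    """Constant learning rate after a warmup covering a fraction of an epoch.

    Assumes a linear ramp during warmup; `steps_per_epoch` is a
    hypothetical value supplied by the caller.
    """
    warmup_steps = max(1, int(warmup_epochs * steps_per_epoch))
    if step < warmup_steps:
        # Ramp up linearly and reach base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    return base_lr
```

With 1,000 steps per epoch, for example, the rate would climb over the first 250 steps and then stay at 1e-4 for the remainder of the 3 epochs.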

Benchmarks

Massive Text Embedding Benchmark (MTEB) – 56 datasets:

Avg    Class.  Clust.  PairClass.  Rerank.  Retr.  STS    Summ.
55.80  68.04   36.89   81.05       52.60    41.19  79.93  32.06

CodeSearchNet – semantic code search (6 languages):

Go     Ruby   Python  Java   JS     PHP    Avg.
75.38  66.67  96.23   78.99  69.39  73.69  76.73

Multi-CPR (Chinese multi-domain retrieval) – E-commerce MRR@10 0.156, Entertainment video MRR@10 0.149, Medical MRR@10 0.245. For full per-dataset MTEB breakdowns, see the Hugging Face model page.
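For readers unfamiliar with the MRR@10 metric reported above: it is the mean over queries of the reciprocal rank of the first relevant result, counting only the top 10. A minimal sketch follows; the function name and the input layout (one ranked list and one set of relevant IDs per query) are choices made for this example.

```python
def mrr_at_k(ranked_ids_per_query, relevant_ids_per_query, k=10):
    """Mean reciprocal rank of the first relevant hit within the top k.

    A query with no relevant document in its top k contributes 0.
    """
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_ids_per_query)
```

On this scale, a score of 0.245 is what you would see if, averaged over all queries, the first relevant document appeared around rank 4.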

[Figure: Udever training illustration showing the model architecture and contrastive learning setup]

FAQ

What is Udever BLOOM 560M best used for?

It is a universal embedding model for text and code, supporting tasks like retrieval, classification, clustering, and reranking across multiple natural and programming languages.

How does this model compare to larger Udever variants?

Udever BLOOM 560M has 560M parameters and an average MTEB score of 55.80; the larger 7B variant scores 60.63. The 560M model gives up about 4.8 points of average MTEB score in exchange for a much smaller footprint and faster inference.

What is the license for this model?

The model is listed under the bigscience-bloom-rail-1.0 license.

What input format does the model expect?

Prefix each query with [BOQ] and suffix it with [EOQ]; prefix each document with [BOD] and suffix it with [EOD]. The model uses the tokenizer of its decoder-only BLOOM base model.
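The wrapping described above can be sketched with two small helpers. The helper names are hypothetical, and joining the tokens to the text with no whitespace in between is an assumption; check the model card for the exact convention before relying on it.

```python
# Special tokens as named on this page.
BOQ, EOQ, BOD, EOD = "[BOQ]", "[EOQ]", "[BOD]", "[EOD]"

def format_query(text):
    """Wrap a query string in the query tokens (no added whitespace: an assumption)."""
    return f"{BOQ}{text}{EOQ}"

def format_document(text):
    """Wrap a document string in the document tokens (no added whitespace: an assumption)."""
    return f"{BOD}{text}{EOD}"
```

Keeping the two helpers separate makes it harder to embed a document with the query tokens by mistake, which would put it in the wrong role at retrieval time.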

How can I call this model via the gigarouter API?

The model is not yet live. Once it is, use the gigarouter OpenAI-compatible endpoint with your API key and send inputs wrapped in the required special tokens.
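As a sketch of what such a call might look like, the snippet below builds (but does not send) a request in the shape of the OpenAI embeddings API: a POST to an `/embeddings` path with a `model` and `input` field and a bearer token. The base URL is a hypothetical placeholder, and using the Hugging Face ID as the model name is an assumption; neither is confirmed by this page, and the model is not yet hosted.

```python
import json
import urllib.request

# Hypothetical placeholder; replace with the real base URL once published.
BASE_URL = "https://api.gigarouter.example/v1"
# Assumption: the hosted model name matches the Hugging Face ID.
MODEL_ID = "izhx/udever-bloom-560m"

def build_embedding_request(texts, api_key, base_url=BASE_URL):
    """Build an OpenAI-style embeddings request without sending it."""
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Inputs already carry the special tokens required by the model.
request = build_embedding_request(
    ["[BOQ]how do I sort a list in Go?[EOQ]"], api_key="YOUR_API_KEY"
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here because the endpoint does not exist yet.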

not yet live

We're benchmarking and onboarding Udever BLOOM 560M as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
