Udever BLOOM 560M

izhx/udever-bloom-560m

published Oct 2023 · updated Nov 2023

Udever BLOOM 560M is a universal embedding model fine-tuned from BLOOM-560M via BitFit on MS MARCO, SNLI, and MultiNLI data for cross-task and cross-language embedding.

status

coming soon

API providers

downloads / mo

334

license

bigscience-bloom-rail-1.0

specs

Task	Embedding (universal text and code embedding)
Architecture	Decoder-only Transformer (BLOOM-560M)
Parameters	560 million
License	bigscience-bloom-rail-1.0

about this model

udever-bloom-560m is an embedding model fine-tuned from the BLOOM-560M decoder-only language model via BitFit on MS MARCO Passage Ranking, SNLI, and MultiNLI data. It is designed as a universal embedder across tasks, natural languages, and programming languages. It was developed by Alibaba Group and is described in the paper “Language Models are Universal Embedders” (accepted at the XLLM Workshop, ACL 2025). The model is trained with a contrastive loss with hard negatives and encodes queries and documents separately, marking each with special tokens ([BOQ], [EOQ] for queries; [BOD], [EOD] for documents).
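As a rough illustration of what a contrastive loss with hard negatives computes, the sketch below scores one query against its positive document and a list of negatives, then takes the negative log-softmax of the positive. It is an InfoNCE-style sketch, not the authors' training code: the cosine scoring, the temperature value of 0.05, and the function names are all assumptions made for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def contrastive_loss(query, positive, negatives, temperature=0.05):
    """InfoNCE-style loss for a single query.

    temperature=0.05 is an illustrative assumption, not a value from the source.
    """
    q = normalize(query)
    candidates = [normalize(positive)] + [normalize(n) for n in negatives]
    # Cosine similarity scaled by temperature; index 0 is the positive.
    logits = [dot(q, c) / temperature for c in candidates]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]

# A negative that points almost the same way as the query (a "hard" negative)
# yields a larger loss than an unrelated one.
easy = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.9, 0.1]])
```

Hard negatives matter because they keep the loss, and therefore the gradient, from vanishing once the model separates easy pairs.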

Training

Training used a batch size of 1024 for 3 epochs with the AdamW optimizer, a learning rate of 1e-4 on a constant schedule after a 0.25-epoch warmup, and TF32 precision on NVIDIA A100 80GB GPUs.
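The schedule described above (warmup for a quarter of an epoch, then a constant rate) can be sketched as a small function. The linear shape of the warmup and the `steps_per_epoch` value used in the example are assumptions; the source only gives the peak rate of 1e-4 and the warmup length.

```python
def learning_rate(step, steps_per_epoch, peak_lr=1e-4, warmup_epochs=0.25):
    """Warm up to peak_lr over warmup_epochs, then hold it constant.

    Linear warmup is assumed; the source does not state the warmup shape.
    """
    warmup_steps = int(steps_per_epoch * warmup_epochs)
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr

# Hypothetical run with 400 optimizer steps per epoch: warmup covers 100 steps.
schedule = [learning_rate(s, steps_per_epoch=400) for s in range(3 * 400)]
```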

Benchmarks

Massive Text Embedding Benchmark (MTEB) – 56 datasets:

Avg	Class.	Clust.	PairClass.	Rerank.	Retr.	STS	Summ.
55.80	68.04	36.89	81.05	52.60	41.19	79.93	32.06

CodeSearchNet – semantic code search (6 languages):

Go	Ruby	Python	Java	JS	PHP	Avg.
75.38	66.67	96.23	78.99	69.39	73.69	76.73
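The reported CodeSearchNet average appears to be the unweighted mean of the six per-language scores. A quick check, using only the numbers in the table above:

```python
# Per-language CodeSearchNet scores copied from the table above.
scores = {
    "Go": 75.38,
    "Ruby": 66.67,
    "Python": 96.23,
    "Java": 78.99,
    "JS": 69.39,
    "PHP": 73.69,
}

# Unweighted mean across languages; matches the reported 76.73 up to rounding.
average = sum(scores.values()) / len(scores)
print(f"mean over {len(scores)} languages: {average:.3f}")
```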

Multi-CPR (Chinese multi-domain retrieval) – E-commerce MRR@10 0.156, Entertainment video MRR@10 0.149, Medical MRR@10 0.245. For full per-dataset MTEB breakdowns, see the Hugging Face model page.
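For readers unfamiliar with the metric, MRR@10 averages the reciprocal rank of the first relevant passage within the top 10 results, counting a query as 0 when nothing relevant appears there. A minimal sketch follows; the function name and the toy data are made up for illustration and are not taken from the benchmark.

```python
def mrr_at_k(ranked_ids_per_query, relevant_ids_per_query, k=10):
    """Mean reciprocal rank of the first relevant result within the top k."""
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_ids_per_query)

# Toy example: the first query finds its relevant passage at rank 2,
# the second query finds nothing relevant.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevant = [{"b"}, {"q"}]
score = mrr_at_k(rankings, relevant)
```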

[Figure: Udever training illustration showing the model architecture and contrastive learning setup.]

best for

·Cross-lingual text embedding and retrieval
·Code search and retrieval across programming languages
·Universal embedding for classification, clustering, and reranking tasks
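Retrieval and reranking with an embedding model usually reduce to comparing vectors. The sketch below ranks documents by cosine similarity to a query vector. Cosine similarity is a common choice but an assumption here, since this page does not state which similarity function the model expects, and the toy vectors stand in for real embeddings.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )

# Toy 2-dimensional vectors in place of real model embeddings.
order = rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]])
```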

FAQ

What is Udever BLOOM 560M best used for?

It is a universal embedding model for text and code, supporting tasks like retrieval, classification, clustering, and reranking across multiple natural and programming languages.

How does this model compare to larger Udever variants?

Udever BLOOM 560M has 560M parameters and an average MTEB score of 55.80; larger variants such as the 7B model score 60.63. The 560M model gives up about 4.8 points of average MTEB score in exchange for a much smaller memory and compute footprint.

What is the license for this model?

The model is listed under the bigscience-bloom-rail-1.0 license.

What input format does the model expect?

Queries should be prefixed with [BOQ] and suffixed with [EOQ]; documents should be prefixed with [BOD] and suffixed with [EOD]. The model uses the BLOOM tokenizer.
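A small helper can keep the query and document wrapping consistent. This is a sketch of the format described above, assuming the tokens are concatenated directly to the text with no separating whitespace; the function names are hypothetical, and the model card should be checked for the exact preprocessing.

```python
BOQ, EOQ = "[BOQ]", "[EOQ]"
BOD, EOD = "[BOD]", "[EOD]"

def format_query(text):
    """Wrap a search query in the query markers."""
    return f"{BOQ}{text}{EOQ}"

def format_document(text):
    """Wrap a passage or code snippet in the document markers."""
    return f"{BOD}{text}{EOD}"

query = format_query("how to reverse a list in python")
document = format_document("def reverse(xs): return xs[::-1]")
```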

How can I call this model via the gigarouter API?

The hosted API is not yet live. Once it is, send requests to the gigarouter OpenAI-compatible endpoint with your API key, wrapping each input in the required special tokens.
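As a sketch of what such a request might look like, the code below builds (but does not send) a POST in the shape used by OpenAI-compatible embeddings endpoints: a JSON body with `model` and `input`, plus a bearer token. The base URL is a placeholder, and the model identifier is assumed to match the repository name; neither is confirmed by this page.

```python
import json
import urllib.request

def build_embedding_request(
    texts,
    api_key,
    base_url="https://api.example.invalid/v1",  # placeholder, not a real endpoint
    model="izhx/udever-bloom-560m",             # assumed model identifier
):
    """Build an OpenAI-style embeddings request without sending it."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Inputs carry the special tokens described in the previous answer.
request = build_embedding_request(["[BOQ]what is BitFit?[EOQ]"], api_key="YOUR_API_KEY")
```

Passing the finished request to `urllib.request.urlopen` would send it; that step is left out because the endpoint is not yet available.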

not yet live

We're benchmarking and onboarding Udever BLOOM 560M as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related embedding models

nomic-embed-text-v1.5