STSB RoBERTa Base

cross-encoder/stsb-roberta-base

published Mar 2022 · updated Apr 2025

STSB RoBERTa Base is a cross-encoder rerank model that scores the semantic similarity between sentence pairs, trained on the STS benchmark dataset.

est. price: ~$0.008 / 1k docs (estimated, set at launch)
API providers: 0
downloads / mo: 182.5K
license: apache-2.0

specs

Task: Text Ranking
Architecture: RoBERTa Base
Parameters: 124.6M
License: Apache-2.0
Language: English

about this model

cross-encoder/stsb-roberta-base is a cross-encoder reranking model that computes a semantic similarity score between two sentences, outputting a value between 0 and 1. It is built on the FacebookAI/roberta-base architecture and was trained on the STS benchmark dataset (sentence-transformers/stsb). The model is designed to evaluate the degree of semantic equivalence between a query and a candidate document, making it well suited for reranking tasks in search and retrieval pipelines.
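
The reranking use described above can be sketched locally as follows. This is a minimal illustration, not gigarouter's implementation: it assumes the open-source sentence-transformers package and its `CrossEncoder.predict` method, and the helper names `load_cross_encoder_scorer` and `rerank` are hypothetical. The scorer is passed in as a plain function so the sorting logic can be tried without downloading the model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[List[Pair]], Sequence[float]]


def load_cross_encoder_scorer(
    model_name: str = "cross-encoder/stsb-roberta-base",
) -> Scorer:
    """Return a scorer backed by sentence-transformers (imported lazily).

    Assumption: `pip install sentence-transformers` has been run and the
    model weights can be downloaded from Hugging Face.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    # predict() takes a list of (text_a, text_b) pairs and returns one score per pair.
    return lambda pairs: [float(s) for s in model.predict(pairs)]


def rerank(
    query: str, candidates: List[str], score_pairs: Scorer
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and sort candidates by descending score."""
    if not candidates:
        return []
    pairs = [(query, candidate) for candidate in candidates]
    scores = [float(s) for s in score_pairs(pairs)]
    return sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
```

With the real model, a call would look like `rerank("how to reset a password", docs, load_cross_encoder_scorer())`. Because a cross-encoder scores each pair in a separate forward pass, this pattern is usually applied to a short candidate list returned by a faster first-stage retriever.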

Key Strengths

  • Trained for semantic similarity: Optimized directly on the STS benchmark, whose labels are human-annotated similarity ratings, so the model's scores are fitted to human judgments of sentence similarity.
  • Cross-encoder architecture: Processes both texts jointly, allowing deep interaction between the two inputs for more accurate relevance assessment than bi-encoder alternatives, at the cost of one forward pass per pair.
  • Proven adoption: Over 5.3 million downloads on Hugging Face (as of April 2025), with weights available for PyTorch, JAX, ONNX, and OpenVINO, plus Safetensors files.

Model Details

Parameters: 124,646,915
Model file size: ~498.6 MB
Language: English
License: Apache-2.0
Created: 2022-03-02
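
The parameter count and file size listed here are consistent with each other if the weights are stored as 32-bit floats. That storage format is an assumption, not something this page states; the arithmetic is:

```python
PARAMS = 124_646_915   # parameter count from the model details
BYTES_PER_PARAM = 4    # assumption: float32 weights, 4 bytes each

size_bytes = PARAMS * BYTES_PER_PARAM
size_mb = size_bytes / 1_000_000  # decimal megabytes

print(f"{size_bytes:,} bytes ≈ {size_mb:.1f} MB")
# prints: 498,587,660 bytes ≈ 498.6 MB
```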

Usage via API

gigarouter plans to host cross-encoder/stsb-roberta-base as a managed, OpenAI-compatible API endpoint; it is not yet live. Once available, you submit pairs of texts and receive a similarity score for each pair, with no local installation or model loading required. The model is also available in a quantized version for lower latency.

FAQ

What is STSB RoBERTa Base best used for?

It is best used for semantic textual similarity tasks such as reranking search results or identifying duplicate sentences based on a similarity score from 0 to 1.
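
Duplicate detection of the kind mentioned above could be sketched as below. The function name `find_duplicates` and the 0.8 cutoff are illustrative assumptions, not values recommended by the model authors; a suitable threshold would need to be tuned on your own data. The scorer is any function that maps a list of sentence pairs to one score per pair, for example a wrapper around the model.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[List[Pair]], Sequence[float]]


def find_duplicates(
    sentences: List[str],
    score_pairs: Scorer,
    threshold: float = 0.8,  # hypothetical cutoff, tune on your own data
) -> List[Tuple[str, str, float]]:
    """Return every unordered sentence pair whose score meets the threshold."""
    pairs = list(combinations(sentences, 2))
    if not pairs:
        return []
    scores = score_pairs(pairs)
    return [
        (a, b, float(score))
        for (a, b), score in zip(pairs, scores)
        if float(score) >= threshold
    ]
```

Note that comparing all pairs grows quadratically with the number of sentences, so for large collections a cheaper filter would normally narrow the candidates first.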

How many parameters does the model have?

It has approximately 124.6 million parameters.

What license is this model released under?

It is released under the Apache-2.0 license.

What input format does the model expect?

The model expects sentence pairs as input, each pair as two strings. It outputs a similarity score between 0 and 1.

How can I call this model via the gigarouter API?

The hosted endpoint is not yet live. Once it launches, use the gigarouter OpenAI-compatible endpoint with your API key, sending the sentence pairs in the request body.

not yet live

We're benchmarking and onboarding STSB RoBERTa Base as a hosted, OpenAI-compatible API.