gigarouter

YOLOS Small

hustvl/yolos-small

published Apr 2022 · updated May 2024

YOLOS Small is a detection model that uses a Vision Transformer with DETR loss to predict bounding boxes and class labels for objects in images.

est. price: ~$0.047 / 1k images (estimated, set at launch)
API providers: 0
downloads / mo: 713.6K
license: apache-2.0

specs

Task: Object Detection
Architecture: Vision Transformer (ViT)
Parameters: 30.7M
AP on COCO val: 36.1
License: Apache-2.0

about this model

YOLOS-small is an object detection model built on a plain Vision Transformer (ViT). It is trained with the DETR bipartite matching loss and fine-tuned on COCO 2017.

Architecture and training

The model uses a vanilla ViT architecture with minimal spatial priors. It is pre-trained on ImageNet-1k for 200 epochs and then fine-tuned on COCO 2017 object detection (118k training images) for 150 epochs. Detection works by appending 100 learnable detection tokens to the input sequence of image patches; the model predicts one class and one bounding box per detection token. During training, the Hungarian algorithm matches predictions one-to-one with ground-truth objects, and the loss combines cross-entropy for classes with L1 and generalized IoU (GIoU) terms for boxes.
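The matching and loss described above can be sketched in plain Python. This is a minimal illustration, not the YOLOS training code: it swaps the Hungarian algorithm for a brute-force search over assignments (same minimum cost, practical only for a handful of boxes), computes the L1 term on corner-format boxes for simplicity, and uses placeholder weights `w_class`, `w_l1`, and `w_giou`. All function names and sample values are hypothetical.

```python
import math
from itertools import permutations

def box_area(box):
    """Area of an (x0, y0, x1, y1) box; empty boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)

def giou(a, b):
    """Generalized IoU of two non-degenerate (x0, y0, x1, y1) boxes."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box that encloses both a and b.
    hull = box_area((min(a[0], b[0]), min(a[1], b[1]),
                     max(a[2], b[2]), max(a[3], b[3])))
    return inter / union - (hull - union) / hull

def l1(a, b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(p - q) for p, q in zip(a, b))

def match_cost(pred, gt, w_class=1.0, w_l1=5.0, w_giou=2.0):
    """Pairwise matching cost; the weights are illustrative placeholders."""
    return (-w_class * pred["probs"][gt["label"]]
            + w_l1 * l1(pred["box"], gt["box"])
            - w_giou * giou(pred["box"], gt["box"]))

def best_assignment(preds, gts):
    """Brute-force stand-in for the Hungarian algorithm.

    Returns (idx, cost) where idx[j] is the prediction matched to gts[j].
    Assumes len(preds) >= len(gts).
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(match_cost(preds[i], gts[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

def pair_loss(pred, gt, w_l1=5.0, w_giou=2.0):
    """Loss for one matched pair: cross-entropy + L1 + (1 - GIoU)."""
    ce = -math.log(pred["probs"][gt["label"]])
    return (ce + w_l1 * l1(pred["box"], gt["box"])
            + w_giou * (1.0 - giou(pred["box"], gt["box"])))

# Three predictions (stand-ins for detection tokens), two ground-truth objects.
preds = [
    {"box": (0.1, 0.1, 0.4, 0.4), "probs": [0.1, 0.9]},
    {"box": (0.5, 0.5, 0.9, 0.9), "probs": [0.8, 0.2]},
    {"box": (0.0, 0.0, 1.0, 1.0), "probs": [0.5, 0.5]},
]
gts = [
    {"box": (0.5, 0.5, 0.9, 0.9), "label": 0},
    {"box": (0.1, 0.1, 0.4, 0.4), "label": 1},
]

assignment, total_cost = best_assignment(preds, gts)
print(assignment, total_cost)
```

In this toy case the third prediction is left unmatched. In DETR-style training, unmatched predictions are supervised toward a "no object" class, which this sketch omits.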

Performance

On COCO 2017 validation, YOLOS-small achieves an average precision (AP) of 36.1. For comparison, the larger YOLOS-base variant reaches 42.0 AP, which is comparable to DETR and to more complex frameworks such as Faster R-CNN. The model contains 30.7 million parameters and is released under the Apache-2.0 license.

FAQ

What is YOLOS Small?

It is a small Vision Transformer model fine-tuned on COCO for object detection, using a bipartite matching loss like DETR.

How does YOLOS Small compare to YOLOS-Base?

YOLOS Small has 30.7M parameters and achieves 36.1 AP on COCO val, while YOLOS-Base achieves 42.0 AP with more parameters.

What license does YOLOS Small use?

It is released under the Apache-2.0 license.

What are the input and output formats?

Input: an image. Output: predicted bounding boxes and corresponding COCO class labels.
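This page does not specify the coordinate format of the returned boxes. As a rough sketch, assuming the model's raw boxes are normalized (center x, center y, width, height) values in [0, 1], as in the Hugging Face Transformers implementation of YOLOS, they could be converted to pixel corner coordinates and filtered by score as follows. The function name, the dictionary layout, and the 0.9 threshold are illustrative assumptions, not part of any gigarouter API.

```python
def to_pixel_detections(pred_boxes, scores, labels, image_size, threshold=0.9):
    """Convert normalized (cx, cy, w, h) boxes to pixel (x0, y0, x1, y1).

    image_size is (width, height) in pixels. Predictions scoring below
    `threshold` are dropped. All names here are hypothetical.
    """
    width, height = image_size
    detections = []
    for (cx, cy, w, h), score, label in zip(pred_boxes, scores, labels):
        if score < threshold:
            continue
        detections.append({
            "label": label,
            "score": score,
            "box": ((cx - w / 2) * width, (cy - h / 2) * height,
                    (cx + w / 2) * width, (cy + h / 2) * height),
        })
    return detections

# One confident and one low-scoring prediction on a 640x480 image.
detections = to_pixel_detections(
    pred_boxes=[(0.5, 0.5, 0.5, 0.5), (0.2, 0.2, 0.1, 0.1)],
    scores=[0.95, 0.30],
    labels=["cat", "dog"],
    image_size=(640, 480),
)
print(detections)
```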

How can I call YOLOS Small via the gigarouter API?

YOLOS Small is not yet live on gigarouter. Once it is, use the OpenAI-compatible endpoint with your API key and send an image URL or a base64-encoded image.

not yet live

We're benchmarking and onboarding YOLOS Small as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
