
RT-DETR R50

PekingU/rtdetr_r50vd

published May 2024 · updated Feb 2025

RT-DETR R50 is a real-time end-to-end object detection model that eliminates NMS, achieving high speed and accuracy with a ResNet-50 backbone.

est. price: ~$0.047 / 1k images (estimated, set at launch)
API providers: 0
downloads / mo: 63.7K
license: apache-2.0

specs

Task: Object Detection
Architecture: RT-DETR with ResNet-50 backbone and efficient hybrid encoder
Parameters: 42 million
Training Data: COCO 2017 (118k training images)
License: Apache-2.0

about this model

RT-DETR R50vd is a real-time end-to-end object detection model that uses a Transformer-based architecture to eliminate the need for Non-Maximum Suppression (NMS), achieving high accuracy with low latency. It was accepted to CVPR 2024.

Architecture

The model employs an efficient hybrid encoder that processes multi-scale backbone features by decoupling intra-scale interaction (handled with attention) from cross-scale fusion (handled with CNN blocks). Uncertainty-minimal query selection supplies high-quality initial queries to the decoder. The number of decoder layers used at inference can be varied to trade speed against accuracy without retraining.
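To make the query-selection step concrete, here is a minimal sketch of top-K selection by classification score. It assumes each encoder token already has per-class scores; the function name, the toy scores, and the value of K are illustrative. How those scores are trained to reflect localization quality (the "uncertainty-minimal" part) is not modeled here.

```python
from typing import List, Sequence


def select_initial_queries(
    class_scores: Sequence[Sequence[float]], num_queries: int
) -> List[int]:
    """Return indices of the encoder tokens with the highest top-class score.

    class_scores[i][c] is a hypothetical score for token i and class c.
    """
    # Confidence of a token = its best score over all classes.
    confidence = [max(row) for row in class_scores]
    # Rank token indices by confidence, highest first, and keep the top K.
    ranked = sorted(
        range(len(confidence)), key=lambda i: confidence[i], reverse=True
    )
    return ranked[:num_queries]


# Four encoder tokens, three classes (made-up scores for illustration).
scores = [
    [0.10, 0.20, 0.05],
    [0.90, 0.05, 0.01],
    [0.30, 0.60, 0.10],
    [0.02, 0.03, 0.01],
]
selected = select_initial_queries(scores, num_queries=2)
print(selected)
```

In this toy input, tokens 1 and 2 carry the most confident class scores, so they would become the decoder's initial queries.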

[Figure: RT-DETR architecture diagram showing backbone, encoder, and decoder with query selection]

Performance

On COCO val2017, RT-DETR-R50 achieves 53.1% AP at 108 FPS on a T4 GPU (batch size 1), outperforming previous YOLO detectors in both speed and accuracy. It surpasses DINO-R50 by 2.2% AP and runs approximately 21× faster. With Objects365 pre-training, the same model reaches 55.3% AP.
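As a quick worked example, the throughput figure can be turned into per-image latency, and the stated margins over DINO-R50 can be inverted to see what baseline they imply. The baseline values below are derived from the deltas quoted above and are not measurements quoted on this page.

```python
def latency_ms(fps: float) -> float:
    """Convert throughput at batch size 1 into per-image latency in milliseconds."""
    return 1000.0 / fps


rtdetr_r50_fps = 108.0  # from the text: T4 GPU, batch size 1
rtdetr_r50_ap = 53.1    # from the text: COCO val2017

# Baseline figures implied by the stated margins (+2.2 AP, ~21x FPS).
# These are derived values, not measurements quoted on this page.
implied_dino_ap = rtdetr_r50_ap - 2.2
implied_dino_fps = rtdetr_r50_fps / 21.0

print(f"RT-DETR-R50 latency: {latency_ms(rtdetr_r50_fps):.2f} ms per image")
print(f"Implied DINO-R50: {implied_dino_ap:.1f} AP at ~{implied_dino_fps:.1f} FPS")
```

At 108 FPS the model spends roughly 9.3 ms per image, which is the number to compare against a per-frame latency budget.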

Benchmark Results (COCO val2017)

Model                  #Epochs  Params (M)  GFLOPs  FPS (bs=1)  AP    AP50  AP75  AP-s  AP-m  AP-l
RT-DETR-R18            72       20          60.7    217         46.5  63.8  50.4  28.4  49.8  63.0
RT-DETR-R34            72       31          91.0    172         48.5  66.2  52.3  30.2  51.9  66.2
RT-DETR-R50            72       42          136     108         53.1  71.3  57.7  34.8  58.0  70.0
RT-DETR-R101           72       76          259     74          54.3  72.7  58.6  36.0  58.8  72.1
RT-DETR-R18 (Obj365)   60       20          61      217         49.2  66.6  53.5  33.2  52.3  64.8
RT-DETR-R50 (Obj365)   24       42          136     108         55.3  73.4  60.1  37.9  59.9  71.8
RT-DETR-R101 (Obj365)  24       76          259     74          56.2  74.6  61.3  38.3  60.5  73.5
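One way to read the benchmark table is as a lookup for the most accurate variant that fits a latency budget. The sketch below copies the FPS and AP values of the COCO-only rows; the helper name and the 10 ms budget are illustrative, and the result only holds for the T4, batch-size-1 setting reported above.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    model: str
    fps: float  # T4 GPU, batch size 1
    ap: float   # COCO val2017 AP


# COCO-only rows copied from the benchmark table above.
ROWS = [
    Row("RT-DETR-R18", 217, 46.5),
    Row("RT-DETR-R34", 172, 48.5),
    Row("RT-DETR-R50", 108, 53.1),
    Row("RT-DETR-R101", 74, 54.3),
]


def best_under_budget(rows: List[Row], max_latency_ms: float) -> Optional[Row]:
    """Return the highest-AP row whose per-image latency fits the budget."""
    fitting = [r for r in rows if 1000.0 / r.fps <= max_latency_ms]
    return max(fitting, key=lambda r: r.ap) if fitting else None


choice = best_under_budget(ROWS, max_latency_ms=10.0)
print(choice.model if choice else "no model fits")
```

With a 10 ms budget the R50 variant is the most accurate option that fits; R101 needs about 13.5 ms per image on the same hardware.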
[Figure: Training hyperparameters table showing batch size, learning rate schedule, and data augmentation details]
[Figure: Comparison plot of RT-DETR versus YOLO and DETR models on speed versus accuracy trade-off]

FAQ

What is RT-DETR R50 best used for?

It is best suited to real-time object detection: it runs at 108 FPS on a T4 GPU (batch size 1) with 53.1% AP on COCO val2017, which suits latency-sensitive applications such as autonomous driving or surveillance.

How does RT-DETR R50 compare to YOLO detectors?

On the reported COCO val2017 benchmarks, RT-DETR R50 outperforms previous YOLO detectors, including YOLOv8, in both speed and accuracy. It also eliminates NMS post-processing, which simplifies the pipeline.

What are the input and output formats?

It accepts images (typically resized to 640x640) and returns bounding boxes, class labels, and confidence scores.
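The sketch below illustrates how raw per-query outputs could be turned into the boxes, labels, and scores described above. It assumes DETR-style raw outputs: per-class scores for each query and boxes as normalized (cx, cy, w, h). The function name, threshold, and toy values are illustrative, and real post-processing code may differ in detail. Note that there is no NMS step.

```python
from typing import Dict, List, Sequence, Tuple


def postprocess(
    scores: Sequence[Sequence[float]],
    boxes: Sequence[Tuple[float, float, float, float]],
    image_size: Tuple[int, int],
    threshold: float = 0.5,
) -> List[Dict]:
    """Keep confident queries and convert their boxes to pixel (x1, y1, x2, y2)."""
    width, height = image_size
    detections = []
    for class_scores, (cx, cy, w, h) in zip(scores, boxes):
        # Best class for this query and its score.
        label = max(range(len(class_scores)), key=class_scores.__getitem__)
        score = class_scores[label]
        if score < threshold:
            continue  # low-confidence query: dropped by thresholding alone
        detections.append({
            "label": label,
            "score": score,
            "box": (
                (cx - w / 2) * width,
                (cy - h / 2) * height,
                (cx + w / 2) * width,
                (cy + h / 2) * height,
            ),
        })
    return detections


# Two queries, two classes (made-up values), on a 640x640 image.
scores = [[0.05, 0.92], [0.10, 0.20]]
boxes = [(0.5, 0.5, 0.25, 0.5), (0.2, 0.2, 0.1, 0.1)]
detections = postprocess(scores, boxes, image_size=(640, 640))
print(detections)
```

Only the first query survives the threshold; its box maps to pixel corners (240, 160) and (400, 480).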

How can I call RT-DETR R50 via the gigarouter API?

RT-DETR R50 is not yet live on gigarouter. Once it launches, call it through the gigarouter OpenAI-compatible endpoint with your API key: send an image as base64 or a URL and receive detection results as JSON.
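Since the endpoint is not published yet, the following is only a sketch of preparing a base64 image for a JSON request body. The field names `model` and `image` are hypothetical placeholders, not the gigarouter schema; check the API documentation at launch for the real request format.

```python
import base64
import json


def build_detection_request(image_bytes: bytes, model: str) -> str:
    """Build a JSON request body carrying the image as base64.

    The schema here is hypothetical; only the base64 encoding step is
    expected to carry over to the real API.
    """
    payload = {
        "model": model,  # model ID as listed on this page
        # Hypothetical field name for the base64-encoded image.
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }
    return json.dumps(payload)


# Placeholder bytes stand in for a real image file's contents.
body = build_detection_request(b"\x89PNG placeholder bytes", "PekingU/rtdetr_r50vd")
decoded = json.loads(body)
print(sorted(decoded.keys()))
```

In practice the bytes would come from reading an image file in binary mode, and the body would be sent with your API key in the request headers.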

Is RT-DETR R50 free to use?

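The model weights are released under the Apache-2.0 license, which allows commercial and research use provided the license and copyright notices are retained. The hosted gigarouter API is priced separately, at an estimated ~$0.047 per 1k images (set at launch).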
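For budgeting hosted usage, the estimated price listed at the top of this page (~$0.047 per 1k images) can be scaled to a workload. The figure is an estimate that will be set at launch, and the monthly volume below is a hypothetical example.

```python
PRICE_PER_1K_IMAGES_USD = 0.047  # estimated price from this page; set at launch


def estimated_cost_usd(
    num_images: int, price_per_1k: float = PRICE_PER_1K_IMAGES_USD
) -> float:
    """Scale the per-1k-image price to an arbitrary number of images."""
    return num_images / 1000.0 * price_per_1k


monthly_images = 2_000_000  # hypothetical workload
print(f"${estimated_cost_usd(monthly_images):.2f}")
```

At the estimated rate, two million images per month would come to about $94.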

not yet live

We're benchmarking and onboarding RT-DETR R50 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
