skip to content
gigarouter gigarouter
models / object detection · coming soon

RT-DETRv2 R18

PekingU/rtdetr_v2_r18vd

published Jan 2025 · updated Feb 2025

RT-DETRv2 R18 is a real-time object detection transformer model optimized for fast and accurate detection in dynamic environments.

est. price
~$0.047
/ 1k images · estimated, set at launch
API providers
0
downloads / mo
97.1K
license
apache-2.0

specs

TaskObject Detection
ArchitectureDetection Transformer (RT-DETRv2-S with ResNet-18 backbone)
Parameters20 million
LicenseApache 2.0
COCO AP48.1
FPS (T4 TensorRT fp16)217

about this model

RT-DETRv2 r18vd is a real-time object detection transformer model that builds on the RT-DETR architecture with selective multi-scale feature extraction, a discrete sampling operator for broader deployment compatibility, and improved training strategies including dynamic data augmentation and scale-adaptive hyperparameters. Designed for speed and accuracy, it achieves state-of-the-art real-time detection while maintaining flexibility across edge and cloud environments.

Key improvements

The decoder uses a distinct number of sampling points per feature scale in deformable attention, enabling selective multi-scale feature extraction. The optional discrete sampling operator replaces grid_sample, removing deployment constraints common to DETRs. Training enhancements (dynamic augmentation and scale-adaptive hyperparameters) improve performance without compromising inference speed. The model is pretrained on COCO train2017 and evaluated on COCO val2017.

Performance

On COCO val2017 at 640 input resolution, the r18vd variant (RT-DETRv2-S) achieves 48.1 AP and 65.1 AP50 with 20 million parameters and 217 FPS on a single T4 GPU (TensorRT fp16, batch size 1). This represents a +1.6 mAP improvement over the previous RT-DETR-R18 baseline. A discrete sampling variant (RT-DETRv2-S_dsp) scores 47.4 AP and 64.8 AP50, trading slight accuracy for wider deployment compatibility.

Graph comparing RT-DETRv2 performance against prior models across AP and FPS
Variant AP AP50 Params FPS (T4, fp16)
RT-DETRv2-S (r18vd) 48.1 65.1 20M 217
RT-DETRv2-S_dsp 47.4 64.8 20M ~217

FPS measured on a single T4 GPU with TensorRT >= 8.5.1, fp16 precision, batch size 1. The model is released under the Apache 2.0 license.

best for

FAQ

What is RT-DETRv2 R18 best used for?

It excels at real-time object detection in applications like autonomous driving, surveillance, robotics, and retail analytics.

How many parameters does this model have?

It has 20 million parameters.

What license is the model released under?

Apache 2.0 license.

What input format does the model require?

The model expects images preprocessed with RTDetrImageProcessor (e.g., resized and normalized).

How can I call this model via the gigarouter API?

Use the OpenAI-compatible endpoint with your API key, sending an image URL or base64-encoded image.

not yet live

We're benchmarking and onboarding RT-DETRv2 R18 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related object detection models

compare all →