
D-FINE Small (COCO)

ustc-community/dfine-small-coco

published Feb 2025 · updated May 2025

D-FINE Small (COCO) is a detection model that redefines bounding box regression as fine-grained distribution refinement for real-time object detection.

est. price: ~$0.047 / 1k images (estimated, set at launch)
API providers: 0
downloads / mo: 4.5K
license: apache-2.0
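
For budgeting, the estimated price above can be turned into a rough per-job cost. This is a minimal sketch that assumes the ~$0.047 / 1k images estimate holds at launch and that billing is linear per image; the function name and the rounding to cents are illustrative choices, not part of any gigarouter API.

```python
# Listing's estimate in USD per 1,000 images; may change at launch.
EST_PRICE_PER_1K_IMAGES_USD = 0.047


def estimate_cost_usd(num_images: int,
                      price_per_1k: float = EST_PRICE_PER_1K_IMAGES_USD) -> float:
    """Return an approximate cost in USD, assuming linear per-image billing."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Scale the per-1k price to the requested volume and round to cents.
    return round(num_images / 1000 * price_per_1k, 2)


if __name__ == "__main__":
    for n in (1_000, 250_000, 1_000_000):
        print(f"{n:>9,} images ~ ${estimate_cost_usd(n):.2f}")
```

Any minimum charges, tiers, or free credit are not modeled here.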

specs

Task: Object Detection
Architecture: D-FINE (Transformer-based DETR)
Training Dataset: COCO 2017
Input: Images
Output: Bounding boxes, scores, and class labels
Framework: Hugging Face Transformers (PyTorch)

about this model

ustc-community/dfine-small-coco is a real-time object detection model that redefines bounding box regression using Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD). Developed as part of the D-FINE family, it achieves high localization precision with minimal added parameters, making it suitable for latency-sensitive applications.

Key Strengths

  • Enhanced bounding box regression via fine-grained distribution refinement, improving localization accuracy over traditional DETR approaches.
  • Global optimal localization self-distillation boosts performance without extra inference cost.
  • The method improves a range of DETR models by up to 5.3% AP with negligible additional parameters and training overhead.
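
The idea behind distribution refinement can be illustrated in a few lines. Instead of regressing a box edge as a single number, the model predicts a probability distribution over candidate offsets, and each decoder layer refines it by adding residual logits. The sketch below is a simplified illustration, not the authors' implementation: it assumes uniformly spaced offset bins (the published method uses its own weighting function), and the names `refine_edge`, `bin_offsets`, `scale`, and `max_offset` are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bin_offsets(num_bins: int, max_offset: float) -> List[float]:
    """Uniformly spaced candidate offsets in [-max_offset, +max_offset].

    Assumption: uniform spacing is used only to keep the sketch short.
    """
    step = 2 * max_offset / (num_bins - 1)
    return [-max_offset + i * step for i in range(num_bins)]


def refine_edge(initial_distance: float,
                logits_prev: List[float],
                delta_logits: List[float],
                scale: float,
                max_offset: float = 0.5) -> Tuple[float, List[float]]:
    """Refine one edge distance from the previous layer's logits.

    The current layer contributes residual logits; the refined distance is
    the initial distance plus the expected offset, scaled by the box size.
    """
    logits = [p + d for p, d in zip(logits_prev, delta_logits)]
    probs = softmax(logits)
    offsets = bin_offsets(len(logits), max_offset)
    expected_offset = sum(w * p for w, p in zip(offsets, probs))
    return initial_distance + scale * expected_offset, logits


if __name__ == "__main__":
    # A flat distribution leaves the edge where it was.
    d, logits = refine_edge(10.0, [0.0] * 5, [0.0] * 5, scale=100.0)
    print(d)
    # Residual logits favouring the right-most bin push the edge outwards.
    d, _ = refine_edge(10.0, logits, [0.0, 0.0, 0.0, 1.0, 3.0], scale=100.0)
    print(d)
```

Because the refinement is expressed as residual logits, later layers only adjust the distribution rather than re-predicting the edge from scratch, which is the intuition behind "fine-grained" refinement.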

Benchmark Results

Evaluated on COCO val2017, the D-FINE family delivers competitive speed and accuracy on an NVIDIA T4 GPU (TensorRT FP16):

Model Variant                        AP      FPS
D-FINE-L                             54.0%   124
D-FINE-X                             55.8%   78
D-FINE-L (Objects365 pretraining)    57.1%   124
D-FINE-X (Objects365 pretraining)    59.3%   78

The dfine-small-coco variant is trained on COCO train2017. This listing does not report AP or FPS figures for the small variant; the table above covers only the L and X variants.

[Figure: COCO benchmark comparison of D-FINE variants against other real-time detectors]

gigarouter plans to host this model as a managed, OpenAI-compatible API, so it can be integrated without running your own inference infrastructure. The hosted endpoint is not yet live.

FAQ

What is D-FINE Small (COCO) best for?

Real-time object detection in dynamic environments such as autonomous driving, surveillance, and robotics.

How does the small variant compare to larger D-FINE models?

The small variant is faster and lighter, suitable for edge devices. Larger variants like D-FINE-L achieve 54.0% AP at 124 FPS on an NVIDIA T4 GPU.

What are the input and output formats?

Input: images (PIL or tensor). Output: bounding boxes, confidence scores, and class labels.
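
For local use, the input and output shapes can be sketched with Hugging Face Transformers. This is a minimal example, assuming a recent `transformers` release that includes D-FINE support, plus `torch` and `Pillow`; it loads the checkpoint through the generic `AutoImageProcessor` and `AutoModelForObjectDetection` classes. The helper names `to_records` and `detect` and the 0.5 score threshold are illustrative choices, not part of the model card.

```python
from typing import Dict, List, Sequence


def to_records(scores: Sequence[float], labels: Sequence[int],
               boxes: Sequence[Sequence[float]], id2label: Dict[int, str],
               min_score: float = 0.5) -> List[dict]:
    """Pair up parallel score/label/box lists and keep confident detections."""
    records = []
    for score, label, box in zip(scores, labels, boxes):
        if score >= min_score:
            records.append({
                "label": id2label.get(label, str(label)),
                "score": round(float(score), 3),
                # Boxes are (x_min, y_min, x_max, y_max) in pixels.
                "box_xyxy": [round(float(v), 1) for v in box],
            })
    # Highest-confidence detections first.
    return sorted(records, key=lambda r: r["score"], reverse=True)


def detect(image_path: str,
           model_id: str = "ustc-community/dfine-small-coco",
           min_score: float = 0.5) -> List[dict]:
    """Run local inference on one image file and return detection records."""
    # Imported lazily so to_records works without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForObjectDetection

    image = Image.open(image_path).convert("RGB")
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForObjectDetection.from_pretrained(model_id)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL reports size as (width, height).
    result = processor.post_process_object_detection(
        outputs, target_sizes=[(image.height, image.width)], threshold=min_score
    )[0]
    return to_records(result["scores"].tolist(), result["labels"].tolist(),
                      result["boxes"].tolist(), model.config.id2label, min_score)
```

Calling `detect("street.jpg")` (a hypothetical file name) downloads the checkpoint on first use and returns a list of dictionaries, one per detection, sorted by confidence.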

How can I call this model via the API?

The hosted API is not yet live. Once it launches, you will call the gigarouter OpenAI-compatible endpoint with an API key, sending image inputs and receiving detection results.

What license is this model under?

This listing reports the license as apache-2.0. Check the Hugging Face repository to confirm the terms before redistributing.

not yet live

We're benchmarking and onboarding D-FINE Small (COCO) as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
