ZoeDepth NYU KITTI
Intel/zoedepth-nyu-kitti
published Apr 2024 · updated May 2024
ZoeDepth NYU KITTI is a monocular depth estimation model that predicts metric depth values from a single image, fine-tuned on NYU Depth v2 and KITTI datasets.
specs
| Task | Monocular depth estimation |
| Architecture | DPT-based with metric bins module |
| Parameters | Not specified in card |
| License | MIT |
about this model
ZoeDepth (fine-tuned on NYU and KITTI) is a monocular depth estimation model that outputs metric (absolute) depth values. It extends the DPT architecture with a novel metric bins module, enabling zero-shot transfer to unseen domains while preserving scale accuracy.
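To illustrate why metric output matters, here is a minimal sketch that measures a real-world distance between two pixels. It assumes a pinhole camera with known intrinsics. The function names and the intrinsics values (fx, fy, cx, cy) are hypothetical and not part of ZoeDepth; the depth values stand in for model output in meters.

```python
import math

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with metric depth -> camera-frame XYZ in meters."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def distance_m(p, q):
    """Euclidean distance between two 3D points, in meters."""
    return math.dist(p, q)

# Hypothetical intrinsics for a 640x480 camera (assumed values, for illustration only).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# Two pixels on the same image row, both predicted to be 2.0 m away.
a = backproject(320, 240, 2.0, FX, FY, CX, CY)
b = backproject(570, 240, 2.0, FX, FY, CX, CY)
print(distance_m(a, b))  # 1.0
```

With relative depth, the same computation would only be defined up to an unknown scale and shift, so the result could not be read as meters.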
Architecture and Training
The model variant ZoeD-M12-NK is pre-trained on 12 datasets for relative depth and fine-tuned jointly on NYU Depth v2 and KITTI for metric depth. During inference, an internal latent classifier routes each input to the appropriate domain-specific head, allowing it to handle both indoor and outdoor scenes without retraining.
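The routing step described above can be pictured with a toy sketch. This is not the ZoeDepth implementation: the domain names, the linear "heads", and the depth ranges (10 m indoor, 80 m outdoor) are assumptions chosen only to show how a classifier's output could select one of several domain-specific heads.

```python
import math

# Assumed per-domain depth ranges in meters (illustrative, not taken from the model).
DEPTH_RANGE = {"indoor": (0.001, 10.0), "outdoor": (0.001, 80.0)}
DOMAINS = list(DEPTH_RANGE)  # ["indoor", "outdoor"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_head(domain):
    """Build a stand-in head that maps values in [0, 1] to the domain's metric range."""
    lo, hi = DEPTH_RANGE[domain]
    def head(normalized):
        return [lo + v * (hi - lo) for v in normalized]
    return head

HEADS = {d: make_head(d) for d in DOMAINS}

def route(domain_logits, normalized):
    """Pick the most probable domain and run only that domain's head."""
    probs = softmax(domain_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    domain = DOMAINS[best]
    return domain, HEADS[domain](normalized)

# Classifier logits favoring "indoor" send the input to the indoor head.
print(route([2.0, -1.0], [0.0, 0.5, 1.0])[0])  # indoor
```

The point of the design is that the caller never states whether a scene is indoor or outdoor; the selection happens per image inside the model.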
Benchmark Performance
According to the ZoeDepth paper, pre-training on 12 datasets and fine-tuning on NYU Depth v2 alone (the ZoeD-M12-N configuration) yields a 21% improvement in relative absolute error (REL) over the prior state of the art on the NYU Depth v2 indoor benchmark. ZoeD-M12-NK, the variant this card covers, is described as the first model to train jointly on multiple metric depth datasets (NYU Depth v2 and KITTI) without a significant performance drop, and it demonstrates zero-shot generalization to eight unseen datasets spanning indoor and outdoor environments.
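For readers unfamiliar with the metric, REL is conventionally the mean of |predicted - ground truth| / ground truth over pixels with valid ground truth. The sketch below computes it on flat lists and shows how a percentage improvement between two REL scores is derived. The function names are hypothetical, and all numbers are placeholder values, not reported results.

```python
import math

def abs_rel(pred, target):
    """Mean absolute relative error over pixels with valid (positive) ground truth."""
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - t) / t for p, t in pairs) / len(pairs)

def relative_improvement(baseline, new):
    """Fractional reduction of an error metric relative to a baseline (lower is better)."""
    return (baseline - new) / baseline

# Placeholder depths in meters; a ground-truth value of 0.0 marks an invalid pixel.
target = [2.0, 4.0, 0.0, 5.0]
pred = [2.2, 3.6, 1.0, 5.0]
rel = abs_rel(pred, target)  # per-pixel errors 0.1, 0.1 and 0.0; the invalid pixel is skipped

# Placeholder REL scores: dropping from 0.20 to 0.158 is a 21% relative improvement.
gain = relative_improvement(0.20, 0.158)
```

Because REL divides by the ground-truth depth, a 0.2 m error counts more at 2 m than at 20 m, which makes the metric comparable across indoor and outdoor depth ranges.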
Maintenance Status
The original implementation repository is no longer actively maintained by Intel. When served through gigarouter's hosted API, the model requires no local installation or maintenance.
best for
- Zero-shot metric depth estimation for indoor scenes
- Zero-shot metric depth estimation for outdoor scenes
- Joint training on multiple depth datasets without performance drop
FAQ
How does ZoeDepth differ from relative depth models?
ZoeDepth extends DPT to predict absolute metric depth values (e.g., in meters) instead of relative depth, enabling real-world measurements.
Which datasets was it fine-tuned on?
It was fine-tuned on NYU Depth v2 (indoor) and KITTI (outdoor) datasets for metric depth estimation.
What license is it released under?
The model is released under the MIT license.
How do I call the model?
Use the gigarouter OpenAI-compatible endpoint with your API key, specifying the model ID Intel/zoedepth-nyu-kitti.
Does it work on data it was not trained on?
Yes, it demonstrates zero-shot generalization to eight unseen indoor and outdoor datasets without additional training.
We're benchmarking and onboarding ZoeDepth NYU KITTI as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.