skip to content
gigarouter gigarouter
models / depth estimation · coming soon

ZoeDepth NYU KITTI

Intel/zoedepth-nyu-kitti

published Apr 2024 · updated May 2024

ZoeDepth NYU KITTI is a monocular depth estimation model that predicts metric depth values from a single image, fine-tuned on NYU Depth v2 and KITTI datasets.

est. price
~$0.094
/ 1k images · estimated, set at launch
API providers
0
downloads / mo
138.2K
license
mit

specs

TaskMonocular depth estimation
ArchitectureDPT-based with metric bins module
ParametersNot specified in card
LicenseMIT

about this model

ZoeDepth (fine-tuned on NYU and KITTI) is a monocular depth estimation model that outputs metric (absolute) depth values. It extends the DPT architecture with a novel metric bins module, enabling zero-shot transfer to unseen domains while preserving scale accuracy.

Architecture and Training

The model variant ZoeD-M12-NK is pre-trained on 12 datasets for relative depth and fine-tuned jointly on NYU Depth v2 and KITTI for metric depth. During inference, an internal latent classifier routes each input to the appropriate domain-specific head, allowing it to handle both indoor and outdoor scenes without retraining.

Diagram of ZoeDepth architecture showing metric bins heads and latent classifier

Benchmark Performance

On the NYU Depth v2 indoor benchmark, ZoeDepth achieves a 21% improvement in relative absolute error (REL) over the prior state of the art. It is the first model to jointly train on multiple metric depth datasets (NYU and KITTI) without significant performance drop, and demonstrates zero-shot generalization to eight unseen datasets spanning indoor and outdoor environments.

Maintenance Status

The original implementation repository is no longer actively maintained by Intel. As a hosted API on gigarouter, the model is served without requiring local installation or maintenance.

best for

FAQ

What is the difference between ZoeDepth and relative depth models like DPT?

ZoeDepth extends DPT to predict absolute metric depth values (e.g., meters) instead of relative depth, enabling real-world measurements.

What datasets was this model fine-tuned on?

It was fine-tuned on NYU Depth v2 (indoor) and KITTI (outdoor) datasets for metric depth estimation.

What is the license for this model?

The model is released under the MIT license.

How do I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key, specifying the model ID Intel/zoedepth-nyu-kitti.

Does this model support zero-shot generalization?

Yes, it demonstrates zero-shot generalization to eight unseen indoor and outdoor datasets without additional training.

not yet live

We're benchmarking and onboarding ZoeDepth NYU KITTI as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related depth estimation models

compare all →