Distill Any Depth Large
xingyang1/Distill-Any-Depth-Large-hf
published Mar 2025 · updated Mar 2025
Distill Any Depth Large is a monocular depth estimation model that uses novel knowledge distillation methods to achieve state-of-the-art zero-shot depth prediction.
specs
| Task | Depth Estimation |
| Architecture | DepthAnythingForDepthEstimation |
| Parameters | 335M |
| License | MIT |
about this model
Distill-Any-Depth-Large-hf is a monocular depth estimation model, trained with knowledge distillation, that predicts depth zero-shot from a single image. Built on the DepthAnythingForDepthEstimation architecture with 335 million parameters and released under the MIT license, it was introduced in the paper "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator" (arXiv:2502.19204).
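Because the checkpoint uses the DepthAnythingForDepthEstimation architecture, it should load through the standard Hugging Face `transformers` depth-estimation pipeline. The sketch below is one way to run it locally, not an official recipe: the model id is the one listed at the top of this page, `estimate_depth` is a hypothetical helper name, and the first call downloads the weights.

```python
MODEL_ID = "xingyang1/Distill-Any-Depth-Large-hf"  # Hub id listed on this page


def estimate_depth(image_path: str):
    """Return (depth_image, raw_depth) for the image stored at image_path.

    depth_image is a PIL image suitable for viewing; raw_depth is the
    model's unscaled prediction as a tensor.
    """
    # Imported lazily so that defining this helper does not require the heavy
    # dependencies; calling it needs transformers, torch and Pillow installed.
    from PIL import Image
    from transformers import pipeline

    pipe = pipeline(task="depth-estimation", model=MODEL_ID)
    result = pipe(Image.open(image_path).convert("RGB"))
    return result["depth"], result["predicted_depth"]
```

A call such as `estimate_depth("example.jpg")` (a placeholder path) returns the depth image first, which can then be saved with `depth_image.save(...)`.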
Technical approach
The model employs Cross-Context Distillation, which integrates global and local depth cues to improve pseudo-label quality, and an assistant-guided distillation strategy that incorporates complementary priors from a diffusion-based teacher model. This multi-teacher framework leverages the strengths of different depth estimation models, including generative diffusion-based depth models, to enhance supervision diversity and robustness.
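The idea of combining local and global context can be illustrated with a small NumPy sketch. It is an illustration under assumptions, not the authors' training code: `teacher` and `student` are stand-in callables that map a 2D image to a 2D depth map, the teacher sees only a local crop while the student sees the full image, and the loss compares the two on the crop after a median / mean-absolute-deviation normalization that removes scale and shift. The function names and the choice of normalization are assumptions made for this example.

```python
import numpy as np


def normalize_scale_shift(depth):
    """Remove scale and shift: subtract the median, divide by the mean absolute deviation."""
    d = np.asarray(depth, dtype=np.float64)
    shift = np.median(d)
    scale = np.mean(np.abs(d - shift))
    return (d - shift) / max(scale, 1e-8)


def local_global_loss(image, teacher, student, box):
    """L1 distillation loss on one crop.

    box is (top, left, height, width). The teacher predicts from the crop
    alone (local context); the student predicts from the whole image
    (global context) and is compared with the teacher on the same region.
    """
    top, left, height, width = box
    rows = slice(top, top + height)
    cols = slice(left, left + width)

    pseudo_label = teacher(image[rows, cols])      # teacher: local crop only
    student_region = student(image)[rows, cols]    # student: full image, then crop

    diff = normalize_scale_shift(student_region) - normalize_scale_shift(pseudo_label)
    return float(np.mean(np.abs(diff)))
```

Because both maps are normalized before comparison, a student whose output differs from the pseudo-label only by a positive scale and a shift incurs a loss of essentially zero.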
Performance
According to the paper's quantitative and qualitative evaluations on benchmark datasets, Distill-Any-Depth-Large-hf significantly outperforms prior state-of-the-art methods such as MiDaS v3.1, Depth Anything V2, Marigold, and GenPercept. The model produces finer-grained, more detailed depth estimates, particularly in challenging regions.
On gigarouter the model is offered as a managed, OpenAI-compatible API, so it can be integrated into depth estimation pipelines without local installation or hardware management.
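This page does not spell out the request format, so the following is only a sketch of what a call could look like. The endpoint URL, the JSON field names (`model`, `image`) and the data-URI encoding are all assumptions made for illustration; only the bearer-token header follows the usual OpenAI-compatible convention. The helper builds the request without sending it.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint: the real URL is not given on this page.
DEPTH_ENDPOINT = "https://api.gigarouter.example/v1/depth"
MODEL_ID = "xingyang1/Distill-Any-Depth-Large-hf"


def build_depth_request(image_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": MODEL_ID,                               # assumed field name
        "image": f"data:image/png;base64,{encoded}",     # assumed field name and format
    }
    return urllib.request.Request(
        DEPTH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` once the real endpoint and payload schema are known.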
best for
- Zero-shot depth estimation in diverse real-world scenes
- Robotics and autonomous navigation
- 3D scene reconstruction from a single image
- Augmented and virtual reality applications
FAQ
Distill Any Depth Large combines cross-context distillation with assistant-guided distillation from a diffusion-based teacher, which yields finer, more detailed depth maps and better generalization.
To call the model, use the gigarouter OpenAI-compatible endpoint with your API key and send an image URL or a base64-encoded image to the depth estimation endpoint.
Input: an image (URL or base64). Output: a depth map, returned as a 2D array or an image, with depth values normalized to the range 0 to 255.
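How the service maps raw predictions to the 0 to 255 range is not stated here. A common choice, assumed in this sketch, is min-max scaling of the whole map followed by rounding to 8-bit integers; `depth_to_uint8` is a hypothetical helper name.

```python
import numpy as np


def depth_to_uint8(depth) -> np.ndarray:
    """Min-max scale a 2D depth map to integers in [0, 255]."""
    d = np.asarray(depth, dtype=np.float64)
    lo, hi = d.min(), d.max()
    if hi == lo:
        # A constant map carries no relative depth; return all zeros.
        return np.zeros(d.shape, dtype=np.uint8)
    scaled = (d - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)
```

The result can be saved directly as a grayscale image. Note that this scaling keeps only relative depth within one image, so values are not comparable across images.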
The model is released under the MIT license.
The model has 335 million parameters; stored in float32 (4 bytes per parameter), the weights take approximately 1.34 GB.
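The file-size figure follows directly from the parameter count. As a quick check, assuming the weights dominate the file and using decimal gigabytes (10^9 bytes):

```python
def weights_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


float32_gb = weights_size_gb(335_000_000, 4)  # float32 uses 4 bytes per parameter
float16_gb = weights_size_gb(335_000_000, 2)  # float16 uses 2 bytes per parameter
```

Here `float32_gb` comes to about 1.34, matching the figure above, and casting the same weights to float16 would roughly halve that to about 0.67 GB.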
We're benchmarking and onboarding Distill Any Depth Large as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.