Depth Anything Base
LiheYoung/depth-anything-base-hf
published Jan 2024 · updated Jan 2024
Depth Anything Base is a monocular depth estimation model that works on arbitrary images. It uses a DPT architecture with a DINOv2 backbone and was trained on 1.5 million labeled images and over 62 million unlabeled images.
specs
| Spec | Value |
| --- | --- |
| Task | Monocular Depth Estimation |
| Architecture | DPT with DINOv2 backbone |
| Training Data | 62M+ unlabeled images + 1.5M labeled images |
| License | Apache 2.0 |
about this model
LiheYoung/depth-anything-base-hf is a monocular depth estimation model that combines a DINOv2 backbone with the DPT architecture to produce high-quality depth maps from a single RGB image.
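For local experimentation, a minimal sketch of running this checkpoint through the Hugging Face `transformers` depth-estimation pipeline might look like the following. It assumes `transformers`, `torch`, and `Pillow` are installed and that the weights can be downloaded; the function name `estimate_depth` is illustrative, not part of any official API.

```python
MODEL_ID = "LiheYoung/depth-anything-base-hf"


def estimate_depth(image_path: str):
    """Return (depth_image, raw_prediction) for a single RGB image.

    Assumes `transformers`, `torch`, and `Pillow` are installed.
    """
    # Imported lazily so this module loads even without the heavy dependencies.
    from PIL import Image
    from transformers import pipeline

    # The pipeline downloads the checkpoint on first use.
    depth_estimator = pipeline(task="depth-estimation", model=MODEL_ID)

    image = Image.open(image_path).convert("RGB")
    result = depth_estimator(image)

    # "depth" is a PIL image for viewing; "predicted_depth" is the raw tensor.
    return result["depth"], result["predicted_depth"]
```

Calling `estimate_depth("photo.jpg")` (with a hypothetical local file) would return a viewable depth image plus the raw prediction for further processing.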
Architecture and training
The model pairs a DINOv2 Vision Transformer encoder with a DPT decoder; the DPT design is reported to yield up to 28% relative improvement over fully convolutional networks for depth estimation. Training combined 1.5 million labeled images with over 62 million unlabeled images in a teacher-student pipeline: the teacher generates pseudo-labels for the unlabeled images, and the student is trained on them under strong augmentations to improve robustness.
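The teacher-student idea can be illustrated with a toy sketch. This is not the authors' training code: images and depth maps are stand-in lists of floats, and `build_student_samples`, `toy_teacher`, and `toy_augment` are hypothetical names chosen for illustration. The point is only the data flow: the teacher labels the clean image, while the student receives a perturbed copy paired with that pseudo-label.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # toy stand-in for an image: pixel intensities in [0, 1]
Depth = List[float]  # toy stand-in for a depth map: one value per pixel
Sample = Tuple[Image, Depth]


def build_student_samples(
    labeled: Sequence[Sample],
    unlabeled: Sequence[Image],
    teacher: Callable[[Image], Depth],
    augment: Callable[[Image], Image],
) -> List[Sample]:
    """Combine labeled pairs with pseudo-labeled, augmented unlabeled images."""
    samples = list(labeled)
    for image in unlabeled:
        pseudo_label = teacher(image)  # teacher predicts on the clean image
        # The student sees a perturbed input but keeps the clean-image target.
        # This holds for photometric augmentations; a geometric one would
        # require transforming the target as well.
        samples.append((augment(image), pseudo_label))
    return samples


def toy_teacher(image: Image) -> Depth:
    # Placeholder "model": darker pixels are treated as farther away.
    return [1.0 - p for p in image]


def toy_augment(image: Image) -> Image:
    # Placeholder photometric augmentation: brighten and clamp to 1.0.
    return [min(1.0, p * 1.5) for p in image]


labeled = [([0.2, 0.4], [0.8, 0.6])]
unlabeled = [[0.5, 0.9]]
samples = build_student_samples(labeled, unlabeled, toy_teacher, toy_augment)
```

In a real setup the teacher and student would be neural networks and the augmentations far stronger; the sketch only shows how pseudo-labels and perturbed inputs are paired.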
Performance
Depth Anything achieves state-of-the-art results on both relative and absolute depth estimation benchmarks, demonstrating strong generalization across diverse scenes and lighting conditions. The paper was accepted at CVPR 2024.

This model is hosted as a managed, OpenAI-compatible API on gigarouter, allowing developers to integrate monocular depth estimation into their applications without managing infrastructure.
best for
- Zero-shot depth estimation on arbitrary images
- Depth conditioning input for ControlNet in image generation
- Video depth estimation and visualization
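For visualization or for use as a conditioning image, a raw depth map is commonly rescaled to 8-bit grayscale first. The following is a minimal, dependency-free sketch of that min-max rescaling; the helper name `depth_to_uint8` and the nested-list input format are assumptions for illustration, and a real pipeline would more likely operate on NumPy arrays or tensors.

```python
from typing import List, Sequence


def depth_to_uint8(depth: Sequence[Sequence[float]]) -> List[List[int]]:
    """Min-max rescale a 2D depth map to integers in [0, 255]."""
    flat = [value for row in depth for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map has no contrast; return all zeros.
        return [[0 for _ in row] for row in depth]
    scale = 255.0 / (hi - lo)
    return [[int(round((value - lo) * scale)) for value in row] for row in depth]


gray = depth_to_uint8([[0.0, 2.0], [10.0, 4.0]])
print(gray)  # [[0, 51], [255, 102]]
```

Min-max scaling keeps only the relative ordering of depths within one image, so values are not comparable across images or video frames unless a shared range is used.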
FAQ
Depth Anything Base uses a DPT decoder with a DINOv2 image encoder backbone.
It was trained on 1.5 million labeled images and over 62 million unlabeled images.
The model is released under the Apache 2.0 license.
To use the hosted model, send an image URL to the OpenAI-compatible endpoint with your API key to receive a depth map.
For video, the official repository provides a script for video depth visualization.