skip to content
gigarouter gigarouter
models / depth estimation · coming soon

Depth Anything V2 Large

depth-anything/Depth-Anything-V2-Large-hf

published Jun 2024 · updated Jul 2024

Depth Anything V2 Large is a monocular depth estimation model that predicts dense depth maps from single images using a DPT architecture with a DINOv2 backbone.

est. price
~$0.094
/ 1k images · estimated, set at launch
API providers
0
downloads / mo
199.1K
license
cc-by-nc-4.0

specs

TaskMonocular Depth Estimation
ArchitectureDPT (Dense Prediction Transformer) with DINOv2 backbone
Training Data~600K synthetic labeled images + ~62M real unlabeled images

about this model

Depth-Anything-V2-Large is a monocular depth estimation model that produces fine-grained, robust depth predictions from a single image. It is trained on approximately 595K synthetic labeled images and over 62 million real unlabeled images, using a DPT architecture with a DINOv2 backbone.

Key strengths

  • Delivers more detailed and robust depth maps than Depth Anything V1.
  • Outperforms Stable Diffusion–based models (e.g., Marigold, Geowizard) in accuracy while being over 10× faster and more lightweight.
  • Strong zero-shot generalization across diverse scenes and domains.
  • Pre-trained weights enable impressive fine-tuned performance on metric depth estimation (e.g., NYUv2, KITTI).

Performance

On standard benchmarks, the model achieves state-of-the-art results for both relative and absolute depth estimation. The V2 pipeline replaces all labeled real images with synthetic data, scales the teacher model capacity, and uses large-scale pseudo-labeled real images for student training, yielding finer and more robust predictions than V1.

Depth Anything overview diagram from the original paper

Availability

gigarouter hosts this model as a managed, OpenAI-compatible API. No local setup, pip installs, or GPU management required — simply call the API endpoint to run inference.

best for

FAQ

What is Depth Anything V2 Large best used for?

It is designed for robust monocular depth estimation in zero-shot settings, producing finer and more robust depth predictions than V1 and SD-based models.

How does Depth Anything V2 Large compare to Depth Anything V1?

V2 produces much finer and more robust depth predictions by using synthetic labeled images, a larger teacher model, and large-scale pseudo-labeled real images.

What input and output format does the model expect?

Input: a single RGB image. Output: a depth map (same spatial resolution as the input after interpolation).

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key; send an image URL or base64-encoded image and receive a depth map as response.

What is the license for Depth Anything V2 Large?

The model card does not specify a license; refer to the original repository for licensing details.

not yet live

We're benchmarking and onboarding Depth Anything V2 Large as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related depth estimation models

compare all →