Question 1

What is Depth Anything Large best for?

Accepted Answer

It excels at zero-shot monocular depth estimation on arbitrary images, including in-the-wild photos, and can be fine-tuned for metric depth estimation on datasets such as NYUv2 and KITTI.

Question 2

How does Depth Anything Large compare to MiDaS?

Accepted Answer

It outperforms MiDaS v3.1 (BEiT L-512) in zero-shot relative depth estimation and yields better results when used as the depth source for a depth-conditioned ControlNet.

Question 3

What are the input and output formats for the model?

Accepted Answer

Input is an RGB image, resized to 518x518 and normalized; output is a per-pixel map of relative inverse depth, so larger values indicate closer surfaces.
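
The normalization step and the meaning of the output values can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the model's actual preprocessing code: it assumes the ImageNet mean and standard deviation commonly used with ViT backbones, and the helper names `normalize_pixel` and `depth_to_uint8` are hypothetical. In practice the same arithmetic would run on whole arrays or tensors.

```python
# Assumed ImageNet statistics (per RGB channel, for values scaled to [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit (R, G, B) pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def depth_to_uint8(depth):
    """Min-max scale a 2D relative inverse-depth map to 0..255 for viewing.

    Larger inverse-depth values (closer surfaces) map to brighter pixels.
    """
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Flat map: there is no contrast to show, so return all zeros.
        return [[0 for _ in row] for row in depth]
    return [[round(255 * (v - lo) / span) for v in row] for row in depth]
```

Because the output is relative, the min-max scaling above is only for visualization; the resulting 0..255 values carry no metric meaning.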

Question 4

How can I use Depth Anything Large via the gigarouter API?

Accepted Answer

Call the OpenAI-compatible endpoint with your API key and send either an image URL or a base64-encoded image; the API returns the depth map as a tensor or an image.
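
A request of this kind might be assembled as follows. This is a sketch under stated assumptions, not the documented gigarouter API: the endpoint URL, the model identifier, and the JSON field names (`model`, `image`) are placeholders to be replaced with the values from the provider's documentation. Only the base64 encoding and the bearer-token header follow common OpenAI-compatible conventions.

```python
import base64
import json
import urllib.request

# Placeholders (assumptions): substitute the real values from the provider's docs.
API_URL = "https://example.invalid/v1/depth"  # hypothetical endpoint
MODEL_ID = "depth-anything-large"             # hypothetical model identifier


def image_to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_depth_request(api_key, image):
    """Build (but do not send) an HTTP POST request.

    `image` is either an http(s) URL string or raw image bytes.
    The body schema used here is an assumption, not a documented contract.
    """
    image_ref = image if isinstance(image, str) else image_to_data_url(image)
    body = json.dumps({"model": MODEL_ID, "image": image_ref}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(request)`; how the returned depth map is decoded depends on whether the API responds with a tensor or an image.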

Question 5

Does Depth Anything Large support video depth estimation?

Accepted Answer

Yes, the official repository includes a script for video depth visualization (run_video.py), and the model can be applied frame-by-frame.
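
Frame-by-frame application amounts to running the single-image model on each frame independently. The sketch below illustrates that loop only; it is not the repository's `run_video.py`, and `estimate_depth` is a hypothetical stand-in for whatever function wraps the model. Because each frame is processed on its own, nothing in this loop enforces consistency between consecutive frames.

```python
def depth_per_frame(frames, estimate_depth):
    """Yield (frame_index, depth_map) for each frame, in order.

    `frames` is any iterable of decoded frames; `estimate_depth` is a
    hypothetical callable mapping one frame to one depth map.
    """
    for index, frame in enumerate(frames):
        yield index, estimate_depth(frame)


if __name__ == "__main__":
    # Stand-in "model": treats mean pixel brightness as depth, for demonstration only.
    def fake_estimate_depth(frame):
        return [[sum(pixel) / 3 for pixel in row] for row in frame]

    video = [
        [[(0, 0, 0), (30, 60, 90)]],     # frame 0: one row, two pixels
        [[(255, 255, 255), (3, 3, 3)]],  # frame 1
    ]
    for index, depth in depth_per_frame(video, fake_estimate_depth):
        print(index, depth)
```

Writing the loop as a generator keeps only one frame's depth map in memory at a time, which matters for long videos.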

Task	Monocular Depth Estimation
Architecture	Vision Transformer (ViT-L) with DPT head
Parameters	~335 million
Training Data	1.5M labeled images + 62M+ unlabeled images

