Depth Pro

apple/DepthPro

published Oct 2024 · updated Feb 2025

Depth Pro is a depth model that produces sharp, metric monocular depth maps with absolute scale, without needing camera intrinsics, in under a second.

status

coming soon

API providers

downloads / mo

4.7K

license

apple-amlr

specs

Task	Monocular Depth Estimation
Architecture	Multi-scale Vision Transformer
License	Custom Apple License

about this model

Depth Pro is a foundation model for zero-shot metric monocular depth estimation that produces high-resolution depth maps with absolute scale, without requiring camera intrinsics or other metadata. It generates a 2.25-megapixel depth map in 0.3 seconds on a V100 GPU.

Depth Pro demo image showing depth estimation output

The model uses an efficient multi-scale vision transformer for dense prediction and is trained on a combination of real and synthetic datasets to achieve both metric accuracy and fine boundary detail. It also performs state-of-the-art focal length estimation from a single image.

Key capabilities

Zero-shot metric depth with absolute scale, no camera intrinsics needed
High-resolution depth maps with sharp boundaries and high-frequency detail
Fast inference: 0.3 seconds for a 2.25-megapixel image on a V100 GPU
Integrated focal length estimation from a single image
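
Because the depth map is metric and the focal length is estimated alongside it, the two outputs can be combined to lift pixels into 3D camera space. The following is a minimal sketch of that idea, not part of Depth Pro itself: `backproject` is a hypothetical helper, and it assumes a pinhole camera with square pixels and the principal point at the image center, none of which the model card states.

```python
import numpy as np


def backproject(depth_m: np.ndarray, f_px: float) -> np.ndarray:
    """Lift an (H, W) metric depth map to (H, W, 3) camera-space points in meters.

    Assumptions (not from the model card): pinhole camera, square pixels,
    principal point at the image center.
    """
    h, w = depth_m.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    # u runs along columns (x axis), v along rows (y axis).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth_m / f_px
    y = (v - cy) * depth_m / f_px
    return np.stack([x, y, depth_m], axis=-1)
```

With a relative (non-metric) depth map the same computation would only recover the scene up to an unknown scale, which is why absolute scale matters for applications such as AR/VR.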

Benchmark performance

The paper was published at ICLR 2025. In qualitative comparisons, Depth Pro outperforms prior work, including Marigold, Depth Anything v2, and Metric3D v2, on boundary sharpness and detail preservation. The reference implementation in this repository was re-trained; its performance is close to, but does not exactly match, the model reported in the paper.

Evaluation metrics

The paper introduces dedicated metrics for boundary accuracy in estimated depth maps: scale-invariant boundary F1 for datasets with ground-truth depth, and scale-invariant boundary recall for datasets with ground-truth masks (image matting or segmentation).
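
To give a feel for what a scale-invariant boundary F1 measures, here is a simplified sketch. It is not the reference implementation: it marks an edge wherever the depth ratio between horizontally or vertically adjacent pixels exceeds a single threshold, then compares predicted and ground-truth edges. The function names and the threshold value `t=1.15` are illustrative assumptions, and the sketch omits refinements a full metric would likely include, such as averaging over several thresholds.

```python
import numpy as np


def occluding_edges(depth: np.ndarray, t: float) -> np.ndarray:
    """Flag neighbor pairs whose depth ratio exceeds t (t > 1, depth > 0)."""
    right = depth[:, 1:] / depth[:, :-1]   # ratio to the right-hand neighbor
    down = depth[1:, :] / depth[:-1, :]    # ratio to the neighbor below
    # Include both directions so near-to-far and far-to-near jumps both count.
    ratios = np.concatenate(
        [right.ravel(), (1.0 / right).ravel(), down.ravel(), (1.0 / down).ravel()]
    )
    return ratios > t


def boundary_f1_sketch(pred: np.ndarray, target: np.ndarray, t: float = 1.15) -> float:
    """Simplified boundary F1 between predicted and ground-truth depth."""
    pred_edges = occluding_edges(pred, t)
    target_edges = occluding_edges(target, t)
    tp = np.count_nonzero(pred_edges & target_edges)
    if tp == 0:
        return 0.0
    precision = tp / np.count_nonzero(pred_edges)
    recall = tp / np.count_nonzero(target_edges)
    return 2.0 * precision * recall / (precision + recall)
```

Because edges are defined by depth ratios rather than differences, multiplying a prediction by any positive constant leaves the score unchanged, which is the sense in which the metric is scale-invariant.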

Additional details

Code and weights are released under a custom Apple license.
Example images for boundary and sharpness comparisons use the AM-2k and DIS-5k datasets.
For full technical details, see the paper: Depth Pro: Sharp Monocular Metric Depth in Less Than a Second (arXiv 2410.02073, v2 revised April 2025).

best for

·High-resolution depth maps for AR/VR applications
·Sub-second metric depth from a single image without camera intrinsics
·Focal length estimation from a single image

FAQ

What is Depth Pro best used for?

Depth Pro excels at zero-shot metric depth estimation from a single image, producing sharp, high-resolution depth maps with absolute scale and fine boundary details.

How fast is Depth Pro?

It produces a 2.25-megapixel depth map in 0.3 seconds on a V100 GPU.

What license does Depth Pro use?

The code and model weights are released under a custom Apple License, not a standard open-source license.

What input format does Depth Pro accept?

It accepts a single RGB image; the model also optionally takes focal length in pixels (if available) to improve depth accuracy.
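
As a rough illustration of supplying the optional focal length, the sketch below does two things. `focal_35mm_to_px` is a hypothetical helper that converts a 35mm-equivalent focal length (as often found in EXIF metadata) to pixels using the image diagonal. `estimate_depth` follows the usage pattern from the upstream `depth_pro` Python package as best recalled here; treat the call names and return keys as assumptions to verify against the installed version, and note that the package expects a downloaded checkpoint.

```python
import math
from typing import Optional


def focal_35mm_to_px(f_35mm: float, width: int, height: int) -> float:
    """Convert a 35mm-equivalent focal length to pixels via the image diagonal."""
    full_frame_diag_mm = math.hypot(36.0, 24.0)  # 36 x 24 mm full-frame sensor
    return f_35mm * math.hypot(width, height) / full_frame_diag_mm


def estimate_depth(image_path: str, f_px: Optional[float] = None):
    """Run Depth Pro on one image; API names assumed from the upstream package."""
    # Imported lazily so the helper above works without the package installed.
    import depth_pro

    model, transform = depth_pro.create_model_and_transforms()
    model.eval()
    # load_rgb is assumed to return (image, icc_profile, focal length from EXIF or None).
    image, _, f_px_exif = depth_pro.load_rgb(image_path)
    if f_px is None:
        f_px = f_px_exif  # may still be None; the model then estimates it
    prediction = model.infer(transform(image), f_px=f_px)
    # Depth in meters and focal length in pixels, per the assumed return keys.
    return prediction["depth"], prediction["focallength_px"]
```

For example, a 50 mm equivalent lens on a 3600 x 2400 image corresponds to a focal length of 5000 pixels under this conversion.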

How can I call Depth Pro via the API?

Depth Pro is not yet live as a hosted API. Once it is, you will call it through the gigarouter OpenAI-compatible endpoint with an API key, sending the image as input and receiving the depth map and focal length as output.

not yet live

We're benchmarking and onboarding Depth Pro as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related depth estimation models


Depth-Anything-V2-Small-hf

1.7M dl/mo

DA3METRIC-LARGE

825K dl/mo

depth-anything-large-hf

388.9K dl/mo

dpt-hybrid-midas

225.1K dl/mo

DA3NESTED-GIANT-LARGE-1.1

199.9K dl/mo

Depth-Anything-V2-Large-hf

199.1K dl/mo