Marigold Depth LCM

prs-eth/marigold-depth-lcm-v1-0

published Mar 2024 · updated May 2025

Marigold Depth LCM is a generative latent diffusion model for monocular depth estimation from a single image, fine-tuned using latent consistency distillation for fast inference.

status

coming soon

API providers

downloads / mo

2.5K

license

apache-2.0

specs

Task	Monocular Depth Estimation
Architecture	Latent Diffusion Model (LCM variant fine-tuned from Stable Diffusion 2)
License	Apache 2.0
Steps	1–4 denoising steps with the LCM scheduler
Effective Resolution	~768 pixels (longer side)

about this model

Marigold Depth LCM v1-0 is a monocular depth estimation model that produces affine-invariant depth maps from a single RGB image. It is fine-tuned from the Marigold Depth v1-0 checkpoint using latent consistency distillation, which yields a latent consistency model (LCM) that runs with the LCM scheduler in 1–4 denoising steps. The model operates in the compressed latent space of Stable Diffusion’s pretrained VAE and gives its best predictions at an effective resolution of approximately 768 pixels on the longer side. Outputs are a depth map with values normalized between 0 and 1 and, when the ensemble size exceeds 2, an uncertainty map.
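The settings described above can be passed to the Marigold depth pipeline in Hugging Face diffusers. What follows is a minimal sketch, assuming a diffusers release that ships `MarigoldDepthPipeline`; the helper names `build_call_kwargs` and `estimate_depth` are illustrative and not part of any library, and the hard 1–4 step check is this sketch's own guard rather than a limit enforced by the model.

```python
MODEL_ID = "prs-eth/marigold-depth-lcm-v1-0"


def build_call_kwargs(num_steps=4, ensemble_size=1):
    """Assemble pipeline call arguments (hypothetical helper).

    Encodes two points from the model description: the LCM checkpoint is
    meant for 1-4 denoising steps, and an uncertainty map is only
    requested when the ensemble size exceeds 2.
    """
    if not 1 <= num_steps <= 4:
        raise ValueError("the LCM checkpoint is intended for 1-4 denoising steps")
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return {
        "num_inference_steps": num_steps,
        "ensemble_size": ensemble_size,
        "processing_resolution": 768,  # longer side, in pixels
        "output_uncertainty": ensemble_size > 2,
    }


def estimate_depth(image, device="cuda", **settings):
    """Run the pipeline on a PIL image; downloads the weights on first use."""
    # Imported lazily so build_call_kwargs works without diffusers installed.
    from diffusers import MarigoldDepthPipeline

    pipe = MarigoldDepthPipeline.from_pretrained(MODEL_ID).to(device)
    out = pipe(image, **build_call_kwargs(**settings))
    # out.prediction holds the depth map in [0, 1]; out.uncertainty stays
    # None unless output_uncertainty was requested.
    return out.prediction, out.uncertainty
```

Calling `estimate_depth(img, num_steps=2, ensemble_size=5)` would request a two-step prediction averaged over five samples, together with an uncertainty map.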

Performance

The CVPR 2024 paper (Oral, Best Paper Award Candidate) reports that Marigold achieves state-of-the-art zero-shot performance across a wide range of datasets, with gains of more than 20% in specific cases. The model was developed by researchers at ETH Zurich (PRS-ETH) and is released under the Apache 2.0 license.
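Because the predictions are affine-invariant, they are defined only up to an unknown scale and shift, so comparing them with ground-truth depth generally requires fitting those two values first. The sketch below shows a closed-form least-squares fit followed by the absolute relative error, in plain Python over flattened pixel lists. The function names are illustrative, and the published evaluation protocol may differ in details such as masking invalid pixels.

```python
def align_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t fits target."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_p = sum(pred) / n
    mean_g = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant, so the scale is undefined")
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, target))
    scale = cov / var_p
    shift = mean_g - scale * mean_p
    return scale, shift


def abs_rel(pred, target):
    """Mean absolute relative error; target values must be positive."""
    return sum(abs(p - g) / g for p, g in zip(pred, target)) / len(target)


# Toy example: the ground truth is an exact affine transform of the prediction.
pred = [0.0, 0.25, 0.5, 1.0]
target = [2.0, 4.0, 6.0, 10.0]
scale, shift = align_scale_shift(pred, target)
aligned = [scale * p + shift for p in pred]
```

In this toy case the fit recovers the transform exactly, so the aligned prediction has zero error; on real data the residual after alignment is what the metric measures.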

Deprecation Notice

This model (v1-0) is deprecated. The newer Marigold Depth v1-1 checkpoint is recommended for improved results. For further technical details, see the project website, journal extension, and official codebase.

best for

· Generating depth maps for 3D reconstruction from single images
· Depth-based image editing and compositing
· Augmenting robot perception with fast depth estimation

FAQ

What is this model best for?

Fast monocular depth estimation in 1–4 denoising steps, suitable for real-time or near-real-time applications.

How does it compare to the original Marigold Depth v1-0?

It is distilled from v1-0 with latent consistency distillation, so it needs far fewer denoising steps (1–4 instead of 10–50) while maintaining similar accuracy.

What license is it under?

Apache 2.0.

What input formats does it accept?

Single RGB images; the model processes them at an effective resolution of ~768 pixels on the longer side.
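To see what the ~768-pixel figure means for a given input, the following sketch computes the size an image would have after scaling its longer side to 768 while keeping the aspect ratio. The function name is illustrative, and how a particular pipeline rounds, pads, or treats images smaller than 768 pixels is an assumption here.

```python
def processing_size(width, height, longer_side=768):
    """Scale (width, height) so the longer side equals longer_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Multiply before dividing to keep the rounding error small.
    new_w = max(1, round(width * longer_side / longest))
    new_h = max(1, round(height * longer_side / longest))
    return new_w, new_h


landscape = processing_size(1920, 1080)
portrait = processing_size(600, 800)
```

A 1920×1080 frame maps to 768×432 and a 600×800 portrait to 576×768, so predictions are typically resized back to the original resolution afterwards.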

How do I call it via the API?

Once the model is live, use the gigarouter OpenAI-compatible endpoint with an API key and send the image as a base64-encoded string or a URL.
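The request schema for the hosted endpoint is not published here, so the sketch below covers only the image-encoding step. It shows how raw image bytes could be turned into a base64 string, wrapped as a data URL; the data URL wrapper and the function name are assumptions, not a documented requirement of the API.

```python
import base64


def to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# In practice image_bytes would come from open(path, "rb").read().
example = to_data_url(b"abc")
```

If the endpoint expects the bare base64 string instead, the part after the comma in the result can be sent on its own.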

not yet live

We're benchmarking and onboarding Marigold Depth LCM as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related depth estimation models

Depth-Anything-V2-Small-hf

1.7M dl/mo

DA3METRIC-LARGE

825K dl/mo

depth-anything-large-hf

388.9K dl/mo

dpt-hybrid-midas

225.1K dl/mo

DA3NESTED-GIANT-LARGE-1.1

199.9K dl/mo

Depth-Anything-V2-Large-hf

199.1K dl/mo