DA3 Monocular Large
depth-anything/DA3MONO-LARGE
published Nov 2025 · updated Nov 2025
DA3 Monocular Large is a monocular relative depth estimation model that predicts depth directly from a single image, with better geometric accuracy than disparity-based models.
specs
| Task | Monocular Depth Estimation |
| Architecture | Plain Transformer (vanilla DINO encoder) with unified depth-ray representation |
| Parameters | 0.35B |
| License | Apache 2.0 |
about this model
DA3MONO-LARGE is a monocular depth estimation model that directly predicts relative depth from a single image, producing geometrically accurate depth maps without requiring disparity-based conversions.
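Because the prediction is relative rather than metric, comparing it against a metric reference (for example a sparse LiDAR or stereo ground truth) usually needs an alignment step first. The sketch below is a generic least-squares alignment, assuming the relative depth is defined up to an unknown scale and shift; the function name `align_scale_shift` and that assumption are illustrative and not taken from the model's documentation.

```python
import numpy as np


def align_scale_shift(pred, ref, mask=None):
    """Fit s, t minimizing ||s * pred + t - ref||^2 over the masked pixels.

    pred: relative depth map (any positive range), shape (H, W)
    ref:  reference depth in the target units, shape (H, W)
    mask: optional boolean array selecting pixels with a valid reference
    Returns the aligned prediction plus the fitted scale and shift.
    """
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    if mask is None:
        # Default: use every pixel where both maps hold finite values.
        mask = np.isfinite(pred) & np.isfinite(ref)

    x = pred[mask]
    y = ref[mask]
    # Design matrix [x, 1] so the solution vector is (scale, shift).
    design = np.stack([x, np.ones_like(x)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(design, y, rcond=None)
    return scale * pred + shift, float(scale), float(shift)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    relative = rng.uniform(0.1, 1.0, size=(4, 5))
    metric = 2.5 * relative + 0.3  # synthetic reference
    _, s, t = align_scale_shift(relative, metric)
    print(s, t)
```

If the model's output turns out to be scale-only ambiguous, the same fit works with the shift column dropped.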
Architecture and Design
Built on a single plain transformer (vanilla DINO encoder), the model uses a unified depth-ray representation that eliminates the need for complex multi-task learning. It has 0.35 billion parameters, is trained exclusively on public academic datasets, and is released under the Apache 2.0 license.
Capabilities
- Relative depth estimation with sky segmentation
Performance
According to the project page, Depth Anything 3 outperforms Depth Anything 2 in monocular depth estimation and VGGT in multi-view depth estimation and pose estimation. On standard benchmarks, it surpasses VGGT by 35.7% in camera pose accuracy and 23.6% in geometric accuracy; these two figures compare against VGGT in the multi-view setting and describe the Depth Anything 3 family rather than this monocular model specifically. The paper (arXiv:2511.10647) has been accepted at ICLR 2026 as an Oral presentation.
Limitations
The model is trained on academic datasets, so performance may degrade on domain-specific images or under poor lighting, low image quality, or complex scene conditions.
best for
- Relative depth estimation from a single image for scene understanding
- Sky segmentation in images
- ·Generating depth maps for 3D reconstruction or novel view synthesis
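For the 3D reconstruction use case, a depth map is typically lifted to a point cloud by back-projecting each pixel through a pinhole camera model. The sketch below is a generic version of that step, not the model's own export path; the intrinsics `fx`, `fy`, `cx`, `cy` are assumed to be supplied by the caller, and because the depth is relative the resulting cloud is only correct up to the unknown depth scaling.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map to an (H*W, 3) point cloud in camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels, and depth measured along the z axis.
    """
    depth = np.asarray(depth, dtype=np.float64)
    height, width = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    flat_wall = np.full((2, 3), 2.0)  # every pixel at depth 2
    points = backproject(flat_wall, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
    print(points.shape)
```

Pixels flagged as sky are usually dropped before this step, since they have no meaningful finite depth.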
FAQ
DA3 Monocular Large is designed for high-quality relative monocular depth estimation from a single image, directly predicting depth with better geometric accuracy than disparity-based models.
The model accepts image paths, PIL images, or numpy arrays. Outputs include depth maps, confidence maps, and optionally 3D exports in formats like GLB, NPZ, and PLY.
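Once a depth map and confidence map are available as numpy arrays, two common follow-up steps are rendering the depth as an 8-bit image and archiving both arrays as NPZ. The sketch below uses only numpy; the helper names, the percentile clipping, and the `depth` / `conf` key names are assumptions chosen for illustration and are not the layout of the model's own NPZ export.

```python
import io

import numpy as np


def depth_to_uint8(depth, valid=None, lo_pct=2.0, hi_pct=98.0):
    """Map a depth array to 0..255 using percentile clipping.

    valid: optional boolean mask of pixels to use (for example non-sky
    pixels); pixels outside the mask are rendered as 0.
    """
    depth = np.asarray(depth, dtype=np.float64)
    if valid is None:
        valid = np.isfinite(depth)
    lo, hi = np.percentile(depth[valid], [lo_pct, hi_pct])
    # Replace invalid pixels with the lower bound so they scale to 0.
    filled = np.where(valid, depth, lo)
    scaled = np.clip((filled - lo) / max(hi - lo, 1e-12), 0.0, 1.0)
    return (scaled * 255.0).round().astype(np.uint8)


def save_depth_npz(target, depth, conf):
    """Write depth and confidence to a compressed NPZ file or file object."""
    np.savez_compressed(
        target,
        depth=np.asarray(depth, dtype=np.float32),
        conf=np.asarray(conf, dtype=np.float32),
    )


if __name__ == "__main__":
    depth = np.linspace(0.5, 4.0, 12).reshape(3, 4)
    conf = np.ones_like(depth)
    image = depth_to_uint8(depth)
    buffer = io.BytesIO()
    save_depth_npz(buffer, depth, conf)
    print(image.dtype, buffer.tell() > 0)
```

Percentile clipping keeps a few extreme values from compressing the rest of the range; passing `lo_pct=0` and `hi_pct=100` falls back to plain min-max scaling.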
Use the OpenAI-compatible endpoint with your gigarouter API key. See the gigarouter documentation for details on the request format and endpoint.
The model is released under the Apache 2.0 license, which permits commercial use.
DA3 Monocular Large directly predicts depth (not disparity) and significantly outperforms Depth Anything V2 in monocular depth estimation accuracy.
We're benchmarking and onboarding DA3 Monocular Large as a hosted, OpenAI-compatible API.