Depth Anything 3 Nested Giant-Large

depth-anything/DA3NESTED-GIANT-LARGE-1.1

published Dec 2025 · updated Dec 2025

Depth Anything 3 Nested Giant-Large is a depth model that combines the any-view Giant model with the metric Large model for metric-scale visual geometry reconstruction.

est. price: ~$0.626 / 1k images (estimated, set at launch)
API providers: 0
downloads / mo: 199.9K
license: cc-by-nc-4.0
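
The estimated price converts to a per-job cost by simple proportion. The following is a minimal sketch, assuming the ~$0.626 per 1k images estimate holds at launch and that billing scales linearly per image; both are assumptions, and the function name is illustrative rather than part of any gigarouter API.

```python
# Assumed figure taken from the estimate above; the real price is set at launch.
EST_PRICE_PER_1K_IMAGES_USD = 0.626


def estimate_cost_usd(num_images: int,
                      price_per_1k: float = EST_PRICE_PER_1K_IMAGES_USD) -> float:
    """Return a rough cost estimate, assuming linear per-image billing."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images / 1000 * price_per_1k


if __name__ == "__main__":
    for n in (100, 1_000, 25_000):
        print(f"{n:>6} images: about ${estimate_cost_usd(n):.2f}")
```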

specs

Task: Depth Estimation, Pose Estimation, 3D Reconstruction
Architecture: Plain transformer with unified depth-ray representation
Parameters: 1.40B
License: CC BY-NC 4.0 (non-commercial only)
Training Data: Public academic datasets

about this model

DA3NESTED-GIANT-LARGE is a depth estimation model that recovers spatially consistent metric-scale geometry from arbitrary visual inputs, combining the any-view Giant model with the metric Large model in a nested architecture.

With 1.40B parameters, it supports relative depth, metric depth, pose estimation, pose conditioning, 3D Gaussian estimation, and sky segmentation. The model uses a single plain transformer with a unified depth-ray representation, eliminating the need for complex multi-task learning.

On the benchmarks reported on the project page, Depth Anything 3 surpasses VGGT by an average of 35.7% in camera pose accuracy and 23.6% in geometric accuracy, and it outperforms Depth Anything 2 on monocular depth estimation. The model has been accepted at ICLR 2026 as an Oral presentation.

Key capabilities demonstrated in the research include feed-forward 3D Gaussian estimation for novel view synthesis and DA3-Long, which reduces drift in large-scale SLAM and matches or exceeds COLMAP (which requires over 48 hours). The DA3-Streaming variant handles ultra-long video sequences with less than 12 GB of GPU memory via sliding-window inference.
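
The general idea behind sliding-window inference can be illustrated by splitting a long frame sequence into overlapping chunks, so that only one chunk needs to be in GPU memory at a time. This is a generic sketch, not the DA3-Streaming implementation: the window size, the overlap, and the function name are hypothetical, and how DA3-Streaming aligns adjacent windows is not described here.

```python
from typing import Iterator, Tuple


def sliding_windows(num_frames: int, window: int, overlap: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) frame ranges that cover [0, num_frames).

    Consecutive ranges share `overlap` frames, which a real pipeline could
    use to align the predictions of neighbouring windows.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    stride = window - overlap
    start = 0
    while start < num_frames:
        end = min(start + window, num_frames)
        yield start, end
        if end == num_frames:
            break  # the last window already reaches the final frame
        start += stride


if __name__ == "__main__":
    # Hypothetical settings: 10 frames, 4 frames per window, 1 shared frame.
    for start, end in sliding_windows(10, window=4, overlap=1):
        print(f"process frames [{start}, {end})")
```

A larger overlap gives neighbouring windows more shared frames to align on, at the cost of processing more frames in total.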

This model is developed by the ByteDance Seed Team and is licensed under CC BY-NC 4.0 (non-commercial use only).

FAQ

What is this model best for?

It excels at metric-scale visual geometry reconstruction from any number of input views, producing depth, camera pose, and 3D Gaussians.

How does it compare to Depth Anything 2 and VGGT?

It significantly outperforms Depth Anything 2 on monocular depth and VGGT on multi-view depth and pose estimation.

What are the license terms?

CC BY-NC 4.0 – non-commercial use only.

What input formats does it accept?

Image paths, PIL Images, or numpy arrays. Output includes depth maps, confidence maps, camera extrinsics/intrinsics, and 3D exports (e.g., GLB, PLY).
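
A metric depth map and camera intrinsics are enough to recover 3D points under the standard pinhole camera model. The sketch below shows that back-projection in plain Python. It assumes the depth map is given as rows of metric depths and that the intrinsics are the usual focal lengths (fx, fy) and principal point (cx, cy); the layout of the model's actual output objects is not specified here, so the function and its arguments are illustrative only.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def backproject_depth(depth: Sequence[Sequence[float]],
                      fx: float, fy: float,
                      cx: float, cy: float) -> List[Point3D]:
    """Back-project a depth map into camera-frame 3D points (pinhole model).

    depth[v][u] is the depth along the optical axis at pixel column u, row v.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one invalid pixel; intrinsics are made up.
    toy_depth = [[2.0, 0.0],
                 [2.0, 2.0]]
    for point in backproject_depth(toy_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5):
        print(point)
```

To place the points in a shared world frame across views, each point would additionally be transformed with that view's camera extrinsics.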

How do I call this model via the gigarouter API?

The model is not yet live on gigarouter. Once it launches, call it through the gigarouter OpenAI-compatible endpoint with your API key, and refer to the gigarouter documentation for the exact endpoint and request format.

not yet live

We're benchmarking and onboarding Depth Anything 3 Nested Giant-Large as a hosted, OpenAI-compatible API.