Depth Anything 3 Giant

depth-anything/DA3-GIANT

published Nov 2025 · updated Nov 2025

Depth Anything 3 Giant is a depth model that performs multi-view depth estimation, camera pose estimation, and 3D Gaussian estimation from arbitrary views.

est. price

~$0.235 / 1k images · estimated, set at launch
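For rough budgeting, the estimate above can be scaled to a batch size. This is a minimal sketch that assumes strictly linear per-image pricing at the listed ~$0.235 per 1k images, with no minimums or volume tiers; the listing does not confirm the billing model, and `estimate_cost_usd` is an illustrative helper, not part of any API.

```python
def estimate_cost_usd(num_images: int, price_per_1k_usd: float = 0.235) -> float:
    """Estimate cost in USD, assuming linear per-image pricing (an assumption)."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images / 1000 * price_per_1k_usd


if __name__ == "__main__":
    # Print the estimate for a few example batch sizes.
    for n in (100, 1_000, 50_000):
        print(f"{n:>6} images -> ~${estimate_cost_usd(n):.4f}")
```

Because the price is marked as an estimate set at launch, treat the result as an order-of-magnitude figure.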

downloads / mo

83.4K

license

cc-by-nc-4.0

specs

Task	Multi-view depth estimation, camera pose estimation, 3D Gaussian estimation
Architecture	Plain transformer with unified depth-ray representation
Parameters	1.15B
License	CC BY-NC 4.0 (non-commercial only)

about this model

depth-anything/DA3-GIANT is a multi-view depth estimation model that predicts spatially consistent geometry from arbitrary visual inputs, with or without known camera poses, while also enabling camera pose estimation and 3D Gaussian estimation.

Developed by the ByteDance Seed team and accepted as an oral at ICLR 2026, DA3-GIANT is a 1.15B-parameter plain vision transformer that uses a unified depth-ray representation, eliminating the need for complex multi-task learning. The model is trained exclusively on public academic datasets and is released under a CC BY-NC 4.0 license (non-commercial use only).

Key capabilities

·Relative depth estimation from single or multiple images
·Camera pose (extrinsic and intrinsic) estimation
·3D Gaussian prediction and export (GLB, PLY, NPZ, etc.)
·Pose conditioning for controlled geometry
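To give a sense of what a PLY export contains, here is a minimal sketch that serializes XYZ points to the ASCII PLY format. It is not the model's own exporter: a 3D Gaussian PLY would carry additional per-vertex properties (for example scale, rotation, opacity, and color) that are omitted here, and `points_to_ascii_ply` is a hypothetical helper name.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def points_to_ascii_ply(points: Iterable[Point]) -> str:
    """Serialize XYZ points as an ASCII PLY document (positions only)."""
    pts = list(points)
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(pts)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    # One vertex per line, in the property order declared in the header.
    body = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in pts]
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    print(points_to_ascii_ply([(0.0, 0.0, 1.0), (1.5, -2.0, 3.0)]), end="")
```

The resulting text can be written to a `.ply` file and opened in common point-cloud viewers.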

Benchmark performance

According to the project page, DA3-GIANT surpasses the prior state-of-the-art VGGT by an average of 35.7% in camera pose accuracy and 23.6% in geometric accuracy, and it significantly outperforms Depth Anything 2 for monocular depth estimation. The project page also reports that replacing VGGT with DA3 in VGGT-Long (yielding DA3-Long) reduces drift in large-scale environments and even outperforms COLMAP, which it says requires 48+ hours to complete.

Model series

The DA3 family includes DA3-GIANT, DA3-Large, DA3-Base, and DA3-Small, plus specialized variants: DA3Metric-Large (metric depth), DA3Mono-Large (monocular), DA3Nested-Giant-Large, and DA3-Streaming (for ultra-long video sequences using under 12 GB of GPU memory).

Limitations

The model may show reduced performance on domain-specific images, and results depend on image quality, lighting, and scene complexity. The CC BY-NC 4.0 license restricts use to non-commercial applications.

For detailed benchmarks and the paper, see the project page and arXiv:2511.10647.

best for

·Multi-view depth estimation from unordered image sets
·Camera pose estimation for 3D reconstruction
·3D Gaussian splatting from images

FAQ

What input format does the API accept?

The API accepts a list of images, each given as a file path, a PIL Image, or a NumPy array.

Is this model free for commercial use?

No, it is licensed under CC BY-NC 4.0 for non-commercial use only.

How does Depth Anything 3 Giant compare to Depth Anything 2?

According to the project page, it significantly outperforms Depth Anything 2 for monocular depth estimation.

How do I call this model via the gigarouter API?

The hosted endpoint is not yet live. Once it is, call the model through the gigarouter OpenAI-compatible endpoint with your API key.

What is the output format?

Output includes depth maps, confidence maps, and camera extrinsics and intrinsics as NumPy arrays, with optional GLB/PLY exports.
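As an illustration of how these outputs fit together, the sketch below lifts one depth map into 3D points using its intrinsics and extrinsics. It rests on assumptions this page does not confirm: depth is measured along the optical axis (z-depth), intrinsics are a 3x3 pinhole matrix, extrinsics are a world-to-camera transform of shape (3, 4) or (4, 4), and pixel centers sit at integer coordinates. `unproject_depth` is a hypothetical helper, not a function of the model's library.

```python
import numpy as np


def unproject_depth(depth: np.ndarray, K: np.ndarray, w2c: np.ndarray) -> np.ndarray:
    """Lift an (H, W) depth map to (H*W, 3) world-space points.

    Assumes z-depth, a pinhole K, and a world-to-camera extrinsic matrix.
    """
    h, w = depth.shape
    # Pixel grid: u is the column index, v is the row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project to camera-frame rays with z = 1, then scale by depth.
    rays = pix @ np.linalg.inv(K).T
    cam_pts = rays * depth.reshape(-1, 1)
    # Invert X_cam = R @ X_world + t, i.e. X_world = R^T @ (X_cam - t).
    R, t = w2c[:3, :3], w2c[:3, 3]
    return (cam_pts - t) @ R


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 1.0], [0.0, 0.0, 1.0]])
    depth = np.full((2, 4), 2.0)
    w2c = np.eye(4)[:3]  # identity pose: camera frame equals world frame
    print(unproject_depth(depth, K, w2c).shape)
```

Points from several views land in a shared world frame only if the conventions above match the actual outputs, so check the model's documentation before relying on them.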

not yet live

We're benchmarking and onboarding Depth Anything 3 Giant as a hosted, OpenAI-compatible API.

related depth estimation models

Depth-Anything-V2-Small-hf

1.7M dl/mo

DA3METRIC-LARGE

825K dl/mo

depth-anything-large-hf

388.9K dl/mo

dpt-hybrid-midas

225.1K dl/mo

DA3NESTED-GIANT-LARGE-1.1

199.9K dl/mo

Depth-Anything-V2-Large-hf

199.1K dl/mo