
DA3 Giant

depth-anything/DA3-GIANT-1.1

published Dec 2025 · updated Dec 2025

DA3 Giant is a vision transformer model for multi-view depth estimation, camera pose estimation, and 3D Gaussian estimation.

est. price: ~$0.235 / 1k images (estimated; set at launch)
API providers: 0
downloads / mo: 24.2K
license: cc-by-nc-4.0
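
As a rough illustration of what the estimated price means for a batch job, the sketch below scales the ~$0.235 / 1k images figure linearly. The function name and the assumption of purely linear pricing (no minimum charge, tiers, or free credit) are hypothetical; the final price is set at launch and may differ.

```python
# Estimated price quoted on this page: about $0.235 per 1,000 images.
EST_PRICE_PER_1K_IMAGES_USD = 0.235


def estimate_cost_usd(num_images: int,
                      price_per_1k: float = EST_PRICE_PER_1K_IMAGES_USD) -> float:
    """Linear cost estimate; assumes no minimums, tiers, or credits."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images / 1000 * price_per_1k


if __name__ == "__main__":
    # Example: a 300-frame clip submitted as 300 images.
    print(f"${estimate_cost_usd(300):.4f}")
```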

specs

Task: Multi-view depth estimation, camera pose estimation, 3D Gaussian estimation
Architecture: Plain transformer with unified depth-ray representation
Parameters: 1.15B
License: CC BY-NC 4.0 (non-commercial only)

about this model

depth-anything/DA3-GIANT-1.1 is a unified multi-view depth, camera pose, and 3D Gaussian estimation model developed by the ByteDance Seed Team. It uses a single plain transformer (vanilla DINO encoder) with a depth-ray representation, eliminating the need for specialized architectural branches or multi-task learning. The model has 1.15 billion parameters.

Capabilities and Performance

The model accepts any number of input images (with or without known camera poses) and outputs relative depth maps, confidence maps, camera extrinsics and intrinsics, and 3D Gaussians. It is licensed under CC BY-NC 4.0 (non-commercial only).
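
To show how the listed outputs fit together, here is a minimal NumPy sketch that lifts one depth map into a 3D point cloud using the matching intrinsics and extrinsics. It is an illustration under stated assumptions, not the model's own code: the function name is hypothetical, depth is assumed to be measured along the optical axis (z-depth), the 3x4 extrinsic is assumed to be a world-to-camera [R|t] matrix, and pixel coordinates are assumed to start at the top-left corner. Because the depth is described as relative, treat the scale of the result as arbitrary.

```python
import numpy as np


def unproject_depth(depth: np.ndarray, K: np.ndarray,
                    extrinsic_w2c: np.ndarray) -> np.ndarray:
    """Lift an (H, W) depth map to an (H*W, 3) array of world-space points.

    Assumptions (not confirmed by the model card):
      - depth is z-depth along the camera's optical axis
      - K is a 3x3 pinhole intrinsic matrix
      - extrinsic_w2c is a 3x4 [R|t] with x_cam = R @ x_world + t
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)

    # Back-project pixels to camera-frame rays with z = 1, then scale by depth.
    rays_cam = pix @ np.linalg.inv(K).T
    pts_cam = rays_cam * depth.reshape(-1, 1)

    # Invert the rigid transform: x_world = R^T (x_cam - t).
    R, t = extrinsic_w2c[:, :3], extrinsic_w2c[:, 3]
    return (pts_cam - t) @ R


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 2.0],
                  [0.0, 100.0, 1.0],
                  [0.0, 0.0, 1.0]])
    extrinsic = np.hstack([np.eye(3), np.zeros((3, 1))])  # camera at world origin
    depth = np.full((3, 4), 2.0)
    points = unproject_depth(depth, K, extrinsic)
    print(points.shape)
```

With several views, applying the same function per image and concatenating the results would give a merged cloud, provided the pose convention assumed here matches the one the model actually uses.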

Depth Anything 3 significantly outperforms Depth Anything 2 for monocular depth estimation and VGGT for multi-view depth and pose estimation. According to the project page, it surpasses VGGT by an average of 35.7% in camera pose accuracy and 23.6% in geometric accuracy. The work has been accepted as an ICLR 2026 Oral.

Extended Use

In SLAM pipelines (DA3-Long), the model reduces drift in large-scale environments and is reported to outperform COLMAP, which requires more than 48 hours. A streaming variant (DA3-Streaming) handles ultra-long video sequences with less than 12 GB of GPU memory by using sliding-window inference.
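
The general idea of sliding-window inference can be sketched as follows: the frame sequence is split into fixed-size, overlapping chunks so that only one chunk needs to be in GPU memory at a time, and the shared frames give neighbouring chunks something to align on. This is only an indexing sketch; the function name, window size, and overlap are hypothetical, and the page does not describe how DA3-Streaming actually chooses or aligns its windows.

```python
from typing import List, Tuple


def sliding_windows(num_frames: int, window: int,
                    overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges covering [0, num_frames).

    Consecutive windows share `overlap` frames; the last one may be shorter.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if num_frames <= 0:
        return []

    stride = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, num_frames)
        windows.append((start, end))
        if end >= num_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    # Illustrative values only: 10 frames, 4-frame windows, 1 shared frame.
    for start, end in sliding_windows(10, window=4, overlap=1):
        print(f"process frames [{start}, {end})")
```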

Training data consists exclusively of public academic datasets. Performance may vary with image quality, lighting, and scene complexity.

FAQ

What tasks does DA3 Giant support?

It supports multi-view depth estimation, camera pose estimation, and 3D Gaussian estimation.

How many parameters does DA3 Giant have?

1.15 billion.

What license applies to DA3 Giant?

CC BY-NC 4.0, which permits non-commercial use only.

How can I use DA3 Giant via the gigarouter API?

DA3 Giant is not yet live on gigarouter. Once it launches, send requests to the gigarouter OpenAI-compatible endpoint using your API key. The model accepts a list of images and returns depth maps, confidence maps, camera poses, and intrinsics.

What input and output formats are supported?

Input is a list of images (file paths, PIL Images, or numpy arrays). Output includes depth maps (float32), confidence maps (float32), camera extrinsics (3x4 float32), and intrinsics (3x3 float32). Export formats include glb, npz, ply, and more.
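
As a small illustration of the output layout described above, the sketch below checks that a set of per-image arrays has the stated shapes, casts them to float32, and writes them to an in-memory npz archive with NumPy. The function name, the key names inside the archive, and the assumption that results are stacked along a leading image axis are all hypothetical; the model's own exporter may organize its files differently.

```python
import io

import numpy as np


def pack_outputs(depth, conf, extrinsics, intrinsics) -> bytes:
    """Validate per-image outputs and serialize them as npz bytes.

    Assumed layout (hypothetical): N images stacked on the first axis,
    depth and conf as (N, H, W), extrinsics as (N, 3, 4), intrinsics as (N, 3, 3).
    """
    depth = np.asarray(depth, dtype=np.float32)
    conf = np.asarray(conf, dtype=np.float32)
    extrinsics = np.asarray(extrinsics, dtype=np.float32)
    intrinsics = np.asarray(intrinsics, dtype=np.float32)

    if depth.ndim != 3:
        raise ValueError("depth must have shape (N, H, W)")
    n = depth.shape[0]
    if conf.shape != depth.shape:
        raise ValueError("conf must match depth in shape")
    if extrinsics.shape != (n, 3, 4):
        raise ValueError("extrinsics must have shape (N, 3, 4)")
    if intrinsics.shape != (n, 3, 3):
        raise ValueError("intrinsics must have shape (N, 3, 3)")

    buffer = io.BytesIO()
    np.savez_compressed(buffer, depth=depth, conf=conf,
                        extrinsics=extrinsics, intrinsics=intrinsics)
    return buffer.getvalue()


if __name__ == "__main__":
    n, h, w = 2, 4, 6
    data = pack_outputs(np.ones((n, h, w)), np.ones((n, h, w)),
                        np.zeros((n, 3, 4)), np.tile(np.eye(3), (n, 1, 1)))
    loaded = np.load(io.BytesIO(data))
    print(sorted(loaded.files))
```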

not yet live

We're benchmarking and onboarding DA3 Giant as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
