BiRefNet
ZhengPeng7/BiRefNet
published Jul 2024 · updated Feb 2026
BiRefNet is a segmentation model for high-resolution dichotomous image segmentation (DIS).
specs
| Field | Value |
|---|---|
| Task | Dichotomous Image Segmentation (DIS) |
| Architecture | Bilateral Reference Framework (Localization Module + Reconstruction Module with bilateral reference) |
| License | MIT |
| Training Data | DIS-TR dataset |
about this model
BiRefNet is a dichotomous image segmentation model designed for high-resolution input, introduced in the paper "Bilateral Reference for High-Resolution Dichotomous Image Segmentation" (CAAI AIR 2024, DOI 10.26599/AIR.2024.9150038). It achieves state-of-the-art performance on dichotomous image segmentation (DIS), high-resolution salient object detection (HRSOD), and camouflaged object detection (COD).
Architecture and Capabilities
BiRefNet employs a bilateral reference framework comprising a localization module (LM) for object localization using global semantic information and a reconstruction module (RM) that leverages bilateral references: hierarchical image patches as source reference and gradient maps as target reference. Auxiliary gradient supervision is applied to improve fine-detail regions. The model was trained on the DIS-TR dataset and validated on DIS-TEs and DIS-VD.
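To make the "gradient map" idea more concrete, here is a minimal sketch that computes a per-pixel gradient-magnitude map from a grayscale image using finite differences. It is only an illustration of what such a map looks like: the function name `gradient_magnitude` and the choice of finite-difference operator are assumptions for this example, not the operator used in the BiRefNet codebase.

```python
import math


def gradient_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of rows).

    Central differences are used in the interior and one-sided
    differences at the borders.
    """
    h = len(image)
    w = len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices so border pixels fall back to one-sided differences.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (image[y][x1] - image[y][x0]) / max(x1 - x0, 1)
            gy = (image[y1][x] - image[y0][x]) / max(y1 - y0, 1)
            out[y][x] = math.hypot(gx, gy)
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
img = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
grad = gradient_magnitude(img)
```

The map is non-zero only around the edge, which is why gradient supervision tends to focus learning on boundaries and thin structures.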
Notable Performance and Optimizations
- As of June 2025, the `refine_foreground` operation was accelerated by approximately 8× to ~80 ms on an NVIDIA 5090 GPU using a GPU implementation of fast-fg-est.
- The Swin Transformer attention module was upgraded to PyTorch’s official SDPA (September 2025), reducing memory cost and offering potential speedups during training and inference.
- The arXiv paper (2401.03407) was updated to v7 in July 2025 with a corrected figure layout.
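For readers unfamiliar with SDPA: scaled dot-product attention computes softmax(QKᵀ/√d)·V, and PyTorch exposes a fused version as `torch.nn.functional.scaled_dot_product_attention`. The pure-Python sketch below only spells out the math for a single head with no masking; it is a reference illustration under those assumptions, not the PyTorch kernel or BiRefNet's Swin attention code.

```python
import math


def scaled_dot_product_attention(q, k, v):
    """Single-head attention over lists of vectors: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for qi in q:
        # Scaled similarity between this query and every key.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out.append([
            sum(wt * vj[c] for wt, vj in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out


# A zero query attends equally to both keys, so the output is the mean of the values.
result = scaled_dot_product_attention(
    [[0.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
)
```

A fused implementation avoids materializing the full attention-weight matrix step by step as this loop does, which is where the memory savings come from.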
Visual Examples
[Images: DIS Sample 1, DIS Sample 2]
An online inference GUI demo is available on Hugging Face and supports adjustable resolutions.
Additional Resources
Source code, weights, and a model zoo are available on the GitHub repository. The full paper provides further details on the bilateral reference design and experimental benchmarks.
best for
- High-resolution object segmentation
- Background removal and matting
- Fine-detail segmentation (e.g., hair, fur)
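Background removal with a predicted mask comes down to per-pixel alpha compositing, I = αF + (1 − α)B. The sketch below shows that step in pure Python, assuming the mask has already been scaled to [0, 1]; `composite` is a hypothetical helper written for this example and is not part of the BiRefNet codebase.

```python
def composite(foreground, background, alpha):
    """Blend two RGB images using a soft mask.

    Images are lists of rows of (r, g, b) tuples; alpha is a list of rows
    of floats in [0, 1], where 1 keeps the foreground pixel.
    """
    out = []
    for f_row, b_row, a_row in zip(foreground, background, alpha):
        out.append([
            tuple(a * f + (1.0 - a) * b for f, b in zip(f_px, b_px))
            for f_px, b_px, a in zip(f_row, b_row, a_row)
        ])
    return out


# One row, two pixels: a red subject placed over a blue background.
fg = [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
bg = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
mask = [[1.0, 0.25]]
blended = composite(fg, bg, mask)
```

Soft mask values between 0 and 1 are what preserve semi-transparent regions such as hair and fur when the subject is placed on a new background.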
FAQ
What does BiRefNet do?
BiRefNet performs high-resolution dichotomous image segmentation, precisely separating objects from their backgrounds, with strong performance on fine details.
What is the default inference resolution?
The default inference resolution is 1024×1024, though it can be adapted to other sizes.
What license is BiRefNet released under?
BiRefNet is released under the MIT license.
How do I call BiRefNet through the API?
Use the OpenAI-compatible endpoint with your gigarouter API key; refer to the gigarouter documentation for endpoint details.
How does BiRefNet perform on benchmarks?
BiRefNet achieves state-of-the-art results on dichotomous image segmentation (DIS) benchmarks and also performs well on high-resolution salient object detection (HRSOD) and camouflaged object detection (COD).
We're benchmarking and onboarding BiRefNet as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

