BiRefNet HR
ZhengPeng7/BiRefNet_HR
published Feb 2025 · updated Feb 2026
BiRefNet HR is a segmentation model for high-resolution dichotomous image segmentation (DIS), using bilateral reference to separate foreground from background.
specs
| Spec | Value |
|---|---|
| Task | Dichotomous Image Segmentation (foreground/background) |
| Architecture | BiRefNet with bilateral reference (Swin Transformer backbone) |
| License | MIT |
| Input Resolution | 2048x2048 (trained and optimized for high resolution) |
about this model
ZhengPeng7/BiRefNet_HR is a high-resolution dichotomous image segmentation (DIS) model that refines the Bilateral Reference (BiRefNet) architecture to process native 2048x2048 resolution inputs, achieving state‑of‑the‑art results on DIS, high‑resolution salient object detection (HRSOD), and camouflaged object detection (COD) tasks.
Key Strengths
The model was trained exclusively on 2048x2048 images, enabling precise boundary prediction for fine‑grained segmentation. Under the hood, the Swin Transformer attention has been upgraded to PyTorch’s official scaled dot‑product attention (SDPA) for reduced memory cost, and the refine‑foreground step is accelerated 8× via a GPU‑based fast‑fg‑est implementation (approx. 80 ms on an NVIDIA 5090). The model is released under the MIT license.
Performance on DIS‑VD (FP16)
| Method | Resolution | maxFm | wFmeasure | MAE | Smeasure | meanEm | HCE | maxEm | meanFm | adpEm | adpFm | mBA | maxBIoU | meanBIoU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BiRefNet_HR (epoch 130) | 2048×2048 | .925 | .894 | .026 | .927 | .952 | 811 | .960 | .909 | .944 | .888 | .828 | .837 | .817 |
| BiRefNet_HR (epoch 130) | 1024×1024 | .876 | .840 | .041 | .893 | .913 | 1348 | .926 | .860 | .930 | .857 | .765 | .769 | .742 |
| BiRefNet (epoch 244) | 2048×2048 | .888 | .858 | .037 | .898 | .934 | 811 | .941 | .878 | .927 | .862 | .802 | .790 | .776 |
| BiRefNet (epoch 244) | 1024×1024 | .908 | .877 | .034 | .912 | .943 | 1128 | .953 | .894 | .944 | .881 | .796 | .812 | .789 |
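Of the metrics in the table, MAE is the simplest to reproduce by hand: it is the mean absolute per-pixel difference between the predicted probability map and the binary ground-truth mask, so lower is better. The following is a minimal sketch, assuming both maps are given as nested lists of floats in [0, 1]. The function name is illustrative, and this is not the official evaluation code behind the numbers above.

```python
def mean_absolute_error(pred, truth):
    """Mean absolute per-pixel difference between two same-sized 2D maps."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("prediction and ground truth must have the same shape")

    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += abs(p - t)
            count += 1

    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


# Toy 2x2 example: predicted foreground probabilities vs. a binary mask.
pred = [[0.9, 0.1], [0.8, 0.2]]
truth = [[1.0, 0.0], [1.0, 0.0]]
mae = mean_absolute_error(pred, truth)
print(round(mae, 3))
```

A real evaluation would average this value over every image in the benchmark split and would typically use a vectorized array library rather than Python loops.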
The underlying BiRefNet work is published in CAAI Artificial Intelligence Research (2024, 3: 9150038). The model is hosted as a managed API on gigarouter, requiring no local installation.
best for
- High-resolution foreground object segmentation (e.g., product images, portraits)
- Dichotomous image segmentation for datasets like DIS-VD, DIS-TE
- High-resolution salient object detection (HRSOD) and camouflaged object detection (COD)
FAQ
What is BiRefNet HR best at?
It excels at high-resolution dichotomous image segmentation, separating foreground from background at resolutions up to 2048x2048. It also reports state-of-the-art results on HRSOD and COD tasks.
What are the input and output formats?
The input is an image (preferably 2048x2048) normalized with the ImageNet mean and standard deviation. The output is a single-channel probability map (after a sigmoid) giving the per-pixel likelihood of foreground.
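The input and output conventions above can be sketched in a few lines. This is a dependency-free illustration, not the model's actual pipeline: it assumes the widely used ImageNet RGB statistics, assumes the network emits raw logits that are passed through a sigmoid, and picks a 0.5 threshold for binarizing the mask. The function names and the threshold are assumptions, and a real pipeline would use a tensor library for this.

```python
import math

# ImageNet per-channel statistics (RGB) commonly used for normalization.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (channel / 255.0 - mean) / std
        for channel, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logits_to_mask(logits, threshold=0.5):
    """Turn a 2D grid of logits into probabilities and a binary mask.

    The 0.5 threshold is an assumption chosen for illustration.
    """
    probs = [[sigmoid(value) for value in row] for row in logits]
    mask = [[1 if p >= threshold else 0 for p in row] for row in probs]
    return probs, mask


# Toy example: one white pixel and a 2x2 grid of made-up logits.
white = normalize_pixel((255, 255, 255))
probs, mask = logits_to_mask([[-4.0, 0.0], [2.0, 6.0]])
print(mask)
```

Keeping the probability map alongside the binary mask is useful when soft edges matter, for example when compositing a cut-out onto a new background.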
What license is the model released under?
The model is released under the MIT license.
How do I use it?
Use the gigarouter OpenAI-compatible endpoint with your API key. Send the image as a URL or base64 payload and receive the segmentation mask in the response.
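As a rough illustration of the base64 option, the sketch below encodes image bytes as a data URL and assembles the headers and JSON body for a request. The request schema is not documented here, so the `model` and `image` field names are hypothetical, the bearer-token header is assumed from the usual OpenAI-style convention, and the endpoint URL is deliberately left out. Nothing is sent over the network.

```python
import base64
import json


def image_to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_request(image_ref, api_key, model="ZhengPeng7/BiRefNet_HR"):
    """Build headers and a JSON body for a segmentation request.

    The "model" and "image" field names are hypothetical placeholders;
    check the provider's API reference for the real schema.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed OpenAI-style auth
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "image": image_ref})
    return headers, body


# Placeholder bytes (just the PNG signature) stand in for a real image file.
fake_png = b"\x89PNG\r\n\x1a\n"
data_url = image_to_data_url(fake_png)
headers, body = build_request(data_url, api_key="YOUR_API_KEY")
print(body)
```

Passing a plain image URL as `image_ref` instead of a data URL would cover the URL option in the same way.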
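How does the HR variant differ from standard BiRefNet?
The HR variant is trained at 2048x2048 resolution, while the standard BiRefNet is trained at 1024x1024. On DIS-VD at 2048x2048, the HR variant reaches a maxFm of .925 versus .888 for the standard model. At 1024x1024 the standard model is stronger (.908 versus .876), so each variant performs best at its own training resolution.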
BiRefNet HR is currently being benchmarked and onboarded as a hosted, OpenAI-compatible API.