
BiRefNet Lite

ZhengPeng7/BiRefNet_lite

published Aug 2024 · updated Feb 2026

BiRefNet Lite is a lightweight segmentation model for high-resolution dichotomous (binary) image segmentation, extracting precise foreground objects.

est. price

~$0.047

/ 1k images · estimated, set at launch

API providers

downloads / mo

24.8K

specs

Task	Dichotomous Image Segmentation
Architecture	BiRefNet with Swin Transformer Tiny backbone
License	MIT

about this model

ZhengPeng7/BiRefNet_lite is a dichotomous image segmentation model for high-resolution images, based on a bilateral reference architecture. It achieves state-of-the-art performance on three tasks: dichotomous image segmentation (DIS), high-resolution salient object detection (HRSOD), and camouflaged object detection (COD). The model was trained on the DIS-TR dataset and validated on DIS-TEs and DIS-VD.

Architecture and Training

The architecture uses a bilateral reference mechanism that captures both global and local context for high-resolution inputs. The "lite" variant employs a Swin Transformer Tiny backbone, trading some accuracy for lower compute cost. Recent upgrades include switching the Swin attention to PyTorch's scaled dot-product attention (SDPA), which reduces memory usage and may speed up inference, and an 8× speedup of the refine_foreground step, which now takes approximately 80 ms on an NVIDIA RTX 5090 GPU.
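For readers unfamiliar with SDPA, the operation it computes is softmax(QKᵀ / √d) · V. The sketch below is a naive pure-Python reference of that formula only. It is not the BiRefNet code and not PyTorch's fused kernel (exposed as torch.nn.functional.scaled_dot_product_attention), so it does not show the memory savings; the function name and the list-of-lists layout are assumptions made for illustration.

```python
import math


def scaled_dot_product_attention(q, k, v):
    """Naive reference: softmax(q @ k^T / sqrt(d)) @ v on lists of rows.

    q: list of query rows, k: list of key rows, v: list of value rows.
    Illustrative only; real implementations use fused tensor kernels.
    """
    d = len(q[0])
    out = []
    for q_row in q:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d)
            for k_row in k
        ]
        # Numerically stable softmax over the keys.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value rows.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v))
            for j in range(len(v[0]))
        ])
    return out
```

With two identical keys the attention weights are equal, so the output is the plain average of the two value rows.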

Performance

Results on the DIS, HRSOD, and COD benchmarks are reported in the BiRefNet paper, published in CAAI Artificial Intelligence Research (2024, vol. 3, p. 9150038, DOI 10.26599/AIR.2024.9150038).

[Figure: example segmentation outputs, DIS-Sample 1 and DIS-Sample 2]

Additional Capabilities

The model also supports box-guided segmentation, enabling interactive refinement of results. An online GUI demo is available on Hugging Face Spaces for testing with adjustable input resolutions.

[Figure: screenshot of the BiRefNet demo interface]

For further details, see the BiRefNet GitHub repository, which hosts the full source code, documentation, and a model zoo.

best for

· Background removal from high-resolution images
· Object extraction for e-commerce product photos
· Precise binary masks for compositing or editing
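As a rough illustration of the compositing use case, a mask can blend a foreground image over a new background per pixel: out = m · fg + (1 − m) · bg. The sketch below assumes images are nested lists of (R, G, B) tuples and that mask values lie in [0, 1]; the function name and data layout are hypothetical, and a real pipeline would typically use an array or imaging library instead.

```python
def composite(fg, bg, mask):
    """Blend fg over bg using a mask with values in [0, 1].

    fg, bg: rows of (R, G, B) tuples with identical dimensions.
    mask:   rows of floats or ints, 1 = keep foreground, 0 = keep background.
    """
    out = []
    for fg_row, bg_row, m_row in zip(fg, bg, mask):
        row = []
        for f, b, m in zip(fg_row, bg_row, m_row):
            # Per-channel linear blend, rounded to the nearest integer.
            row.append(tuple(round(m * fc + (1 - m) * bc) for fc, bc in zip(f, b)))
        out.append(row)
    return out
```

A binary mask selects each pixel from exactly one image, while a soft (probability) mask produces blended edges.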

FAQ

What is dichotomous image segmentation?

It is binary segmentation that separates foreground objects from the background at high resolution, producing a single-channel mask.

How can I call BiRefNet Lite via the gigarouter API?

BiRefNet Lite is not yet live. Once it launches, you will call the OpenAI-compatible endpoint with your API key, send an image URL or base64-encoded image data, and receive a segmentation mask in the response.
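Since the endpoint is not published yet, the sketch below only shows how image bytes could be packaged as a base64 data URL ahead of a request. The payload field names ("model", "image") and the function name are assumptions for illustration, not a documented request schema; check the API reference at launch for the real format.

```python
import base64


def build_segmentation_request(image_bytes, mime_type="image/png",
                               model="ZhengPeng7/BiRefNet_lite"):
    """Package raw image bytes into a hypothetical request payload."""
    # Standard base64 data URL: data:<mime>;base64,<payload>
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    # Hypothetical field names; the real schema is not yet published.
    return {"model": model, "image": data_url}
```

The resulting dictionary can be serialized with json.dumps and sent with any HTTP client once the actual endpoint and schema are known.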

What is the input and output format?

Input: an RGB image of any size, which is resized to 1024×1024 internally. Output: a single-channel probability mask, which can be thresholded to obtain a binary mask.
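A minimal sketch of the thresholding step, assuming the probability mask arrives as rows of floats in [0, 1]. The 0.5 cutoff and the function name are assumptions; the best threshold depends on the application.

```python
def threshold_mask(prob_mask, threshold=0.5):
    """Convert a probability mask to a binary mask (1 = foreground)."""
    # Pixels at or above the threshold count as foreground.
    return [[1 if p >= threshold else 0 for p in row] for row in prob_mask]
```

Raising the threshold keeps only high-confidence foreground pixels, while lowering it retains more of the soft edges.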

Is the model free to use under the MIT license?

Yes. The model is released under the MIT license, which permits commercial and personal use provided the copyright and license notice are retained in copies.

How fast is BiRefNet Lite compared to the full BiRefNet?

BiRefNet Lite uses a Swin Transformer Tiny backbone, which makes it faster and more memory-efficient than the full model while maintaining strong segmentation quality.

not yet live

We're benchmarking and onboarding BiRefNet Lite as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related image segmentation models