CLIP ViT B 16 DataComp.XL s13B b90K
laion/CLIP-ViT-B-16-DataComp.XL-s13B-b90K
published May 2023 · updated Sep 2023
A popular open zero-shot image classification model, with 37.3K downloads a month. Gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About This Model
CLIP-ViT-B-16-DataComp.XL-s13B-b90K is a CLIP model for zero-shot image classification and image-text retrieval, with no task-specific fine-tuning required. It uses a Vision Transformer (ViT-B/16) architecture and was trained with OpenCLIP on the DataComp-1B dataset.
Model Overview
This model was trained on 1.4 billion image-text pairs from the DataComp-1B dataset, a large-scale, uncurated corpus crawled from public internet sources. The training procedure and model architecture follow the standard CLIP approach, enabling the model to compute alignment scores between images and arbitrary text prompts. This design supports zero-shot classification, image retrieval, and text retrieval across a wide range of domains without requiring labeled training data for each new task.
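To make the alignment-score idea concrete, here is a minimal sketch of zero-shot classification from embeddings. It assumes the image and each text prompt have already been encoded into vectors; the toy 3-dimensional embeddings, the helper names, and the logit_scale value of 100.0 are illustrative assumptions, not outputs or settings taken from this model.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_scores(image_embedding, text_embeddings, logit_scale=100.0):
    """Turn image-text cosine similarities into per-label probabilities.

    logit_scale is an assumed temperature; CLIP learns this value in training.
    """
    logits = [logit_scale * cosine(image_embedding, t) for t in text_embeddings]
    return softmax(logits)

# Toy embeddings standing in for real encoder outputs (assumption).
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
image_embedding = [0.8, 0.2, 0.1]

scores = zero_shot_scores(image_embedding, text_embeddings)
best = labels[scores.index(max(scores))]
print(best)
```

Because the labels are ordinary text prompts, changing the set of classes only means encoding a different list of prompts; no retraining is involved.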
Performance
On the ImageNet-1k benchmark, the model achieves a zero-shot top-1 accuracy of 73.5%. Evaluation was conducted across a suite of 38 diverse downstream datasets using the DataComp evaluation pipeline and the LAION CLIP Benchmark. These results position the model as a strong general-purpose zero-shot image classifier, particularly suited for tasks where fine-tuning is not feasible or where rapid adaptation to new categories is required.
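For readers unfamiliar with the metric, the sketch below shows how top-1 accuracy is computed: a prediction counts as correct only when the highest-scoring label matches the ground truth. The scores and labels are made-up toy values, and top1_accuracy is an illustrative helper rather than part of the DataComp evaluation pipeline or the LAION CLIP Benchmark.

```python
def top1_accuracy(score_rows, true_indices):
    """Fraction of examples whose highest-scoring label is the true label."""
    if not score_rows or len(score_rows) != len(true_indices):
        raise ValueError("need one true index per non-empty row of scores")
    correct = 0
    for scores, truth in zip(score_rows, true_indices):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == truth:
            correct += 1
    return correct / len(true_indices)

# Toy per-image label scores and ground-truth label indices (assumption).
toy_scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
]
toy_truth = [0, 1, 1, 2]

accuracy = top1_accuracy(toy_scores, toy_truth)
print(accuracy)
```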
Hosted API
Gigarouter hosts this model as a managed, OpenAI-compatible API. Developers can call the endpoint with an image and a set of candidate text labels to receive classification scores, or perform image-text similarity searches. The API eliminates the need for local infrastructure, CUDA dependencies, or model management, providing direct access to the model’s zero-shot capabilities.
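As a rough illustration of what a request carrying an image and candidate labels might look like, the sketch below builds a JSON body in Python. The request schema is not documented here, so every field name ("model", "image", "candidate_labels"), the base64 image encoding, and the build_classification_request helper are hypothetical; check the actual API reference before relying on any of them.

```python
import base64
import json

MODEL_ID = "laion/CLIP-ViT-B-16-DataComp.XL-s13B-b90K"

def build_classification_request(image_bytes, candidate_labels, model_id=MODEL_ID):
    """Build a JSON request body. All field names are hypothetical assumptions."""
    if not candidate_labels:
        raise ValueError("at least one candidate label is required")
    payload = {
        "model": model_id,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "candidate_labels": list(candidate_labels),
    }
    return json.dumps(payload)

# Placeholder bytes standing in for a real image file (assumption).
body = build_classification_request(b"\x89PNG placeholder bytes", ["cat", "dog"])
print(body)
```

The sketch stops at building the body and makes no network call, since the endpoint URL and authentication scheme are not given in this page.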
We're benchmarking and onboarding CLIP ViT B 16 DataComp.XL s13B b90K as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.