skip to content
gigarouter gigarouter
tasks / vision-language

Hosted vision-language models

52 models · 0 live as APIs · benchmarked & compared

Vision-language models process both images and text, enabling tasks such as extracting structured data from scanned documents, answering questions about photographs, and generating captions for accessibility. For example, deepseek-ai/DeepSeek-OCR-2 is specialised for optical character recognition, while series like Qwen/Qwen2.5-VL-7B-Instruct and Qwen/Qwen2-VL-2B-Instruct support visual question answering and image-to-text generation.

  • Document digitisation and invoice parsing
  • Automated content moderation on visual platforms
  • Visual search and retrieval-augmented generation (RAG) pipelines

In production, these models are often integrated into RAG workflows or multimodal chatbots. Choosing among the 32 models listed here involves balancing latency, accuracy, and cost: larger architectures such as Qwen/Qwen3.6-35B-A3B-FP8 yield higher quality on complex reasoning but require more compute, while quantised or smaller models like cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit or Qwen/Qwen3-VL-4B-Instruct serve well at lower throughputs. For most call volumes, calling a hosted API eliminates infrastructure overhead and enables elastic scaling — benefits gigarouter provides through its benchmarked, OpenAI-compatible endpoints. (Currently 0 models are live; the remainder are being onboarded.)

compare

modelparamsdownloads/mopricestatus
Qwen/Qwen2.5-VL-7B-Instruct8292.2M9.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen3.6-35B-A3B-FP835953.9M6.2M~$1.341 / 1k imagescoming soon
Qwen/Qwen2.5-VL-3B-Instruct3754.6M5.3M~$0.626 / 1k imagescoming soon
cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit26554.3M5.1M~$1.341 / 1k imagescoming soon
Qwen/Qwen3.6-27B-FP827782.9M4.9M~$1.341 / 1k imagescoming soon
Qwen/Qwen3-VL-4B-Instruct4437.8M3.7M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-2B-Instruct2209M3.6M~$0.626 / 1k imagescoming soon
deepseek-ai/DeepSeek-OCR-23389.1M3.3M~$0.626 / 1k imagescoming soon
llava-hf/llava-1.5-7b-hf7063.4M3.2M~$1.341 / 1k imagescoming soon
RedHatAI/gemma-4-31B-it-FP8-block31274.9M3.2M~$1.341 / 1k imagescoming soon
HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-3Mat launchcoming soon
microsoft/Florence-2-base231.6M2.6M~$0.094 / 1k imagescoming soon
Qwen/Qwen3.5-0.8B873.4M2.5M~$0.235 / 1k imagescoming soon
Qwen/Qwen3-VL-2B-Instruct2127.5M2.1M~$0.626 / 1k imagescoming soon
RedHatAI/gemma-4-26B-A4B-it-FP8-Dynamic26560.9M2M~$1.341 / 1k imagescoming soon
cyankiwi/Qwen3.6-35B-A3B-AWQ-4bit35951.8M1.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-7B-Instruct8291.4M1.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-7B-Instruct-AWQ8291.4M1.8M~$1.341 / 1k imagescoming soon
unsloth/Qwen3.6-27B-MTP-GGUF-1.8Mat launchcoming soon
Qwen/Qwen2.5-VL-7B-Instruct-AWQ8292.2M1.7M~$1.341 / 1k imagescoming soon
vikhyatk/moondream21927.2M1.6M~$0.626 / 1k imagescoming soon
unsloth/gemma-4-26B-A4B-it-GGUF-1.5Mat launchcoming soon
OpenGVLab/InternVL2-2B2205.8M1.5M~$0.626 / 1k imagescoming soon
empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF-1.4Mat launchcoming soon
datalab-to/chandra-ocr-25295.6M1.3M~$1.341 / 1k imagescoming soon
baidu/Unlimited-OCR3336.1M885K~$0.626 / 1k imagescoming soon
unsloth/Qwen3.6-35B-A3B-GGUF-874.6Kat launchcoming soon
unsloth/Qwen3.6-35B-A3B-MTP-GGUF-734.7Kat launchcoming soon
Salesforce/blip2-opt-2.7b3744.8M669.8K~$0.626 / 1k imagescoming soon
DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF-519.4Kat launchcoming soon
rednote-hilab/dots.mocr3039.2M518.9K~$0.626 / 1k imagescoming soon
datalab-to/surya-ocr-2686.2M407K~$0.235 / 1k imagescoming soon
rednote-hilab/dots.ocr3039.2M278.6K~$0.626 / 1k imagescoming soon
baidu/Qianfan-OCR4741.4M258.6K~$1.341 / 1k imagescoming soon
lightonai/LightOnOCR-2-1B1005.6M170.5K~$0.235 / 1k imagescoming soon
datalab-to/chandra8767.1M138.4K~$1.341 / 1k imagescoming soon
ibm-granite/granite-vision-4.1-4b3997.2M111K~$0.626 / 1k imagescoming soon
HauhauCS/Gemma4-12B-QAT-Uncensored-HauhauCS-Balanced-71.7Kat launchcoming soon
opendatalab/MinerU2.5-Pro-2605-1.2B1156M48.8K~$0.235 / 1k imagescoming soon
Jackrong/Qwopus3.6-35B-A3B-Coder-MTP-GGUF-44.8Kat launchcoming soon
HauhauCS/Gemma4-26B-A4B-QAT-Uncensored-HauhauCS-Balanced-MTP-44.5Kat launchcoming soon
sahilchachra/Unlimited-OCR-GGUF-43.7Kat launchcoming soon
opendatalab/MinerU2.5-2509-1.2B1156M21.2K~$0.235 / 1k imagescoming soon
infly/Infinity-Parser2-Flash2213.2M16.6K~$0.626 / 1k imagescoming soon
inclusionAI/UI-Venus-1.5-8B8767.1M5K~$1.341 / 1k imagescoming soon
ByteDance-Seed/UI-TARS-2B-SFT2442.4M2.5K~$0.626 / 1k imagescoming soon
KDLAI/KDL-Frontier-Parser-nano1156M2.3K~$0.235 / 1k imagescoming soon
Salesforce/GTA1-7B8292.2M1.3K~$1.341 / 1k imagescoming soon
inclusionAI/UI-Venus-1.5-2B2438.7M946~$0.626 / 1k imagescoming soon
ByteDance-Seed/UI-TARS-7B-SFT8291.4M737~$1.341 / 1k imagescoming soon
inclusionAI/UI-Venus-Ground-7B8292.2M231~$1.341 / 1k imagescoming soon
KDEGroup/UI-AGILE-3B3754.6M5~$0.626 / 1k imagescoming soon