rate card

Models & pricing

The specialist models we've benchmarked, hosted, and priced, with the long tail we're onboarding next listed below. Prices are in each model's native unit. Realtime is the on-demand rate; batch is a discounted, flexible tier (send the X-Tier: batch header).
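As a rough illustration of opting into the batch tier, the sketch below builds (but does not send) a request carrying the X-Tier: batch header. Only the header name and value come from this page; the endpoint URL, the payload shape, and the helper names `build_headers` and `build_request` are hypothetical placeholders, so check them against the actual API reference.

```python
import json
import urllib.request

# Hypothetical endpoint: this page does not state the real URL.
API_URL = "https://api.example.com/v1/infer"


def build_headers(batch: bool = False) -> dict:
    """Return request headers; add X-Tier: batch only for the batch tier."""
    headers = {"Content-Type": "application/json"}
    if batch:
        # The one detail taken from the page: batch pricing is selected
        # by sending this header. Realtime requests omit it.
        headers["X-Tier"] = "batch"
    return headers


def build_request(payload: dict, batch: bool = False) -> urllib.request.Request:
    """Build a POST request object without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL, data=body, headers=build_headers(batch), method="POST"
    )


# Payload shape is an assumption made for illustration only.
request = build_request({"input": "hello"}, batch=True)
```

Leaving the header off keeps the request on the realtime (on-demand) rate, so a single boolean is enough to switch tiers per call.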


No live vision-language models match yet; all 52 matches are on the roadmap below.

On the roadmap

52 models

High-demand specialist models with no hosted API. We benchmark and onboard them by task, and each has a page; sign in and tell us which you need to jump the queue.

vision-language · 52

Qwen2.5-VL-7B-Instruct
Qwen3.6-35B-A3B-FP8
Qwen2.5-VL-3B-Instruct
gemma-4-26B-A4B-it-AWQ-4bit
Qwen3.6-27B-FP8
Qwen3-VL-4B-Instruct
Qwen2-VL-2B-Instruct
DeepSeek-OCR-2
llava-1.5-7b-hf
gemma-4-31B-it-FP8-block
Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive
Florence-2-base
Qwen3.5-0.8B
Qwen3-VL-2B-Instruct
gemma-4-26B-A4B-it-FP8-Dynamic
Qwen3.6-35B-A3B-AWQ-4bit
Qwen2-VL-7B-Instruct
Qwen2-VL-7B-Instruct-AWQ
Qwen3.6-27B-MTP-GGUF
Qwen2.5-VL-7B-Instruct-AWQ
moondream2
gemma-4-26B-A4B-it-GGUF
InternVL2-2B
Qwythos-9B-Claude-Mythos-5-1M-GGUF
chandra-ocr-2
Unlimited-OCR
Qwen3.6-35B-A3B-GGUF
Qwen3.6-35B-A3B-MTP-GGUF
blip2-opt-2.7b
Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF
dots.mocr
surya-ocr-2
dots.ocr
Qianfan-OCR
LightOnOCR-2-1B
chandra
granite-vision-4.1-4b
Gemma4-12B-QAT-Uncensored-HauhauCS-Balanced
MinerU2.5-Pro-2605-1.2B
Qwopus3.6-35B-A3B-Coder-MTP-GGUF
Gemma4-26B-A4B-QAT-Uncensored-HauhauCS-Balanced-MTP
Unlimited-OCR-GGUF
MinerU2.5-2509-1.2B
Infinity-Parser2-Flash
UI-Venus-1.5-8B
UI-TARS-2B-SFT
KDL-Frontier-Parser-nano
GTA1-7B
UI-Venus-1.5-2B
UI-TARS-7B-SFT
UI-Venus-Ground-7B
UI-AGILE-3B