rate card
Models & pricing
The specialist models we've benchmarked, hosted and priced, with the long tail we're onboarding next listed below. Prices are in each model's native unit. Realtime is the on-demand rate; batch is a discounted flexible tier (send X-Tier: batch).
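As a rough illustration of opting into the batch tier, the sketch below builds (but does not send) a request carrying the X-Tier: batch header. Only the header name and value come from this page. The endpoint URL, the bearer-token auth scheme, the payload fields and the `build_request` helper are all hypothetical placeholders, so check the API reference for the real ones.

```python
import json
import urllib.request

# Hypothetical endpoint: a placeholder, not taken from this rate card.
API_URL = "https://api.example.com/v1/infer"


def build_request(payload: dict, api_key: str, batch: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request, opting into batch pricing if asked."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
    }
    if batch:
        # The one detail stated on the rate card: send X-Tier: batch.
        headers["X-Tier"] = "batch"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Payload shape is an assumption made for illustration only.
payload = {"model": "example-model", "input": "hello"}
realtime_req = build_request(payload, api_key="YOUR_KEY")
batch_req = build_request(payload, api_key="YOUR_KEY", batch=True)
```

Leaving the header off keeps the request on the realtime rate, so the same helper covers both tiers. Note that `urllib` stores header names with only the first letter capitalized (`X-tier`); HTTP header names are case-insensitive, so this does not change what the server sees.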
all · embeddings · speech-to-text · vision-language · zero-shot image · reranker · image-to-text · text-to-speech · object detection · depth estimation · text generation
52 matches in vision-language, all on the roadmap.
No live models match this filter; see the roadmap below.
| model | task | tier | realtime | batch |
|---|---|---|---|---|
On the roadmap
52 models
High-demand specialist models with no hosted API. We benchmark and onboard them by task; each has a page. Sign in and tell us which you need to jump the queue.
vision-language · 52
Qwen2.5-VL-7B-Instruct
Qwen3.6-35B-A3B-FP8
Qwen2.5-VL-3B-Instruct
gemma-4-26B-A4B-it-AWQ-4bit
Qwen3.6-27B-FP8
Qwen3-VL-4B-Instruct
Qwen2-VL-2B-Instruct
DeepSeek-OCR-2
llava-1.5-7b-hf
gemma-4-31B-it-FP8-block
Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive
Florence-2-base
Qwen3.5-0.8B
Qwen3-VL-2B-Instruct
gemma-4-26B-A4B-it-FP8-Dynamic
Qwen3.6-35B-A3B-AWQ-4bit
Qwen2-VL-7B-Instruct
Qwen2-VL-7B-Instruct-AWQ
Qwen3.6-27B-MTP-GGUF
Qwen2.5-VL-7B-Instruct-AWQ
moondream2
gemma-4-26B-A4B-it-GGUF
InternVL2-2B
Qwythos-9B-Claude-Mythos-5-1M-GGUF
chandra-ocr-2
Unlimited-OCR
Qwen3.6-35B-A3B-GGUF
Qwen3.6-35B-A3B-MTP-GGUF
blip2-opt-2.7b
Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF
dots.mocr
surya-ocr-2
dots.ocr
Qianfan-OCR
LightOnOCR-2-1B
chandra
granite-vision-4.1-4b
Gemma4-12B-QAT-Uncensored-HauhauCS-Balanced
MinerU2.5-Pro-2605-1.2B
Qwopus3.6-35B-A3B-Coder-MTP-GGUF
Gemma4-26B-A4B-QAT-Uncensored-HauhauCS-Balanced-MTP
Unlimited-OCR-GGUF
MinerU2.5-2509-1.2B
Infinity-Parser2-Flash
UI-Venus-1.5-8B
UI-TARS-2B-SFT
KDL-Frontier-Parser-nano
GTA1-7B
UI-Venus-1.5-2B
UI-TARS-7B-SFT
UI-Venus-Ground-7B
UI-AGILE-3B