rate card
Models & pricing
The specialist models we've benchmarked, hosted and priced, followed by the long tail we're onboarding next. Prices are in each model's native unit (per 1k docs for rerankers, per 1M tokens for embeddings). Realtime is the on-demand rate; batch is a discounted, flexible tier (send X-Tier: batch).
| model | task | tier | realtime | batch |
|---|---|---|---|---|
| cross-encoder/ms-marco-MiniLM-L6-v2 | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| jinaai/jina-reranker-v2-base-multilingual | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| Qwen/Qwen3-Reranker-0.6B | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| Qwen/Qwen3-Embedding-0.6B | embeddings | A | $0.008/1M tok | $0.0025/1M tok |
| BAAI/bge-small-en-v1.5 | embeddings | A | $0.008/1M tok | $0.0025/1M tok |
| BAAI/bge-reranker-base | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| Qwen/Qwen3-Embedding-4B | embeddings | A | $0.008/1M tok | $0.0025/1M tok |
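As a rough illustration of how the native units in the table translate into a bill, here is a minimal sketch of a cost estimator. The rates are copied from the table; everything else is an assumption made for illustration: the function names `estimate_cost` and `tier_headers` are hypothetical, billing is assumed to be strictly pro rata (no minimums or rounding), and `X-Tier: batch` is assumed to be an HTTP request header.

```python
from decimal import Decimal

# Rates copied from the table above: (unit size, realtime USD, batch USD).
# All seven listed models share the same tier-A rates for their task.
RATES = {
    "reranker": (1_000, Decimal("0.008"), Decimal("0.0025")),        # per 1k docs
    "embeddings": (1_000_000, Decimal("0.008"), Decimal("0.0025")),  # per 1M tokens
}


def estimate_cost(task: str, quantity: int, batch: bool = False) -> Decimal:
    """Estimate USD cost for `quantity` docs (reranker) or tokens (embeddings).

    Assumes linear, pro rata billing; the page does not state minimums
    or rounding rules, so treat the result as an approximation.
    """
    unit, realtime_rate, batch_rate = RATES[task]
    rate = batch_rate if batch else realtime_rate
    return rate * quantity / unit


def tier_headers(batch: bool) -> dict[str, str]:
    """Build the tier-selection headers (hypothetical helper).

    Only the `X-Tier: batch` pair comes from the page; treating it as an
    HTTP request header, and sending nothing for realtime, are assumptions.
    """
    return {"X-Tier": "batch"} if batch else {}


if __name__ == "__main__":
    # Reranking 250,000 documents, realtime versus batch.
    print("rerank realtime:", estimate_cost("reranker", 250_000))
    print("rerank batch:   ", estimate_cost("reranker", 250_000, batch=True))
    # Embedding 50 million tokens on the batch tier.
    print("embed batch:    ", estimate_cost("embeddings", 50_000_000, batch=True))
    print("headers:        ", tier_headers(batch=True))
```

`Decimal` is used instead of floats so that small per-unit rates such as $0.0025 do not pick up binary rounding noise when multiplied by large quantities.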
On the roadmap
509 models
High-demand specialist models with no hosted API. We benchmark and onboard them by task; each has a page. Sign in and tell us which you need to jump the queue.
embeddings · 88
nomic-embed-text-v1.5, nomic-embed-text-v1, w2v-bert-2.0, all-MiniLM-L6-v2, jina-embeddings-v3, granite-embedding-small-english-r2, bge-base-en-v1.5, wavlm-large, all-MiniLM-L6-v2-onnx, jina-embeddings-v2-small-en, gte-multilingual-base, gte-large-en-v1.5, Qwen3-VL-Embedding-8B, jina-embeddings-v5-text-nano, SFR-Embedding-2_R, nomic-embed-text-v2-moe, indobert-base-p1, gte-Qwen2-1.5B-instruct, wavlm-base-plus, bm25, llama-nemotron-embed-1b-v2, Qwen3-Embedding-4B-W4A16-G128, gte-base-en-v1.5, e5-mistral-7b-instruct, jina-embeddings-v2-base-en, snowflake-arctic-embed-m-v2.0, LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised, gte-Qwen2-7B-instruct, jina-embeddings-v5-omni-small, stella_en_400M_v5, jina-clip-v1, e5-v, snowflake-arctic-embed-m-long, granite-embedding-english-r2, F2LLM-v2-4B, stella_en_1.5B_v5, cde-small-v2, jina-embeddings-v5-omni-nano, LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse, NV-Embed-v2, SFR-Embedding-Mistral, Linq-Embed-Mistral, gme-Qwen2-VL-2B-Instruct, mmlw-e5-large, inf-retriever-v1-1.5b, F2LLM-v2-1.7B, jina-embedding-b-en-v1, MiniCPM-Embedding, opensearch-neural-sparse-encoding-doc-v3-gte, LCO-Embedding-Omni-3B, LCO-Embedding-Omni-7B, bilingual-embedding-base, gme-Qwen2-VL-7B-Instruct, bge-en-icl, F2LLM-v2-8B, KaLM-embedding-multilingual-mini-instruct-v2, KaLM-embedding-multilingual-mini-instruct-v1, jina-embedding-s-en-v1, BidirLM-Omni-2.5B-Embedding, nomic-embed-text-v1-unsupervised, LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse, LLM2Vec-Sheared-LLaMA-mntp-supervised, mmlw-e5-base, inf-retriever-v1, udever-bloom-560m, udever-bloom-1b1, LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised, nomic-embed-text-v1-ablated, lodestone-base-4096-v1, Ivysaur, embedder-100p, slx-v0.1, b1ade-embed, udever-bloom-3b, Bulbasaur, cde-small-v1, gte-micro-v4, Zeta-Alpha-E5-Mistral, DenseOn-unsupervised, LENS-d8000, LENS-d4000, ColBERT-Zero-supervised, ColBERT-Zero-unsupervised, gte-Qwen1.5-7B-instruct, cadet-embed-base-v1, LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse, LLM2Vec-Llama-2-7b-chat-hf-mntp-supervised, static-retrieval-mrl-en-v1
speech-to-text · 53
speaker-diarization-3.1, whisperkit-coreml, whisper-base, wav2vec2-large-xlsr-53-japanese, wav2vec2-large-xlsr-53-polish, wav2vec2-large-xlsr-53-dutch, wav2vec2-indonesian-javanese-sundanese, speaker-diarization-community-1, wav2vec2-large-xlsr-53-arabic, wav2vec2-large-xlsr-53-hungarian, whisper-small, mms-300m-1130-forced-aligner, wav2vec2-large-xlsr-53-portuguese, wav2vec2-large-xlsr-53-russian, romanian-wav2vec2, wav2vec2-large-xlsr-53-telugu, wav2vec2-large-xlsr-53-persian, wav2vec2-large-voxrex-swedish, wav2vec2-large-xls-r-300m-Urdu, Wav2Vec2-large-xlsr-hindi, voice-activity-detection, Voxtral-Mini-4B-Realtime-2602, wav2vec2-xls-r-300m-hebrew, wav2vec2-xls-r-300m-mixed, wav2vec2-large-xlsr-53-th, whisper-tiny, wav2vec2-large-xlsr-53-chinese-zh-cn, parakeet-tdt-0.6b-v2, wav2vec2-xls-r-300m-bengali, faster-whisper-base, Qwen3-ASR-1.7B, Qwen3-ASR-0.6B, parakeet-ctc-1.1b, Phi-4-multimodal-instruct, GLM-ASR-Nano-2512, whisper-large-v2, whisper-medium.en, whisper-small.en, moonshine-base, parakeet-rnnt-0.6b, whisper-large, whisper-base.en, parakeet-ctc-0.6b, moonshine-streaming-medium, moonshine-streaming-small, canary-1b-flash, distil-large-v3.5, parakeet-rnnt-1.1b, ARK-ASR-3B, ARK-ASR-0.6B, MOSS-Transcribe-preview-2B, pingala-v1-universal, stt-2.6b-en
vision-language · 52
Qwen2.5-VL-7B-Instruct, Qwen3.6-35B-A3B-FP8, Qwen2.5-VL-3B-Instruct, gemma-4-26B-A4B-it-AWQ-4bit, Qwen3.6-27B-FP8, Qwen3-VL-4B-Instruct, Qwen2-VL-2B-Instruct, DeepSeek-OCR-2, llava-1.5-7b-hf, gemma-4-31B-it-FP8-block, Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive, Florence-2-base, Qwen3.5-0.8B, Qwen3-VL-2B-Instruct, gemma-4-26B-A4B-it-FP8-Dynamic, Qwen3.6-35B-A3B-AWQ-4bit, Qwen2-VL-7B-Instruct, Qwen2-VL-7B-Instruct-AWQ, Qwen3.6-27B-MTP-GGUF, Qwen2.5-VL-7B-Instruct-AWQ, moondream2, gemma-4-26B-A4B-it-GGUF, InternVL2-2B, Qwythos-9B-Claude-Mythos-5-1M-GGUF, chandra-ocr-2, Unlimited-OCR, Qwen3.6-35B-A3B-GGUF, Qwen3.6-35B-A3B-MTP-GGUF, blip2-opt-2.7b, Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF, dots.mocr, surya-ocr-2, dots.ocr, Qianfan-OCR, LightOnOCR-2-1B, chandra, granite-vision-4.1-4b, Gemma4-12B-QAT-Uncensored-HauhauCS-Balanced, MinerU2.5-Pro-2605-1.2B, Qwopus3.6-35B-A3B-Coder-MTP-GGUF, Gemma4-26B-A4B-QAT-Uncensored-HauhauCS-Balanced-MTP, Unlimited-OCR-GGUF, MinerU2.5-2509-1.2B, Infinity-Parser2-Flash, UI-Venus-1.5-8B, UI-TARS-2B-SFT, KDL-Frontier-Parser-nano, GTA1-7B, UI-Venus-1.5-2B, UI-TARS-7B-SFT, UI-Venus-Ground-7B, UI-AGILE-3B
zero-shot image · 49
clip-vit-base-patch32, clip-vit-large-patch14, CLIP-ViT-B-32-laion2B-s34B-b79K, clip-vit-large-patch14-336, PickScore_v1, fashion-clip, siglip-so400m-patch14-384, clip-vit-base-patch16, siglip2-giant-opt-patch16-384, siglip-base-patch16-224, siglip2-base-patch16-naflex, siglip2-so400m-patch16-naflex, siglip2-so400m-patch14-384, marqo-fashionSigLIP, CLIP-convnext_base_w-laion2B-s13B-b82K-augreg, BiomedCLIP-PubMedBERT_256-vit_base_patch16_224, siglip2-so400m-patch16-256, CLIP-ViT-H-14-laion2B-s32B-b79K, siglip2-base-patch16-224, CLIP-ViT-B-16-laion2B-s34B-b88K, CLIP-ViT-L-14-laion2B-s32B-b82K, siglip2-so400m-patch16-512, TinyCLIP-ViT-8M-16-Text-3M-YFCC15M, PE-Core-L14-336, ViT-SO400M-14-SigLIP-384, siglip2-base-patch16-256, chinese-clip-vit-base-patch16, one-align, MobileCLIP-S2-OpenCLIP, ViT-B-16-SigLIP2-256, clip-vit-base-patch32, siglip2-base-patch16-512, CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup, ViT-SO400M-14-SigLIP, AltCLIP, ViT-B-16-SigLIP, siglip2-large-patch16-256, CLIP-ViT-bigG-14-laion2B-39B-b160k, CLIP-ViT-L-14-DataComp.XL-s13B-b90K, CLIP-ViT-B-32-DataComp.XL-s13B-b90K, siglip-large-patch16-384, CLIP-ViT-B-16-DataComp.XL-s13B-b90K, siglip-large-patch16-256, CLIP-ViT-g-14-laion2B-s34B-b88K, siglip-base-patch16-256, siglip-base-patch16-256-multilingual, siglip-base-patch16-384, align-base, siglip-base-patch16-512
text-to-speech · 37
XTTS-v2, Qwen3-TTS-12Hz-1.7B-CustomVoice, Qwen3-TTS-12Hz-0.6B-CustomVoice, MOSS-TTS, OmniVoice, F5-TTS, indic-parler-tts, Qwen3-TTS-12Hz-1.7B-VoiceDesign, VoxCPM2, VibeVoice-Realtime-0.5B, Kokoro-82M-v1.0-ONNX, Qwen3-TTS-12Hz-0.6B-Base, s2-pro, csm-1b, VibeVoice-1.5B, MOSS-TTS-v1.5, higgs-tts-2-3b-base, mms-tts-eng, MeloTTS-English, VieNeu-TTS-v3-Turbo, hf-seamless-m4t-medium, neutts-nano, E2-TTS, higgs-tts-3-4b, F5-TTS_RUSSIAN, 3b-de-ft-research_release, MOSS-TTS-Nano-100M, speecht5_tts, Kimi-Audio-7B-Instruct, VieNeu-TTS-v2, Voxtral-4B-TTS-2603, svara-tts-v1, MeloTTS-Spanish, MeloTTS-Korean, supertonic-3, higgs-audio-v3-tts-4b-transformers, sarashina2.2-tts
image-to-text · 37
blip-image-captioning-base, blip-image-captioning-large, PP-OCRv5_server_det, NuExtract3, UVDoc, trocr-small-handwritten, PP-LCNet_x1_0_doc_ori, manga-ocr-base, granite-vision-3.3-2b, PP-LCNet_x1_0_textline_ori, trocr-base-printed, LightOnOCR-1B-1025, PP-OCRv5_server_rec, trocr-large-handwritten, kosmos-2-patch14-224, donut-base, trocr-base-stage1, nougat-base, trocr-large-printed, PP-OCRv5_mobile_det, trocr-base-handwritten, mgp-str-base, PP-OCRv6_medium_det, PP-OCRv6_medium_rec, PP-OCRv5_mobile_rec, meiki.txt.recognition.v0, vit-gpt2-image-captioning, latin_PP-OCRv5_mobile_rec, trocr-small-printed, nougat-small, GLM-OCR, NuMarkdown-8B-Thinking, en_PP-OCRv4_mobile_rec, PP-DocLayout_plus-L, PP-OCRv4_mobile_det, PP-DocBlockLayout, Falcon-OCR
object detection · 36
table-transformer-structure-recognition, table-transformer-detection, yolos-small, PP-DocLayoutV3_safetensors, rtdetr_v2_r50vd, rtdetr_r50vd_coco_o365, table-transformer-structure-recognition-v1.1-all, detr-doc-table-detection, yolov8m-table-extraction, yolos-tiny, rtdetr_r101vd_coco_o365, rtdetr_v2_r18vd, Anzhcs_YOLOs, rtdetr_r50vd, stockmarket-pattern-detection-yolov8, yolov11-license-plate-detection, yolov5m-license-plate, yolos-fashionpedia, conditional-detr-resnet-50, rtdetr_r18vd_coco_o365, YOLOv8, YOLO-Face-Person-Detector, YOLO11, rtdetr_r18vd, YOLO26, deformable-detr, yolov8_animeface, detr-resnet-101-dc5, rtdetr_v2_r101vd, detr-resnet-50, conditional-detr-50-signature-detector, best-comic-panel-detection, locate-anything.cpp-gguf, dfine-small-coco, yoloe, yolo11-document-layout
depth estimation · 36
Depth-Anything-V2-Small-hf, DA3METRIC-LARGE, depth-anything-large-hf, dpt-hybrid-midas, DA3NESTED-GIANT-LARGE-1.1, Depth-Anything-V2-Large-hf, Distill-Any-Depth-Large-hf, DA3-LARGE, zoedepth-nyu-kitti, DA3-GIANT, marigold-depth-v1-0, dpt-large, DA3-LARGE-1.1, DA3NESTED-GIANT-LARGE, depth-anything-base-hf, DA3MONO-LARGE, Depth-Anything-V2-Large, DA3-SMALL, depth-anything-v2-small, DA3-GIANT-1.1, DepthPro-hf, depth-anything-small-hf, Depth-Anything-V2-Metric-Indoor-Small-hf, depth_anything_vitl14, Depth-Anything-V2-Small, Depth-Anything-V2-Metric-Indoor-Large-hf, lotus-depth-g-v1-0, DepthCrafter, Depth-Anything-V2-Base, depth-anything-small-hf, DepthPro, marigold-depth-v1-1, Depth-Anything-V2-Metric-Outdoor-Large-hf, depth_anything_vits14, depth-anything.cpp-gguf, marigold-depth-lcm-v1-0
text generation · 34
opt-125m, gpt2, tiny-Qwen2ForCausalLM-2.5, deepseek-v4-gguf, Qwen3.6-35B-A3B-NVFP4, gemma-3-270m, dolphin-2.9.1-yi-1.5-34b, gemma-4-12B-coder-fable5-composer2.5-v1-GGUF, VLM2Vec-Full, gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF, Ornith-1.0-35B-GGUF, Ornith-1.0-9B-GGUF, GLM-5.2-GGUF, Qwen-AgentWorld-35B-A3B-GGUF, DeepSeek-V4-Flash-GGUF, Ornith-1.0-35B, GLM-5.2-NVFP4, Qwen3.6-27B-NVFP4, Ornith-1.0-397B-FP8, Ornith-1.0-9B, Ornith-1.0-35B-FP8, Huihui-Qwythos-9B-Claude-Mythos-5-1M-abliterated-GGUF, Qwen-AgentWorld-35B-A3B, DeepSeek-V4-Flash-DSpark, LFM2.5-230M, Ornith-1.0-35B-MTP-APEX-GGUF, Ornith-1.0-9B-MTP-GGUF, Nemotron-Labs-TwoTower-30B-A3B-Base-BF16, DeepSeek-V4-Pro-DSpark, Ornith-1.0-35B-AEON-Ultimate-Uncensored-NVFP4, Ornith-1.0-397B, Huihui-GLM-5.2-abliterated-GGUF, Agents-A1, VLM2Vec-LoRA
reranker · 34
ms-marco-MiniLM-L4-v2, gte-reranker-modernbert-base, ms-marco-MiniLM-L12-v2, Qwen3-Reranker-4B, mmarco-mMiniLMv2-L12-H384-v1, ms-marco-MiniLM-L2-v2, Qwen3-Reranker-8B, jina-reranker-v3, mxbai-rerank-xsmall-v1, Qwen3-VL-Reranker-8B, japanese-reranker-cross-encoder-small-v1, Qwen3-VL-Reranker-2B, stsb-roberta-large, ms-marco-TinyBERT-L2-v2, ruri-v3-reranker-310m, mxbai-rerank-base-v1, Qwen3-Reranker-0.6B-seq-cls, llama-nemotron-rerank-1b-v2, gte-multilingual-reranker-base, crossencoder-camembert-base-mmarcoFR, stsb-roberta-base, llama-nemotron-rerank-vl-1b-v2, mxbai-rerank-base-v2, stsb-distilroberta-base, bge-reranker-base, mxbai-rerank-large-v1, japanese-reranker-cross-encoder-xsmall-v1, mxbai-rerank-large-v2, ctxl-rerank-v2-instruct-multilingual-1b, japanese-reranker-xsmall-v2, jina-reranker-v1-turbo-en, zerank-2-reranker, Qwen3-Reranker-0.6B-Q8_0-GGUF, qnli-electra-base
specialist model · 20
electra-base-discriminator, adetailer, colbertv2.0, contriever, wespeaker-voxceleb-resnet34-LM, unidepth-v2-vitl14, stable-diffusion-v1-5-archive, Anima, Qwen3.5-9B-Uncensored-HauhauCS-Aggressive, Voxtral-Mini-3B-2507, blip-itm-base-coco, BugTraceAI-CORE-Ultra-27B-Q6, EditAnything, blip-itm-large-coco, Infinity-Parser-7B, blip-itm-base-flickr, ebind-full, blip-itm-large-flickr, Supertron-embedding-190M, voyage-3-m-exp
image segmentation · 9
time series forecasting · 3
visual document retrieval · 3
mask generation · 2
image generation · 2
text-gen · 2
image classification · 1
audio classification · 1
fill mask · 1
voice activity detection · 1
image-to-video · 1
image feature extraction · 1
text-to-video · 1
audio text to text · 1
grounding · 1