rate card
Models & pricing
The specialist models we've benchmarked, hosted and priced, followed by the long tail we're onboarding next. Prices are in each model's native unit; realtime is the on-demand rate, and batch is a discounted flexible tier (send X-Tier: batch).
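As a rough sketch of how a client might opt into the batch tier, the snippet below builds a request that carries the `X-Tier: batch` header. Only that header comes from this page; the endpoint URL, the bearer-token auth scheme, and the JSON payload shape are placeholders assumed for illustration.

```python
import json
import urllib.request

# Hypothetical endpoint: the rate card does not state the real URL.
API_URL = "https://api.example.com/v1/models/some-model/infer"


def build_headers(api_key: str, batch: bool = False) -> dict:
    """Return request headers; add X-Tier: batch only for the batch tier."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",    # assumed payload format
    }
    if batch:
        # The one detail taken from the rate card itself.
        headers["X-Tier"] = "batch"
    return headers


def build_request(payload: dict, api_key: str, batch: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chosen tier."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers=build_headers(api_key, batch),
        method="POST",
    )


# Realtime is the default; pass batch=True to request the discounted tier.
realtime_req = build_request({"input": "hello"}, api_key="YOUR_KEY")
batch_req = build_request({"input": "hello"}, api_key="YOUR_KEY", batch=True)
```

Sending either request would be a call to `urllib.request.urlopen(...)`; it is left out here because the endpoint is a placeholder.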
Tasks: all · embeddings · speech-to-text · vision-language · zero-shot image · reranker · image-to-text · text-to-speech · object detection · depth estimation · text generation
Filter: image-to-text (37 matches, all on the roadmap). No live models match this filter; see the roadmap below.
| model | task | tier | realtime | batch |
|---|---|---|---|---|
On the roadmap
37 models. High-demand specialist models with no hosted API. We benchmark and onboard them by task, and each has a page; sign in and tell us which you need to jump the queue.
image-to-text · 37
blip-image-captioning-base · blip-image-captioning-large · PP-OCRv5_server_det · NuExtract3 · UVDoc · trocr-small-handwritten · PP-LCNet_x1_0_doc_ori · manga-ocr-base · granite-vision-3.3-2b · PP-LCNet_x1_0_textline_ori · trocr-base-printed · LightOnOCR-1B-1025 · PP-OCRv5_server_rec · trocr-large-handwritten · kosmos-2-patch14-224 · donut-base · trocr-base-stage1 · nougat-base · trocr-large-printed · PP-OCRv5_mobile_det · trocr-base-handwritten · mgp-str-base · PP-OCRv6_medium_det · PP-OCRv6_medium_rec · PP-OCRv5_mobile_rec · meiki.txt.recognition.v0 · vit-gpt2-image-captioning · latin_PP-OCRv5_mobile_rec · trocr-small-printed · nougat-small · GLM-OCR · NuMarkdown-8B-Thinking · en_PP-OCRv4_mobile_rec · PP-DocLayout_plus-L · PP-OCRv4_mobile_det · PP-DocBlockLayout · Falcon-OCR