rate card
Models & pricing
The specialist models we've benchmarked, hosted and priced, followed by the long tail we're onboarding next. Prices are in each model's native unit. Realtime is the on-demand rate; batch is a discounted, flexible tier (send X-Tier: batch).
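As a rough illustration of selecting the batch tier, the sketch below builds (but does not send) a request that carries `X-Tier: batch`. It assumes `X-Tier` is an HTTP request header. The endpoint URL, the JSON body shape, and the model name are placeholders that do not come from this page, and authentication is omitted.

```python
import json
import urllib.request

# Hypothetical endpoint: this page does not state the API URL.
API_URL = "https://api.example.com/v1/infer"


def build_request(model: str, payload: dict, batch: bool = False) -> urllib.request.Request:
    """Build a POST request; add X-Tier: batch when the batch tier is wanted."""
    headers = {"Content-Type": "application/json"}
    if batch:
        # The only detail taken from the page: batch pricing is selected
        # by sending "X-Tier: batch".
        headers["X-Tier"] = "batch"
    # Assumed body shape, for illustration only.
    body = json.dumps({"model": model, "input": payload}).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# "some-model" is a placeholder name, not a model listed on this page.
req = build_request("some-model", {"text": "hello"}, batch=True)

# urllib stores header names capitalized as "X-tier"; HTTP header names
# are case-insensitive, so the server sees the same header.
print(req.get_header("X-tier"))
```

Leaving `batch` at its default omits the header, which under these assumptions would correspond to the realtime (on-demand) rate.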
all · embeddings · speech-to-text · vision-language · zero-shot image · reranker · image-to-text · text-to-speech · object detection · depth estimation · text generation
Filter: text-to-speech (37 matches).
No live models match this filter; see the roadmap below.
| model | task | tier | realtime | batch |
|---|---|---|---|---|
On the roadmap
37 models
High-demand specialist models with no hosted API. We benchmark and onboard them by task, and each has a page. To jump the queue, sign in and tell us which ones you need.
text-to-speech · 37
XTTS-v2
Qwen3-TTS-12Hz-1.7B-CustomVoice
Qwen3-TTS-12Hz-0.6B-CustomVoice
MOSS-TTS
OmniVoice
F5-TTS
indic-parler-tts
Qwen3-TTS-12Hz-1.7B-VoiceDesign
VoxCPM2
VibeVoice-Realtime-0.5B
Kokoro-82M-v1.0-ONNX
Qwen3-TTS-12Hz-0.6B-Base
s2-pro
csm-1b
VibeVoice-1.5B
MOSS-TTS-v1.5
higgs-tts-2-3b-base
mms-tts-eng
MeloTTS-English
VieNeu-TTS-v3-Turbo
hf-seamless-m4t-medium
neutts-nano
E2-TTS
higgs-tts-3-4b
F5-TTS_RUSSIAN
3b-de-ft-research_release
MOSS-TTS-Nano-100M
speecht5_tts
Kimi-Audio-7B-Instruct
VieNeu-TTS-v2
Voxtral-4B-TTS-2603
svara-tts-v1
MeloTTS-Spanish
MeloTTS-Korean
supertonic-3
higgs-audio-v3-tts-4b-transformers
sarashina2.2-tts