Kokoro 82M v1.0

onnx-community/Kokoro-82M-v1.0-ONNX

published Feb 2025 · updated Feb 2025

A popular open text-to-speech model, with 576.6K downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers: 0
downloads / mo: 576.6K
license: apache-2.0

specs

Task: Text-to-Speech (TTS)
Architecture: StyleTTS 2 + ISTFTNet
Parameters: 82 million
License: Apache-2.0
Languages: 8 languages (including American and British English)
Voices: 54 voices

about this model

Kokoro-82M-v1.0-ONNX is a text-to-speech (TTS) model that converts text input into audio output, designed for efficient, high-quality speech synthesis with 82 million parameters. It is based on the StyleTTS 2 architecture (arXiv 2306.07691) combined with ISTFTNet (arXiv 2203.02395), operating as a decoder-only model without diffusion or encoder components. The model supports 8 languages and offers 54 voices, including American and British English male and female variants.

It is resilient to quantization, enabling efficient deployment at reduced sizes while maintaining audio quality; for example, the 8-bit quantized version (model_quantized.onnx) is 92.4 MB, compared to 326 MB for the full fp32 model.

Training cost approximately $1000 for 1000 A100 80GB GPU hours, using a few hundred hours of permissive and non-copyrighted audio data, including public domain, Apache/MIT licensed, and synthetic sources. The model is licensed under Apache-2.0.

For text preprocessing, the companion G2P engine misaki supports English, Japanese, Korean, Chinese, and Vietnamese. Audio samples for each voice and quantization variant are available below.
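
To put the size and cost figures quoted above in proportion, here is a small arithmetic sketch. It uses only the numbers stated in the description; the function and variable names are illustrative and not part of any Kokoro tooling.

```python
# Figures quoted in the model description above.
FP32_SIZE_MB = 326.0         # full fp32 model
INT8_SIZE_MB = 92.4          # model_quantized.onnx (8-bit)
TRAINING_COST_USD = 1000.0   # approximate
TRAINING_GPU_HOURS = 1000.0  # A100 80GB GPU hours


def size_reduction(full_mb: float, quantized_mb: float) -> tuple[float, float]:
    """Return (compression ratio, fraction of the original size saved)."""
    ratio = full_mb / quantized_mb
    saved = 1.0 - quantized_mb / full_mb
    return ratio, saved


ratio, saved = size_reduction(FP32_SIZE_MB, INT8_SIZE_MB)
cost_per_gpu_hour = TRAINING_COST_USD / TRAINING_GPU_HOURS

print(f"8-bit model is {ratio:.2f}x smaller ({saved:.1%} of the size saved)")
print(f"implied training cost: ${cost_per_gpu_hour:.2f} per GPU hour")
```

With the stated figures, the 8-bit file is roughly 3.5 times smaller than the fp32 file, and the training budget works out to about one dollar per GPU hour.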

Voices and Samples

All samples use the sentence: "Life is like a box of chocolates. You never know what you're gonna get."

Voice Name    Nationality    Gender
af_heart      American       Female
af_alloy      American       Female
af_aoede      American       Female
af_bella      American       Female
af_jessica    American       Female
af_kore       American       Female
af_nicole     American       Female
af_nova       American       Female
af_river      American       Female
af_sarah      American       Female
af_sky        American       Female
am_adam       American       Male
am_echo       American       Male
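
The voice IDs in the table follow a visible pattern: a two-letter prefix, an underscore, then a name, where the first letter tracks nationality and the second tracks gender. A minimal sketch of decoding that pattern follows. The `Voice` class and `parse_voice` function are hypothetical helpers, and the `b` = British mapping is an assumption based on the British English voices mentioned in the description, since no such row appears in the table here.

```python
from dataclasses import dataclass

# "a" and "f"/"m" are inferred from the table rows above.
# "b" = British is an assumption; prefixes for other languages are omitted.
NATIONALITY = {"a": "American", "b": "British"}
GENDER = {"f": "Female", "m": "Male"}


@dataclass(frozen=True)
class Voice:
    voice_id: str
    nationality: str
    gender: str
    name: str


def parse_voice(voice_id: str) -> Voice:
    """Split an ID such as 'af_heart' into nationality, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    nat_code, gender_code = prefix
    if nat_code not in NATIONALITY or gender_code not in GENDER:
        raise ValueError(f"unknown prefix in voice ID: {voice_id!r}")
    return Voice(voice_id, NATIONALITY[nat_code], GENDER[gender_code], name)


for vid in ("af_heart", "am_adam"):
    print(parse_voice(vid))
```

Unknown prefixes raise an error rather than guessing, because the table above only documents the American voices.
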
not yet live

We're benchmarking and onboarding Kokoro 82M v1.0 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
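
Since the endpoint is not live yet, the following is only a sketch of what a request might look like, assuming the hosted API mirrors the request shape of OpenAI's speech endpoint (`POST /audio/speech` with `model`, `input`, `voice`, and `response_format` fields). The base URL, model identifier, and API key below are placeholders, not confirmed gigarouter values. The code builds the request without sending it.

```python
import json
import urllib.request

# Placeholders: the real base URL and model ID are not published yet.
BASE_URL = "https://api.example.invalid/v1"
MODEL_ID = "onnx-community/Kokoro-82M-v1.0-ONNX"


def build_speech_request(
    text: str,
    voice: str = "af_heart",
    api_key: str = "YOUR_API_KEY",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {
        "model": MODEL_ID,
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_speech_request("Life is like a box of chocolates.")
print(req.get_method(), req.full_url)
```

Once a real endpoint exists, passing the request to `urllib.request.urlopen` and writing the response bytes to a file would be the natural next step, provided the assumed request shape holds.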
