MeloTTS Korean

myshell-ai/MeloTTS-Korean

published Feb 2024 · updated Feb 2024

MeloTTS Korean is a high-quality Korean text-to-speech model that generates natural-sounding speech from text, optimized for real-time CPU inference.

status: coming soon
API providers: 0
downloads / mo: 68.4K
license: MIT

specs

Task: Text-to-Speech (TTS)
Architecture: VITS-based (VITS, VITS2, Bert-VITS2)
Language: Korean
License: MIT

about this model

myshell-ai/MeloTTS-Korean is a text-to-speech model that generates natural Korean speech as part of the MeloTTS multilingual family developed by MyShell.ai in collaboration with MIT and Tsinghua University. The model produces high-quality audio for Korean text and supports CPU real-time inference, making it suitable for low-latency applications without dedicated GPU hardware.

Multilingual family

The Korean model is one of several language-specific variants. The full MeloTTS family covers the following languages with distinct accents where applicable:

  • Korean
  • English (American, British, Indian, Australian, Default)
  • Spanish
  • French
  • Chinese (mixed EN)
  • Japanese

Key capabilities

  • CPU real-time inference: the model synthesizes speech fast enough for real-time use on CPU.
  • Mixed-language support: the Chinese variant can handle mixed Chinese and English text; the Korean model outputs natural Korean.
  • Speed control: the speech rate can be adjusted with a speed parameter at synthesis time.
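
As an illustration of speed control, here is a minimal sketch of local synthesis with the open-source MeloTTS Python package, assuming it is installed. The `TTS` class, the `spk2id` lookup, and `tts_to_file(..., speed=...)` follow the upstream package's usage. The helper names `validate_speed` and `synthesize_korean`, and the 0.5 to 2.0 clamp range, are assumptions made for this example and are not part of MeloTTS.

```python
from pathlib import Path


def validate_speed(speed: float, low: float = 0.5, high: float = 2.0) -> float:
    """Clamp a speed multiplier to an illustrative range.

    The bounds are assumptions for this sketch, not limits defined by MeloTTS.
    """
    return max(low, min(high, float(speed)))


def synthesize_korean(text: str, out_path: str, speed: float = 1.0,
                      device: str = "cpu") -> Path:
    """Write Korean speech for `text` to `out_path` as a WAV file."""
    # Imported lazily so this module still loads where MeloTTS is not installed.
    from melo.api import TTS

    model = TTS(language="KR", device=device)
    speaker_ids = model.hps.data.spk2id  # maps speaker names to ids
    model.tts_to_file(text, speaker_ids["KR"], out_path,
                      speed=validate_speed(speed))
    return Path(out_path)
```

With the package installed, a call such as `synthesize_korean("안녕하세요! 오늘은 날씨가 정말 좋네요.", "kr.wav", speed=1.2)` would write slightly faster speech to `kr.wav`.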

Community adoption

The Korean model received 68,449 downloads in the past month and is used in 19 Hugging Face Spaces. Three quantized versions are available, which reduce model size while maintaining quality.

Authorship and citation

The MeloTTS project is authored by Wenliang Zhao (Tsinghua University), Xumin Yu (Tsinghua University), and Zengyi Qin (MIT and MyShell). The recommended citation is:

@software{zhao2024melo,
  author = {Zhao, Wenliang and Yu, Xumin and Qin, Zengyi},
  title  = {MeloTTS: High-quality Multi-lingual Multi-speaker Text-to-Speech},
  year   = {2024}
}

FAQ

What languages does MeloTTS Korean support?

This specific model supports Korean. Other MeloTTS models cover English, Spanish, French, Chinese, and Japanese.

How fast is inference on CPU?

According to the official documentation, the model runs fast enough for real-time inference on CPU, so no dedicated GPU is required.
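
One common way to check a real-time claim on your own hardware is the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1.0 means speech is generated faster than it plays back. The sketch below uses only the Python standard library. The function names are hypothetical, and no measured figures for this model are implied.

```python
import time
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the playback duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def timed(fn, *args, **kwargs):
    """Call `fn` and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

To use it, wrap your synthesis call in `timed`, read the output length with `wav_duration_seconds`, and pass both numbers to `real_time_factor`.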

What is the license of MeloTTS Korean?

It is licensed under MIT, allowing both commercial and non-commercial use.

How can I call this model via the gigarouter API?

The model is not yet live. Once it is hosted, call it through the OpenAI-compatible endpoint: send a request with your API key and the text input, and the response contains audio in WAV format.
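
Such a request might be built as in the following sketch, which uses only the Python standard library. The endpoint is not live, so everything specific here is an assumption: the base URL is a placeholder, the `/audio/speech` path and the `model`, `input`, `voice`, and `response_format` fields mirror the OpenAI speech API's shape, and the `"KR"` voice name is a guess. The function only constructs the request and does not send it.

```python
import json
import urllib.request

# Placeholder base URL (assumption): the real hosted address is not published here.
BASE_URL = "https://api.gigarouter.example/v1"


def build_speech_request(text: str, api_key: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {
        "model": "myshell-ai/MeloTTS-Korean",
        "input": text,
        "voice": "KR",             # assumed voice name
        "response_format": "wav",  # the FAQ states audio is returned as WAV
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Once the model is hosted, passing the request to `urllib.request.urlopen` and writing the response bytes to a `.wav` file would complete the call, provided the real endpoint matches these assumptions.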

not yet live

We're benchmarking and onboarding MeloTTS Korean as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
