Svara TTS V1
kenpath/svara-tts-v1
published Oct 2025 · updated Oct 2025
Svara TTS V1 is a multilingual text-to-speech model for 19 languages (18 Indic languages plus Indian English), supporting emotion control and speaker identities.
specs
| Task | Text-to-Speech (TTS) |
| Architecture | Orpheus-style discrete audio token approach with vLLM engine and SNAC decoder |
| License | Apache-2.0 |
| Languages | 19 languages (18 Indic + Indian English) |
about this model
Svara-TTS v1 is a multilingual text-to-speech model that generates speech in 19 languages (18 Indic languages plus Indian English) using an Orpheus-style discrete audio token architecture. It supports end-of-utterance emotion/style tags (<happy>, <sad>, <anger>, <fear>, <clear>) and simple speaker identity conventions (Language + Gender). The model is optimized for low-latency deployment on commodity GPUs and CPUs, and is LoRA-friendly for speaker or domain adaptation.
Key capabilities
- 38 voice profiles: male and female voices for each of the 19 languages.
- Streaming audio generation with low-latency output.
- OpenAI-compatible API: the inference server exposes a drop-in replacement for the /v1/audio/speech endpoint, usable with the OpenAI Python SDK.
- Multiple output formats: MP3, Opus, AAC, WAV, and raw PCM via ffmpeg.
- Zero-shot voice cloning from a short audio reference.
- Long-text chunking: automatic sentence-boundary splitting with crossfade stitching for long inputs.
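The long-text chunking capability listed above can be pictured with a minimal sketch: split the input at sentence boundaries, pack sentences into chunks under a length budget, then join the per-chunk audio with a short linear crossfade. This is not the server's actual implementation. The function names, the `max_chars` budget, the punctuation set (including the Devanagari danda `।`), and the use of plain Python lists for audio samples are all assumptions made for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split after sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?।])\s+", text.strip())
    return [p for p in parts if p]


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def crossfade(a: list[float], b: list[float], overlap: int) -> list[float]:
    """Join two sample lists, linearly blending `overlap` samples."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming chunk
        mixed.append(tail[i] * (1 - w) + b[i] * w)
    return head + mixed + b[overlap:]
```

Blending the overlap instead of concatenating directly avoids an audible click where two separately generated chunks meet.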
Training and data
The model was trained on more than 2,000 hours of open, high-quality speech from SYSPIN, RASA, IndicTTS, and SPICOR, covering approximately 50 speakers (balanced between male and female) across 19 languages. The training data is intended to encourage natural prosody and cross-lingual transfer.
Adoption
As of early 2026, the model has surpassed 500,000 downloads on Hugging Face and ranked #7 globally on the platform in February 2026.
Deployments include applications in agriculture (2,500,000+ farmers served), retail (60% faster checkout), and accessibility (100,000 blind lives impacted) through partner organizations.
best for
- Multilingual voice assistants and IVR systems
- Content localization for education and public services
- Accessibility tools for reading aids
FAQ
It supports 19 languages: Hindi, Bengali, Marathi, Telugu, Kannada, Bhojpuri, Magahi, Chhattisgarhi, Maithili, Assamese, Bodo, Dogri, Gujarati, Malayalam, Punjabi, Tamil, Nepali, Sanskrit, and Indian English.
It offers 38 voice profiles: a male and a female voice for each of the 19 languages.
Emotion control is supported by placing an emotion tag such as <happy>, <sad>, <anger>, or <fear> at the end of the utterance.
Use the gigarouter OpenAI-compatible endpoint with an API key. The endpoint mirrors OpenAI's /v1/audio/speech format.
The model is licensed under Apache-2.0, which allows free use, modification, and distribution with attribution.
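For the OpenAI-compatible endpoint mentioned above, a request might be sketched as follows using only the Python standard library. The JSON fields (`model`, `input`, `voice`, `response_format`) follow OpenAI's /v1/audio/speech request format. Everything else is an assumption for illustration: `BASE_URL` is a placeholder, and the model id `svara-tts-v1` and the voice label `"Hindi (Female)"` are hypothetical values that would need to be checked against the provider's documentation.

```python
import json
import urllib.request

# Placeholder host: replace with the real base URL from the provider.
BASE_URL = "https://api.example.com"


def build_speech_request(
    text: str,
    voice: str,
    api_key: str,
    response_format: str = "mp3",
    model: str = "svara-tts-v1",  # hypothetical model id
) -> urllib.request.Request:
    """Build (but do not send) a POST request in the /v1/audio/speech format."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)


if __name__ == "__main__":
    req = build_speech_request("नमस्ते <happy>", "Hindi (Female)", api_key="YOUR_API_KEY")
    print(req.full_url)  # call synthesize_to_file(req, "out.mp3") against a real endpoint
```

Because the request shape matches OpenAI's, the same call should also be expressible through the OpenAI Python SDK by pointing its base URL at the provider.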
We're benchmarking and onboarding Svara TTS V1 as a hosted, OpenAI-compatible API.