Svara TTS V1

kenpath/svara-tts-v1

published Oct 2025 · updated Oct 2025

Svara TTS V1 is a multilingual text-to-speech model for 19 languages (18 Indic languages plus Indian English), with emotion control and per-language speaker identities.

status

coming soon

API providers

downloads / mo

73.1K

license

apache-2.0

specs

Task	Text-to-Speech (TTS)
Architecture	Orpheus-style discrete audio token approach with vLLM engine and SNAC decoder
License	Apache-2.0
Languages	19 languages (18 Indic + Indian English)

about this model

Svara-TTS v1 is a multilingual text-to-speech model that generates speech in 19 languages (18 Indic languages plus Indian English) using an Orpheus-style discrete audio token architecture. It supports end-of-utterance emotion/style tags (<happy>, <sad>, <anger>, <fear>, <clear>) and simple speaker identity conventions (Language + Gender). The model is optimized for low-latency deployment on commodity GPUs and CPUs, and is LoRA-friendly for speaker or domain adaptation.
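As a minimal sketch, the end-of-utterance tag and the Language + Gender speaker convention could be combined into a prompt string as follows. The "Language (Gender): text <tag>" layout and the helper name build_prompt are assumptions made for illustration, not a confirmed Svara TTS template; check the model card for the exact prompt format.

```python
# Emotion/style tags listed in the model description.
EMOTION_TAGS = ("happy", "sad", "anger", "fear", "clear")


def build_prompt(text, language, gender, emotion=None):
    """Return a prompt with a speaker label and an optional trailing emotion tag.

    The "Language (Gender): text <tag>" layout is an assumed convention,
    not a confirmed Svara TTS template.
    """
    if gender not in ("Male", "Female"):
        raise ValueError(f"unknown gender: {gender!r}")
    prompt = f"{language} ({gender}): {text.strip()}"
    if emotion is not None:
        if emotion not in EMOTION_TAGS:
            raise ValueError(f"unsupported emotion tag: {emotion!r}")
        # The tag goes at the end of the utterance, as the description states.
        prompt = f"{prompt} <{emotion}>"
    return prompt


print(build_prompt("Namaste, aap kaise hain?", "Hindi", "Female", "happy"))
```

Rejecting unknown tags up front is a design choice of this sketch; how the model itself handles an unrecognized tag is not stated on this page.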

Key capabilities

38 voice profiles — male and female voices for each of the 19 languages.
Streaming audio generation with low-latency output.
OpenAI-compatible API — the inference server exposes a drop-in replacement for the /v1/audio/speech endpoint, usable with the OpenAI Python SDK.
Multiple output formats — MP3, Opus, AAC, WAV, and raw PCM via ffmpeg.
Zero-shot voice cloning from a short audio reference.
Long-text chunking — automatic sentence-boundary splitting with crossfade stitching for long inputs.
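The long-text chunking idea above can be sketched in plain Python. This is an illustration of sentence-boundary splitting and a linear crossfade in general, not the server's actual implementation; the function names, the 200-character limit, and the linear fade curve are all assumptions.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars where possible."""
    # Split after ., !, ?, or the Devanagari danda, when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?।])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def crossfade(a, b, overlap):
    """Join two sample lists, blending the tail of a into the head of b."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return list(a) + list(b)
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of b, strictly between 0 and 1
        blended.append(a[len(a) - overlap + i] * (1 - w) + b[i] * w)
    return list(a[:len(a) - overlap]) + blended + list(b[overlap:])


print(split_into_chunks("One. Two! Three?", max_chars=9))
```

A single sentence longer than max_chars is kept whole in this sketch; a production splitter would likely need a fallback such as splitting on commas or whitespace.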

Training and data

Trained on more than 2,000 hours of open, high-quality speech from SYSPIN, RASA, IndicTTS, and SPICOR, covering approximately 50 speakers (balanced male/female) across 19 languages. The multilingual training mix is intended to encourage natural prosody and cross-lingual transfer.

Adoption

As of early 2026, the model has surpassed 500,000 downloads on Hugging Face and ranked #7 globally on the platform in February 2026.

Through partner organizations, reported deployments include applications in agriculture (2,500,000+ farmers served), retail (60% faster checkout), and accessibility (100,000 blind people reached).

best for

·Multilingual voice assistants and IVR systems
·Content localization for education and public services
·Accessibility tools such as reading aids

FAQ

What languages does Svara TTS V1 support?

It supports 19 languages: Hindi, Bengali, Marathi, Telugu, Kannada, Bhojpuri, Magahi, Chhattisgarhi, Maithili, Assamese, Bodo, Dogri, Gujarati, Malayalam, Punjabi, Tamil, Nepali, Sanskrit, and Indian English.

How many voice profiles are available?

38 voice profiles: 19 languages each with male and female voices.
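The count can be checked by enumerating the language list from the FAQ against the two genders. The "Language (Gender)" label format used here is an assumption for illustration; the exact voice identifiers may differ.

```python
# The 19 supported languages, as listed in the FAQ.
LANGUAGES = [
    "Hindi", "Bengali", "Marathi", "Telugu", "Kannada", "Bhojpuri", "Magahi",
    "Chhattisgarhi", "Maithili", "Assamese", "Bodo", "Dogri", "Gujarati",
    "Malayalam", "Punjabi", "Tamil", "Nepali", "Sanskrit", "Indian English",
]
GENDERS = ["Male", "Female"]

# One profile per (language, gender) pair; the label format is hypothetical.
VOICE_PROFILES = [f"{language} ({gender})" for language in LANGUAGES for gender in GENDERS]

print(len(VOICE_PROFILES))  # 19 languages x 2 genders = 38
```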

Can I control emotion in the generated speech?

Yes, by placing emotion tags like <happy>, <sad>, <anger>, or <fear> at the end of the utterance.

How do I call Svara TTS V1 via the API?

Use the gigarouter OpenAI-compatible endpoint with an API key. The endpoint mirrors OpenAI's /v1/audio/speech format.
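A request in the OpenAI /v1/audio/speech shape might be built like this, using only the Python standard library. The base URL, API key, and voice name below are placeholders, and using kenpath/svara-tts-v1 as the hosted model id is an assumption; since the page lists the hosted API as coming soon, the sketch builds the request without sending it.

```python
import json
import urllib.request


def build_speech_request(base_url, api_key, text, voice, response_format="mp3"):
    """Build a POST request in the OpenAI /v1/audio/speech format."""
    payload = {
        "model": "kenpath/svara-tts-v1",  # assumed model id for the hosted API
        "input": text,
        "voice": voice,  # hypothetical voice label
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speech_request(
    "https://api.example.com/v1",  # placeholder base URL
    "YOUR_API_KEY",
    "Hello from Svara.",
    "Indian English (Female)",
)
print(request.full_url)

# To send it once an endpoint is available:
# with urllib.request.urlopen(request) as response:
#     audio_bytes = response.read()
```

Because the endpoint mirrors OpenAI's format, the same fields (model, input, voice, response_format) should also map onto the OpenAI Python SDK's speech call when the client is pointed at a compatible base URL.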

What is the license of this model?

Apache-2.0, which permits free use, modification, and distribution, provided the license text and attribution notices are retained.

not yet live

Svara TTS V1 is being benchmarked and onboarded as a hosted, OpenAI-compatible API.
