Svara TTS V1

kenpath/svara-tts-v1

published Oct 2025 · updated Oct 2025

Svara TTS V1 is a multilingual text-to-speech model for 18 Indic languages and Indian English (19 languages in total), with emotion control and male and female speaker identities for each language.

status: coming soon
API providers: 0
downloads / mo: 73.1K
license: apache-2.0

specs

Task: Text-to-Speech (TTS)
Architecture: Orpheus-style discrete audio token approach with vLLM engine and SNAC decoder
License: Apache-2.0
Languages: 19 languages (18 Indic + Indian English)

about this model

Svara-TTS v1 is a multilingual text-to-speech model that generates speech in 19 languages (18 Indic languages plus Indian English) using an Orpheus-style discrete audio token architecture. It supports end-of-utterance emotion/style tags (<happy>, <sad>, <anger>, <fear>, <clear>) and simple speaker identity conventions (Language + Gender). The model is optimized for low-latency deployment on commodity GPUs and CPUs, and is LoRA-friendly for speaker or domain adaptation.

Key capabilities

  • 38 voice profiles — male and female voices for each of the 19 languages.
  • Streaming audio generation with low-latency output.
  • OpenAI-compatible API — the inference server exposes a drop-in replacement for the /v1/audio/speech endpoint, usable with the OpenAI Python SDK.
  • Multiple output formats — MP3, Opus, AAC, WAV, and raw PCM via ffmpeg.
  • Zero-shot voice cloning from a short audio reference.
  • Long-text chunking — automatic sentence-boundary splitting with crossfade stitching for long inputs.
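The long-text chunking capability can be illustrated with a minimal sketch. It is not the server's actual implementation: the function names, the 200-character limit, the linear fade curve, and the choice of sentence-ending punctuation are all assumptions made for illustration. Audio is modeled as plain lists of float samples.

```python
import re


def split_sentences(text, max_chars=200):
    """Split at sentence boundaries, then pack sentences into chunks.

    Hypothetical helper: the real server's splitting rules are not documented here.
    """
    # Treat '.', '!', '?' and the Devanagari danda as sentence enders (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?।])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def crossfade(a, b, overlap):
    """Join two sample lists, blending the last `overlap` samples of `a`
    with the first `overlap` samples of `b` using a linear fade."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming chunk
        mixed.append(a[len(a) - overlap + i] * (1 - w) + b[i] * w)
    return a[: len(a) - overlap] + mixed + b[overlap:]
```

In this sketch each text chunk would be synthesized separately and the resulting audio segments stitched pairwise with `crossfade`, so the joined output is shorter than the sum of its parts by `overlap` samples per seam.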

Training and data

Trained on more than 2,000 hours of open, high-quality speech from SYSPIN, RASA, IndicTTS, and SPICOR, covering approximately 50 speakers (balanced male/female) across 19 languages. The multi-speaker, multilingual data mix is intended to encourage natural prosody and cross-lingual transfer.

Adoption

As of early 2026, the model has surpassed 500,000 downloads on Hugging Face and ranked #7 globally on the platform in February 2026.

Deployments through partner organizations include applications in agriculture (more than 2,500,000 farmers served), retail (60% faster checkout), and accessibility (100,000 blind people reached).

FAQ

What languages does Svara TTS V1 support?

It supports 19 languages: Hindi, Bengali, Marathi, Telugu, Kannada, Bhojpuri, Magahi, Chhattisgarhi, Maithili, Assamese, Bodo, Dogri, Gujarati, Malayalam, Punjabi, Tamil, Nepali, Sanskrit, and Indian English.

How many voice profiles are available?

38 voice profiles: 19 languages each with male and female voices.
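The count follows from pairing each of the 19 languages with two genders. The sketch below enumerates the pairs; the `"<Language> (<Gender>)"` label format is a hypothetical rendering of the "Language + Gender" convention, not a confirmed voice identifier.

```python
from itertools import product

LANGUAGES = [
    "Hindi", "Bengali", "Marathi", "Telugu", "Kannada", "Bhojpuri", "Magahi",
    "Chhattisgarhi", "Maithili", "Assamese", "Bodo", "Dogri", "Gujarati",
    "Malayalam", "Punjabi", "Tamil", "Nepali", "Sanskrit", "Indian English",
]
GENDERS = ["Male", "Female"]


def voice_profiles():
    """Return one label per (language, gender) pair.

    The label format is an assumption made for illustration only.
    """
    return [f"{lang} ({gender})" for lang, gender in product(LANGUAGES, GENDERS)]


profiles = voice_profiles()
print(len(profiles))  # 19 languages x 2 genders
```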

Can I control emotion in the generated speech?

Yes. Append one of the emotion/style tags (<happy>, <sad>, <anger>, <fear>, or <clear>) to the end of the utterance.
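As a small sketch of the end-of-utterance convention, the helper below appends a tag to the input text. The function name and the single space before the tag are assumptions; the tag set is the one listed on this page.

```python
EMOTION_TAGS = {"happy", "sad", "anger", "fear", "clear"}


def with_emotion(text, emotion):
    """Append an end-of-utterance emotion tag to the text.

    Hypothetical helper; the exact whitespace the model expects is an assumption.
    """
    if emotion not in EMOTION_TAGS:
        raise ValueError(f"unsupported emotion tag: {emotion!r}")
    return f"{text.rstrip()} <{emotion}>"


print(with_emotion("Your order has shipped.", "happy"))
```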

How do I call Svara TTS V1 via the API?

Once the model is live, call the gigarouter OpenAI-compatible endpoint with an API key. The endpoint mirrors the request format of OpenAI's /v1/audio/speech.
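Because the hosted endpoint is not yet available, the following is only a sketch of what a request in the /v1/audio/speech format might look like. The base URL is a placeholder, and using `kenpath/svara-tts-v1` as the model identifier and a "Language (Gender)" string as the voice are assumptions. The `model`, `input`, `voice`, and `response_format` fields come from OpenAI's speech request format. The code builds the request but does not send it.

```python
import json
import urllib.request

# Placeholder base URL: substitute the real endpoint once it is published.
BASE_URL = "https://api.example.com/v1"


def build_speech_request(text, voice, api_key,
                         response_format="mp3", base_url=BASE_URL):
    """Build (but do not send) a POST request in the /v1/audio/speech format."""
    payload = {
        "model": "kenpath/svara-tts-v1",  # assumed model identifier
        "input": text,
        "voice": voice,                   # assumed voice naming
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_speech_request("Namaste! <happy>", "Hindi (Female)", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` and writing the response bytes to a file would then yield the audio, assuming the service returns the encoded audio in the response body as OpenAI's endpoint does.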

What is the license of this model?

Apache-2.0, which permits free use, modification, and distribution, provided the license text and attribution notices are retained.

not yet live

We're benchmarking and onboarding Svara TTS V1 as a hosted, OpenAI-compatible API.
