gigarouter
coming soon

VieNeu TTS V2

pnnbao-ump/VieNeu-TTS-v2

published May 2026 · updated May 2026

VieNeu TTS V2 is a TTS model that synthesizes natural Vietnamese and bilingual English-Vietnamese speech with instant voice cloning from 3-5 seconds of audio.

est. price: ~$0.0075 (estimated, set at launch)
API providers: 0
downloads / mo: 78.5K
license: apache-2.0

specs

Task: Text-to-Speech
Parameters: 0.3 billion
Training Data: 10,000+ hours English-Vietnamese
Features: Instant voice cloning, multi-speaker podcast mode, bilingual code-switching
Available Formats: PyTorch (GGUF Q4 for CPU)

about this model

VieNeu-TTS-v2 is a Vietnamese text-to-speech model that generates natural bilingual speech with instant voice cloning capabilities, supporting multi-speaker conversations and seamless English-Vietnamese code-switching.

Capabilities

  • Trained on 10,000+ hours of bilingual English-Vietnamese data for natural prosody.
  • Zero-shot voice cloning from 3–5 seconds of reference audio.
  • Multi-speaker dialogue mode with automatic character detection and emotional nuance.
  • High-fidelity pronunciation of mixed English-Vietnamese text via the sea-g2p phonemizer.
  • Preset voices across Northern and Southern accents, both male and female.

Reference Voices

Name     Gender   Accent
Bình     Male     North
Tuyên    Male     North
Nguyên   Male     South
Hương    Female   North
Ngọc     Female   North
Đoan     Female   South
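
As a small illustration of how the preset voices listed above might be selected in client code, the sketch below stores the six voices as data and filters them by gender and accent. The `Voice` type and the `find_voices` helper are hypothetical names introduced for this example; they are not part of the model or of any gigarouter SDK.

```python
from typing import List, NamedTuple, Optional


class Voice(NamedTuple):
    name: str
    gender: str
    accent: str


# Preset voices exactly as listed in the reference voice table.
PRESET_VOICES = [
    Voice("Bình", "Male", "North"),
    Voice("Tuyên", "Male", "North"),
    Voice("Nguyên", "Male", "South"),
    Voice("Hương", "Female", "North"),
    Voice("Ngọc", "Female", "North"),
    Voice("Đoan", "Female", "South"),
]


def find_voices(gender: Optional[str] = None,
                accent: Optional[str] = None) -> List[str]:
    """Return names of preset voices matching the filters (case-insensitive).

    A filter left as None matches every voice.
    """
    return [
        v.name
        for v in PRESET_VOICES
        if (gender is None or v.gender.lower() == gender.lower())
        and (accent is None or v.accent.lower() == accent.lower())
    ]


if __name__ == "__main__":
    print(find_voices(gender="Female", accent="South"))
```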

VieNeu-TTS-v2 is developed by Phạm Nguyễn Ngọc Bảo. gigarouter is onboarding the model as a managed API compatible with OpenAI’s format, so that developers can integrate high-quality Vietnamese TTS without local GPU infrastructure.

FAQ

What input does the model accept?

The model accepts text in Vietnamese or English, an optional reference audio file for voice cloning, and an emotion mode (natural or storytelling).
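
One way to picture these three inputs is as a small validated record. The sketch below is only an illustration: the class name `SpeechInput`, its field names, and the choice of `natural` as the default emotion mode are assumptions made for this example, not the model's or the API's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

# The two emotion modes named in the answer above.
EMOTION_MODES = ("natural", "storytelling")


@dataclass(frozen=True)
class SpeechInput:
    """Hypothetical container for the inputs described above."""

    text: str                                   # Vietnamese, English, or mixed
    emotion: str = "natural"                    # default is an assumption
    reference_audio_path: Optional[str] = None  # optional clip for voice cloning

    def __post_init__(self) -> None:
        # Reject empty or whitespace-only text early.
        if not self.text.strip():
            raise ValueError("text must not be empty")
        # Only the two documented emotion modes are allowed.
        if self.emotion not in EMOTION_MODES:
            raise ValueError(
                f"emotion must be one of {EMOTION_MODES}, got {self.emotion!r}"
            )


if __name__ == "__main__":
    request = SpeechInput("Xin chào, welcome to the demo.", emotion="storytelling")
    print(request)
```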

What audio format does it output?

Outputs WAV audio files.
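
Because the output is WAV, a returned file can be inspected with Python's standard `wave` module. The sketch below reads the channel count, sample rate, and duration from WAV bytes. The page does not state the model's sample rate or bit depth, so nothing here assumes them; the `make_silence` helper only generates a synthetic clip for trying the function out.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read basic properties from PCM WAV bytes.

    Note: the stdlib `wave` module handles PCM-encoded WAV only and raises
    wave.Error for other encodings.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def make_silence(seconds: float = 1.0, rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent clip (synthetic test data, not model output)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silence(0.5)))
```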

How can I call this model via API?

Once the model is live, call it through gigarouter's OpenAI-compatible endpoint with an API key.
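
A request to an OpenAI-compatible speech endpoint might look like the sketch below, which follows the shape of OpenAI's `POST /v1/audio/speech` call (`model`, `input`, `voice`, `response_format`). Several details are assumptions, since the page does not give them: the base URL is a placeholder, the model ID is taken from the repository name shown on this page and may differ on the hosted API, and passing a preset voice name in `voice` is a guess. The model is not yet live, so this cannot currently be run against the service.

```python
import json
import urllib.request

# ASSUMPTION: placeholder base URL; the real endpoint is not given on this page.
BASE_URL = "https://api.gigarouter.example/v1"
# ASSUMPTION: repository ID from this page; the hosted model ID may differ.
MODEL_ID = "pnnbao-ump/VieNeu-TTS-v2"


def build_speech_request(text: str, voice: str, api_key: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style speech synthesis request."""
    payload = {
        "model": MODEL_ID,
        "input": text,
        "voice": voice,            # ASSUMPTION: preset voice name is accepted here
        "response_format": "wav",  # the FAQ states that output is WAV
    }
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text: str, voice: str, api_key: str) -> bytes:
    """Send the request and return the raw audio bytes from the response body."""
    request = build_speech_request(text, voice, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()


if __name__ == "__main__":
    # Building the request needs no network access; sending it does.
    req = build_speech_request("Xin chào, hello!", "Hương", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect and test without network access.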

Does it support multilingual speech?

Yes, it supports seamless English-Vietnamese code-switching in a single utterance.

How large is the model?

The PyTorch model is approximately 180 MB with 0.3 billion parameters. A GGUF Q4 quantized version is available for CPU deployment.
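
For a rough sense of how parameter count relates to file size, the sketch below estimates weights-only size at a given precision and, conversely, the bits per parameter implied by a file size. It is back-of-the-envelope arithmetic with 1 MB taken as 10^6 bytes; real checkpoints also carry metadata and may mix precisions, and the page does not state which precision its 180 MB figure refers to.

```python
def weights_size_mb(num_params: float, bits_per_param: float) -> float:
    """Weights-only size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


def implied_bits_per_param(size_mb: float, num_params: float) -> float:
    """Average bits per parameter implied by a file size."""
    return size_mb * 1e6 * 8 / num_params


if __name__ == "__main__":
    params = 300_000_000  # 0.3 billion parameters, as stated above
    for bits in (32, 16, 8, 4):
        print(f"{bits:>2}-bit weights: {weights_size_mb(params, bits):.0f} MB")
    print(f"180 MB implies {implied_bits_per_param(180, params):.1f} bits per parameter")
```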

not yet live

We're benchmarking and onboarding VieNeu TTS V2 as a hosted, OpenAI-compatible API.
