Question 1

What is F5-TTS Russian?

Accepted Answer

It is a fine-tuned version of the F5-TTS model, adapted for Russian speech synthesis using over 5000 hours of Russian and English audio data.

Question 2

How do I control accents in generated speech?

Accepted Answer

Place a + directly before the stressed vowel of a word (e.g., молок+о is read as молоко́, with the stress on the final о). For automatic accent placement, you can run the text through the RUAccent library first.
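As a rough illustration of the + markup, here is a minimal Python sketch that checks and previews marked-up text before it is sent for synthesis. The helper names (`plus_to_acute`, `misplaced_plus_positions`) are made up for this example and are not part of F5-TTS or RUAccent; the sketch only assumes that a + must sit immediately before a Russian vowel.

```python
import re
from typing import List

# Russian vowels, lower and upper case.
RU_VOWELS = "аеёиоуыэюяАЕЁИОУЫЭЮЯ"
COMBINING_ACUTE = "\u0301"

def plus_to_acute(text: str) -> str:
    """Preview helper (hypothetical): turn '+vowel' into the vowel plus a combining acute."""
    return re.sub(rf"\+([{RU_VOWELS}])", lambda m: m.group(1) + COMBINING_ACUTE, text)

def misplaced_plus_positions(text: str) -> List[int]:
    """Return the indices of '+' characters that are not directly followed by a Russian vowel."""
    return [m.start() for m in re.finditer(rf"\+(?![{RU_VOWELS}])", text)]

print(plus_to_acute("молок+о"))            # молоко́
print(misplaced_plus_positions("мол+ко"))  # [3]
```

The preview is only for human proofreading; the text passed to the model should keep the original + markup.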

Question 3

What license does this model use?

Accepted Answer

The original F5-TTS is licensed under CC-BY-NC-4.0 (non-commercial). The Russian fine-tune inherits this license; check the model card for any updates.

Question 4

How can I call this model via the gigarouter API?

Accepted Answer

Use the OpenAI-compatible endpoint with your gigarouter API key: send a POST request containing the input text (which may include + accent markup) and any optional parameters.
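A hedged sketch of what such a request might look like in Python follows. Everything specific here is an assumption, not taken from gigarouter documentation: the base URL is a placeholder, the model identifier is hypothetical, and the `/audio/speech` path simply follows the OpenAI speech API convention. The service may also require further fields (for example a voice or reference audio), so check the gigarouter documentation for the real values.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid/v1"  # placeholder, NOT a real gigarouter URL
MODEL_ID = "f5-tts-russian"              # hypothetical model identifier

def build_speech_request(text, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request in the OpenAI speech API style."""
    payload = {"model": model, "input": text}  # accent markup stays inside the text
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",  # assumed path, mirrors OpenAI's convention
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def synthesize(text, api_key, out_path="speech.wav"):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

Calling `synthesize("молок+о", api_key)` would only work once `BASE_URL` and `MODEL_ID` are replaced with the values from your gigarouter account.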

Question 5

What training data was used?

Accepted Answer

The model was trained on a custom 4,000-hour Russian dataset plus Common Voice (RU and EN), the Sova datasets (RuDevices and RuAudiobooks), and part of LibriHeavy, for a total of just over 5,000 hours.

Task	Text-to-Speech (TTS)
Architecture	Diffusion Transformer (DiT) with ConvNeXt V2 text encoder and Flow Matching
License	CC-BY-NC-4.0
Training Data	5000+ hours of Russian and English speech

Source	Hours
Custom Russian dataset	4,000
Common Voice RU	239
Common Voice EN	240
Sova (RuDevices + RuAudiobooks)	400
LibriHeavy (partial)	180
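
As a quick arithmetic check on the "5000+ hours" figure, the hours in the table above can be summed with a few lines of Python. The dictionary simply restates the table; the variable names are illustrative only.

```python
# Hours per source, copied from the training-data table.
HOURS = {
    "Custom Russian dataset": 4000,
    "Common Voice RU": 239,
    "Common Voice EN": 240,
    "Sova (RuDevices + RuAudiobooks)": 400,
    "LibriHeavy (partial)": 180,
}

total_hours = sum(HOURS.values())
custom_share = HOURS["Custom Russian dataset"] / total_hours

print(f"total: {total_hours} h")  # total: 5059 h
print(f"custom dataset share: {custom_share:.1%}")
```

The listed sources add up to 5,059 hours, which is consistent with the "over 5000 hours" stated in the FAQ.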
