Wav2Vec2 Large XLSR-53 Persian
jonatasgrosman/wav2vec2-large-xlsr-53-persian
published Mar 2022 · updated Dec 2022
Wav2Vec2 Large XLSR-53 Persian is an automatic speech recognition model that transcribes Persian speech to text.
specs
| Spec | Value |
|---|---|
| Task | Automatic Speech Recognition (ASR) |
| Architecture | Wav2Vec2 Large XLSR-53 |
| License | Apache 2.0 |
| Dataset | Common Voice 6.1 Persian (train and validation splits) |
about this model
jonatasgrosman/wav2vec2-large-xlsr-53-persian is an automatic speech recognition (ASR) model fine-tuned from the facebook/wav2vec2-large-xlsr-53 checkpoint on Persian speech data. It was trained on the train and validation splits of the Common Voice 6.1 dataset. The model accepts speech input sampled at 16 kHz and transcribes it into Persian text.
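The inference flow described above can be sketched as follows. This is a minimal example, not the author's reference script: it assumes the `transformers`, `torch`, and `librosa` packages are installed, uses plain greedy CTC decoding, and the helper name `transcribe` and the file name in the usage comment are hypothetical.

```python
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-persian"
SAMPLE_RATE = 16_000  # the model expects 16 kHz input


def transcribe(paths):
    """Transcribe a list of audio files to Persian text (greedy CTC decoding)."""
    # Heavy dependencies are imported lazily so that importing this module stays cheap.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa resamples to the requested rate while loading, so files recorded
    # at 44.1 kHz or 48 kHz are converted to 16 kHz here.
    waveforms = [librosa.load(path, sr=SAMPLE_RATE)[0] for path in paths]

    # Pad the batch to a common length and convert it to PyTorch tensors.
    inputs = processor(
        waveforms,
        sampling_rate=SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # processor collapse repeats and strip CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)


# Hypothetical usage (the file name is an assumption for illustration):
# print(transcribe(["sample.wav"]))
```

Loading the processor and model inside the function keeps the sketch short; in a long-running service they would more likely be loaded once and reused across calls.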
Key strengths
On the Common Voice Persian test set, the model achieves a Word Error Rate (WER) of 30.12% and a Character Error Rate (CER) of 7.37%. Both figures are lower than those of the two other Persian wav2vec2 models listed in the benchmark below:
| Model | WER | CER |
|---|---|---|
| jonatasgrosman/wav2vec2-large-xlsr-53-persian | 30.12% | 7.37% |
| m3hrdadfi/wav2vec2-large-xlsr-persian-v2 | 33.85% | 8.79% |
| m3hrdadfi/wav2vec2-large-xlsr-persian | 34.37% | 8.98% |
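For readers unfamiliar with the two metrics, the sketch below shows how WER and CER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. It is an illustration of the definitions only, not the evaluation script behind the table; the function names are hypothetical, spaces are counted as characters in `cer` (conventions vary), and no text normalization is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate for one utterance: word-level edits / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate for one utterance: char-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Corpus-level scores such as those in the table are usually computed by summing edits and reference lengths over all test utterances before dividing, rather than by averaging per-utterance rates.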
This model is part of a 17-language multilingual fine-tuning suite by the same author, whose languages include Arabic, Chinese, Dutch, Finnish, French, German, Greek, Hungarian, Italian, Japanese, Persian, Polish, Portuguese, Russian, and Spanish. It is released under the Apache 2.0 license and can be cited by its digital object identifier (DOI): 10.57967/hf/3576.
best for
- Transcribing Persian audio recordings such as interviews and lectures
- Voice-to-text for Persian-language applications
- Enabling search in Persian audio archives
FAQ
The model expects audio sampled at 16 kHz, provided as a waveform array or audio file.
On the Common Voice Persian test set, the model achieves a Word Error Rate (WER) of 30.12% and a Character Error Rate (CER) of 7.37%.
It is released under the Apache 2.0 license.