wav2vec2 large xls r 300m Urdu
kingabzpro/wav2vec2-large-xls-r-300m-Urdu
published Mar 2022 · updated Jun 2026
A popular open speech-to-text model, with 2.3M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
specs
| Spec | Value |
|---|---|
| Task | Automatic Speech Recognition (ASR) |
| Architecture | Wav2Vec2 CTC (XLS-R 300M backbone) |
| Parameters | 300M |
| License | Apache-2.0 |
about this model
kingabzpro/wav2vec2-large-xls-r-300m-Urdu is an automatic speech recognition (ASR) model that transcribes Urdu speech from 16 kHz mono audio using a Connectionist Temporal Classification (CTC) decoder, with an optional 5-gram KenLM language model for improved accuracy.
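Because the model expects 16 kHz mono input, it can help to check a recording before sending it for transcription. The following is a minimal sketch using only the Python standard library; `check_wav_format` and `make_silent_wav` are illustrative helper names, not part of the model or of any gigarouter tooling, and the sketch assumes uncompressed PCM WAV files.

```python
import io
import wave


def check_wav_format(wav_bytes: bytes, expected_rate: int = 16_000) -> list[str]:
    """Return a list of problems; an empty list means the audio is mono at expected_rate."""
    problems = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wav.getframerate()} Hz")
    return problems


def make_silent_wav(rate: int, channels: int, seconds: float = 0.1) -> bytes:
    """Build a short silent 16-bit PCM WAV in memory (demo fixture only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))
    return buffer.getvalue()


# A 16 kHz mono clip passes; a 44.1 kHz stereo clip reports two problems.
print(check_wav_format(make_silent_wav(16_000, 1)))
print(check_wav_format(make_silent_wav(44_100, 2)))
```

Audio that fails the check would need to be downmixed and resampled first; that step is outside this sketch.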
It is a fine-tuned version of Facebook’s XLS-R 300M, which was pretrained on 436k hours of unlabeled speech across 128 languages (VoxPopuli, MLS, CommonVoice, BABEL, VoxLingua107). The XLS-R model family reported relative word error rate reductions of 14–34% over prior work on several speech recognition benchmarks and improved CoVoST-2 speech translation by an average of 7.4 BLEU.
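To illustrate what greedy CTC decoding does with the model's per-frame outputs, here is a minimal sketch in plain Python. It is not the model's actual decoding code: the three-symbol vocabulary, the score values, and the choice of index 0 as the blank token are all assumptions made for the example.

```python
def greedy_ctc_decode(frame_scores, vocab, blank_id=0):
    """Pick the best token per frame, merge consecutive repeats, drop blanks."""
    # Index of the highest score in each frame (argmax without numpy).
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded = []
    previous = None
    for token_id in best_ids:
        # Emit a token only when it differs from the previous frame and is not blank.
        if token_id != previous and token_id != blank_id:
            decoded.append(vocab[token_id])
        previous = token_id
    return "".join(decoded)


# Toy vocabulary and scores (hypothetical, for illustration only).
toy_vocab = ["<blank>", "a", "b"]
toy_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.2, 0.7],    # b
    [0.1, 0.1, 0.8],    # b (repeat, merged)
]
print(greedy_ctc_decode(toy_scores, toy_vocab))
```

The blank frame between the two runs of `a` is what lets CTC output a doubled character. A language-model decoder such as the 5-gram KenLM option instead searches over several candidate sequences rather than taking the single best token per frame.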
Key strengths
- Best reported result: 39.89% WER / 16.70% CER on the Urdu Common Voice 8.0 test set when decoded with the included 5-gram KenLM language model. This is a 29% relative WER improvement over greedy CTC (56.07% WER).
- Efficient decoding: Greedy CTC gives the fastest, most lightweight inference; the KenLM decoder improves accuracy at the cost of a small amount of extra decoding work.
- Reproducible evaluation: A Kaggle notebook provides a five-sample smoke test and full evaluation script.
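The 29% relative improvement quoted above follows directly from the two reported WER values. As a quick check, the arithmetic can be written out like this (`relative_reduction` is an illustrative helper name; the greedy CER of 23.70% is taken from the benchmark table on this page):

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which an error rate drops relative to the baseline."""
    return (baseline - improved) / baseline


# Reported error rates in percent: greedy CTC vs. 5-gram KenLM decoding.
wer_gain = relative_reduction(56.07, 39.89)
cer_gain = relative_reduction(23.70, 16.70)

print(f"relative WER reduction: {wer_gain:.1%}")  # 28.9%
print(f"relative CER reduction: {cer_gain:.1%}")  # 29.5%
```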
Benchmark results
| Decoder | Test WER | Test CER |
|---|---|---|
| Greedy CTC | 56.07% | 23.70% |
| 5-gram KenLM | 39.89% | 16.70% |
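For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute WER and CER from an edit distance. It is an illustration, not the evaluation script used for these results: the English sentence pair is made up, the CER here counts spaces as characters, and real evaluations usually pool errors over the whole test set rather than scoring a single utterance.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (words or characters)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current_row.append(min(
                previous_row[j] + 1,         # deletion
                current_row[j - 1] + 1,      # insertion
                previous_row[j - 1] + cost,  # substitution or match
            ))
        previous_row = current_row
    return previous_row[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edits divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one substituted word and one dropped word.
ref = "the cat sat on the mat"
hyp = "the cat sit on mat"
print(f"WER: {word_error_rate(ref, hyp):.2%}")
print(f"CER: {char_error_rate(ref, hyp):.2%}")
```

CER is typically lower than WER, as in the table, because a single wrong character makes the whole word count as an error at the word level.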
The model is released under the Apache-2.0 license. It is hosted as a managed, OpenAI-compatible API by gigarouter.
FAQ
The model expects 16 kHz mono waveform audio.
An optional 5-gram KenLM language model is provided to improve accuracy.
On the Urdu Common Voice 8.0 test set, WER is 39.89% with KenLM decoding and 56.07% with greedy CTC.
The license is Apache-2.0.
To call the model, use the OpenAI-compatible endpoint with your gigarouter API key, sending audio data as described in the documentation.
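As a rough sketch of what an OpenAI-compatible transcription call might look like, the example below builds (but does not send) a multipart request shaped like OpenAI's `POST /v1/audio/transcriptions`. Everything specific to gigarouter is an assumption: the base URL is a placeholder, and whether the service uses this route, this model identifier, or these form fields should be confirmed against its documentation.

```python
import urllib.request
import uuid

# Placeholder only: the real gigarouter base URL is not given on this page.
BASE_URL = "https://api.example.invalid/v1"
# Assumed model identifier, taken from the model name on this page.
MODEL_ID = "kingabzpro/wav2vec2-large-xls-r-300m-Urdu"


def build_transcription_request(
    audio_bytes: bytes, api_key: str, filename: str = "clip.wav"
) -> urllib.request.Request:
    """Build a multipart/form-data POST with `model` and `file` fields."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="model"\r\n\r\n',
        MODEL_ID.encode() + b"\r\n",
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: audio/wav\r\n\r\n",
        audio_bytes + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/transcriptions",  # route assumed from OpenAI's API
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Constructing the request performs no network I/O.
request = build_transcription_request(b"RIFF", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` once the real base URL and key are in place; an OpenAI client library pointed at the same base URL would be an alternative.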
We're benchmarking and onboarding wav2vec2 large xls r 300m Urdu as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.