
Faster Whisper Base

Systran/faster-whisper-base

published Nov 2023 · updated Nov 2023

Faster Whisper Base is an ASR model that transcribes speech to text using OpenAI's Whisper base architecture optimized with CTranslate2 for faster inference.

status: coming soon
API providers: 0
downloads / mo: 1.4M
license: mit

specs

Task: Automatic Speech Recognition (ASR)
Architecture: Whisper Base (Transformer)
License: Apache 2.0
Quantization: FP16 (weights saved in float16)
Supported Languages: 99 languages (same as original Whisper base)

about this model

Systran/faster-whisper-base is an automatic speech recognition (ASR) model that converts spoken language into text. It is the openai/whisper-base model optimized for efficient inference using the CTranslate2 runtime, which powers the faster-whisper library. The model weights are converted to the CTranslate2 format with FP16 quantization, and the runtime applies performance optimizations such as layer fusion, padding removal, and batch reordering to accelerate inference and reduce memory usage on both CPU and GPU.
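
For readers who want to try the converted checkpoint locally, a minimal sketch using the faster-whisper library might look like the following. It assumes the `faster-whisper` package is installed and that the size name "base" resolves to this converted checkpoint. The `pick_compute_type` helper and its device-to-type mapping are illustrative choices, not something the model card prescribes.

```python
def pick_compute_type(device: str) -> str:
    """Pick a CTranslate2 compute type for a device.

    Hypothetical heuristic: float16 on GPU, int8 on CPU.
    """
    return "float16" if device == "cuda" else "int8"


def transcribe(audio_path: str, device: str = "cpu") -> str:
    """Transcribe one audio file and return the joined segment text."""
    # Imported lazily so the helper above works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        "base",  # assumed to map to the converted Whisper base checkpoint
        device=device,
        compute_type=pick_compute_type(device),
    )
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding happens as the generator is consumed.
    segments, info = model.transcribe(audio_path, beam_size=5)
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe("audio.wav")` would download the weights on first use, so the first call is slower than later ones.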

Benchmark performance

On the LibriSpeech test sets, the original Whisper base model achieves the following word error rates (WER):

  • test-clean: 5.01%
  • test-other: 12.85%

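On the Common Voice 11.0 Hindi test set, the reported WER is 131% (source: original model card). WER can exceed 100% because inserted words count as errors against the length of the reference transcript. The model is released under the Apache 2.0 license.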
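
To make the metric concrete, here is a small sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the whitespace tokenization are assumptions for illustration; published benchmarks usually apply text normalization first, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Three inserted words against a one-word reference give a WER of 300%.
print(word_error_rate("hello", "hello there my friend"))  # 3.0
```

This is why a score such as 131% is possible: a hypothesis with many spurious words can accumulate more errors than the reference has words.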

Model size with quantization

CTranslate2 supports several reduced-precision compute types that shrink the model's storage footprint while largely preserving accuracy. The table below shows the disk size of the converted base Transformer model for each compute type (source: CTranslate2 quantization docs). When a compute type is not natively supported on the target hardware, CTranslate2 automatically falls back to a compatible alternative.

Compute type     Size
float32          364 MB
int16            187 MB
float16          182 MB
bfloat16         182 MB
int8_float32     100 MB
int8_float16     95 MB
int8_bfloat16    95 MB
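
As a quick worked example of what these sizes imply, the sketch below expresses each compute type as a fraction of the float32 size. The numbers are copied from the table; the `relative_size` helper is a hypothetical name used only for this illustration.

```python
# Disk sizes in MB, copied from the table above.
SIZES_MB = {
    "float32": 364,
    "int16": 187,
    "float16": 182,
    "bfloat16": 182,
    "int8_float32": 100,
    "int8_float16": 95,
    "int8_bfloat16": 95,
}


def relative_size(compute_type: str, baseline: str = "float32") -> float:
    """Return the size of compute_type as a fraction of the baseline size."""
    return SIZES_MB[compute_type] / SIZES_MB[baseline]


for name in SIZES_MB:
    print(f"{name:<14} {relative_size(name):.0%} of float32")
```

By this arithmetic, float16 is exactly half the float32 size and the int8 variants are roughly a quarter.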

Once it is live as a hosted API on gigarouter, this model will be available for direct, OpenAI-compatible integration without requiring local installation or model conversion.

FAQ

How does this model differ from the original OpenAI Whisper base?

This model is converted to CTranslate2 format with FP16 quantization, enabling faster inference and lower memory usage.

What languages does it support?

It supports the same 99 languages as the original Whisper base model, including English, Chinese, Spanish, and more.

How can I use this model via the gigarouter API?

Once the model is live, send audio to the OpenAI-compatible /v1/audio/transcriptions route, authenticating with your API key.
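
A rough sketch of such a request, built with only the Python standard library, is shown below. The base URL is a placeholder, and the `model` value and Bearer-token header are assumptions based on how OpenAI-compatible transcription endpoints usually work. None of these are confirmed for gigarouter, and the model is not yet live. The multipart field names `file` and `model` follow the OpenAI transcription API.

```python
import urllib.request
import uuid

BASE_URL = "https://api.gigarouter.example"  # hypothetical placeholder URL
MODEL_ID = "Systran/faster-whisper-base"     # assumed model identifier


def build_transcription_request(
    audio_bytes: bytes,
    filename: str,
    api_key: str,
    base_url: str = BASE_URL,
    model: str = MODEL_ID,
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=f"{base_url}/v1/audio/transcriptions",
        data=head + audio_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. OpenAI-compatible endpoints typically answer with JSON containing a `text` field, though the exact response shape here is not documented on this page.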

What is the license for this model?

The model is licensed under Apache 2.0.

What is the expected word error rate (WER) on LibriSpeech?

On LibriSpeech test-clean, the original model achieves 5.01% WER; on test-other, 12.85% WER.

not yet live

We're benchmarking and onboarding Faster Whisper Base as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
