Question 1

What is the input format for the Whisper Tiny API?

Accepted Answer

The API accepts audio either as a file upload or as a base64-encoded waveform (16-bit PCM, mono, 16 kHz). Internally, the model converts the audio to a log-Mel spectrogram before passing it to the encoder.
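
As a rough illustration of the base64 option, the sketch below generates a short test tone as 16-bit mono PCM at 16 kHz and base64-encodes it. It assumes little-endian sample order, which the answer does not specify, and the helper names sine_pcm16 and encode_pcm16 are made up for this example. How the resulting string is placed in a request body is not covered here.

```python
import base64
import math
import struct

SAMPLE_RATE = 16_000  # 16 kHz, as stated in the answer


def sine_pcm16(freq_hz: float, seconds: float, rate: int = SAMPLE_RATE) -> bytes:
    """Return a mono sine tone as signed 16-bit PCM bytes.

    Little-endian byte order is an assumption; the answer does not state it.
    """
    n = int(seconds * rate)
    samples = (
        int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


def encode_pcm16(pcm: bytes) -> str:
    """Base64-encode raw PCM bytes into an ASCII string."""
    return base64.b64encode(pcm).decode("ascii")


pcm = sine_pcm16(440.0, 0.5)   # half a second of a 440 Hz tone
payload = encode_pcm16(pcm)
print(len(pcm), "bytes of PCM ->", len(payload), "base64 characters")
```

Each sample takes two bytes, so half a second at 16 kHz is 8,000 samples and 16,000 bytes before encoding.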

Question 2

How does Whisper Tiny compare in speed to larger Whisper variants?

Accepted Answer

Whisper Tiny is the smallest and fastest Whisper model: it runs roughly 10x faster than large and requires about 1 GB of VRAM.

Question 3

What languages does Whisper Tiny support?

Accepted Answer

It supports 98 languages for speech recognition and can translate from many of those languages into English. Performance varies by language and tends to be weaker for low-resource ones.

Question 4

How can I call the Whisper Tiny model on gigarouter?

Accepted Answer

Use the OpenAI-compatible endpoint with your gigarouter API key: send a POST request with the audio file to /v1/audio/transcriptions (to transcribe) or /v1/audio/translations (to translate into English).
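
A minimal sketch of such a request, using only the Python standard library, might look like the following. The base URL and the model id "whisper-tiny" are placeholders, not values confirmed by gigarouter. The "file" and "model" form fields and the Bearer authorization header follow the OpenAI audio API convention, which an OpenAI-compatible endpoint is assumed to share. The function only builds the request; it does not send it.

```python
import io
import urllib.request
import uuid

# Hypothetical values: substitute the base URL and model id from gigarouter's docs.
BASE_URL = "https://api.gigarouter.example"
MODEL_ID = "whisper-tiny"


def build_audio_request(audio: bytes, filename: str, api_key: str,
                        task: str = "transcriptions") -> urllib.request.Request:
    """Build (but do not send) a multipart POST for /v1/audio/<task>."""
    if task not in ("transcriptions", "translations"):
        raise ValueError("task must be 'transcriptions' or 'translations'")
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Text field: which model should handle the audio.
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(MODEL_ID.encode() + b"\r\n")
    # File field: the raw bytes of the audio file.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio + b"\r\n")
    # Closing boundary ends the multipart body.
    body.write(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/{task}",
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_audio_request(b"<audio bytes>", "speech.wav", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; with task="translations" the same request targets the translation path instead.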

Question 5

Is the model fine-tunable or available for local deployment?

Accepted Answer

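Yes to both. The MIT license allows free use, modification, and distribution, which covers fine-tuning the weights. The model can also be deployed locally using the openai-whisper Python package; a GPU speeds up inference, though the package can run on CPU as well. If you would rather not host it yourself, gigarouter provides a hosted API.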
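
For local deployment, a sketch along these lines may help. It uses the openai-whisper package's load_model and transcribe calls; the helper names model_name and transcribe_locally are made up for this example. The import sits inside the function so the snippet can be read and loaded without the package installed. Running the transcription itself assumes openai-whisper and ffmpeg are available.

```python
def model_name(size: str = "tiny", english_only: bool = False) -> str:
    """Map a size to a checkpoint name such as 'tiny' or 'tiny.en'."""
    sizes_with_en = {"tiny", "base", "small", "medium"}
    multilingual_only = {"large", "large-v2"}
    if size not in sizes_with_en | multilingual_only:
        raise ValueError(f"unknown size: {size!r}")
    if english_only:
        if size in multilingual_only:
            raise ValueError(f"{size} has no English-only checkpoint")
        return f"{size}.en"
    return size


def transcribe_locally(path: str, english_only: bool = False) -> str:
    """Transcribe an audio file with a local Whisper Tiny checkpoint."""
    # Imported lazily: requires `pip install openai-whisper` and ffmpeg on PATH.
    import whisper

    model = whisper.load_model(model_name("tiny", english_only))
    result = model.transcribe(path)
    return result["text"]


print(model_name("tiny"), model_name("tiny", english_only=True))
```

Calling transcribe_locally("speech.wav") downloads the checkpoint on first use and returns the recognized text.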

Task	Automatic Speech Recognition (ASR) & Speech Translation
Architecture	Transformer encoder-decoder (sequence-to-sequence)
Parameters	39 M
License	MIT

Size	Parameters	English-only	Multilingual
tiny	39 M	✓	✓
base	74 M	✓	✓
small	244 M	✓	✓
medium	769 M	✓	✓
large	1550 M	✗	✓
large-v2	1550 M	✗	✓
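
To put the parameter counts in perspective, the following back-of-the-envelope sketch estimates the size of the weights alone. It assumes 4 bytes per parameter for 32-bit floats and 2 bytes for 16-bit floats, and counts 1 MB as one million bytes. Actual memory use at inference time is higher, since activations and runtime overhead are not included.

```python
# Parameter counts in millions, taken from the table above.
PARAMS_MILLIONS = {"tiny": 39, "base": 74, "small": 244, "medium": 769, "large": 1550}


def weight_megabytes(params_millions: int, bytes_per_param: int = 4) -> int:
    """Estimate weight storage in MB (1 MB = 10**6 bytes)."""
    # millions of parameters * bytes per parameter = millions of bytes = MB
    return params_millions * bytes_per_param


for size, millions in PARAMS_MILLIONS.items():
    fp32 = weight_megabytes(millions, 4)
    fp16 = weight_megabytes(millions, 2)
    print(f"{size:>6}: ~{fp32} MB in fp32, ~{fp16} MB in fp16")
```

Under these assumptions the tiny weights come to roughly 156 MB in fp32, compared with about 6,200 MB for large.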
