
WhisperKit

argmaxinc/whisperkit-coreml

published Feb 2024 · updated Apr 2026

WhisperKit is an automatic speech recognition (ASR) model that transcribes speech to text, optimized for on-device inference on Apple Silicon via CoreML.

status: coming soon
API providers: 0
downloads / mo: 8M

specs

Task: Automatic Speech Recognition (ASR)
Architecture: OpenAI Whisper (CoreML optimized)
License: MIT
Platform: Apple Silicon

about this model

WhisperKit is an automatic speech recognition (ASR) model that provides on-device speech-to-text transcription using OpenAI Whisper, optimized for Apple Silicon via CoreML. It is part of the larger Argmax Open-Source SDK, which also includes SpeakerKit for speaker diarization (based on Pyannote) and TTSKit for text-to-speech (based on Qwen-TTS). The model was presented at ICML 2025.

WhisperKit is designed for efficient, low-latency inference on Apple hardware, requiring macOS 14.0 or later and Xcode 16.0 or later. The SDK is released under the MIT License (Copyright 2024 argmax, inc.). For production use cases requiring real-time transcription with speaker identification and custom vocabulary, the Argmax Pro SDK extends these capabilities with additional models and advanced features.

Once it is live as a hosted API on gigarouter, WhisperKit is intended to provide the same speech-to-text transcription without the need to manage local infrastructure, through OpenAI-compatible endpoints that fit into existing workflows.
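The hosted endpoint is not live yet, so the following is only a sketch of what a request to an OpenAI-compatible transcription route might look like. The base URL, the `/audio/transcriptions` path, and the use of the repository ID as the model name are assumptions modeled on the OpenAI audio API, not confirmed gigarouter values. The code builds the request without sending it.

```python
import io
import uuid
import wave
import urllib.request

# Hypothetical values: the hosted API is not live, so both are placeholders.
BASE_URL = "https://api.gigarouter.example/v1"
MODEL_ID = "argmaxinc/whisperkit-coreml"  # repo ID from this page; the hosted name may differ


def make_silent_wav(seconds: float = 0.1, rate: int = 16000) -> bytes:
    """Return a short mono 16-bit PCM WAV of silence to stand in for real audio."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def build_transcription_request(audio: bytes, filename: str, api_key: str,
                                model: str = MODEL_ID,
                                base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST in the OpenAI audio API shape."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field: model name
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="model"\r\n\r\n',
        model.encode(), b"\r\n",
        # Form field: the audio file itself
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: audio/wav\r\n\r\n",
        audio, b"\r\n",
        # Closing boundary
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{base_url}/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    req = build_transcription_request(make_silent_wav(), "clip.wav", "YOUR_API_KEY")
    print(req.get_method(), req.full_url, len(req.data), "bytes")
```

Once the endpoint exists, passing the request to `urllib.request.urlopen` would send it. If the response follows the OpenAI format it would be JSON with a `text` field, but that is also an assumption until the API is published.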

FAQ

What is the primary use case for WhisperKit?

It is designed for on-device automatic speech recognition on Apple Silicon, converting speech to text without requiring cloud connectivity.

Is WhisperKit available as a hosted API?

Not yet. The hosted API is marked as coming soon. Once it is live, it will be accessible via gigarouter's OpenAI-compatible endpoint with an API key.

What is the license for WhisperKit?

The model is released under the MIT License.

What are the system requirements for running WhisperKit?

It requires macOS 14.0 or later and Xcode 16.0 or later to run on Apple Silicon.

Does the open-source version of WhisperKit support real-time transcription?

No, the open-source version provides batch transcription. Real-time transcription with speaker diarization is available in the Argmax Pro SDK.

not yet live

We're benchmarking and onboarding WhisperKit as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
