WhisperKit
argmaxinc/whisperkit-coreml
published Feb 2024 · updated Apr 2026
WhisperKit is an automatic speech recognition (ASR) model that transcribes speech to text, optimized for on-device inference on Apple Silicon via CoreML.
specs
| Task | Automatic Speech Recognition (ASR) |
| Architecture | OpenAI Whisper (CoreML optimized) |
| License | MIT |
| Platform | Apple Silicon |
about this model
WhisperKit is an automatic speech recognition (ASR) model that provides on-device speech-to-text transcription using OpenAI Whisper, optimized for Apple Silicon via CoreML. It is part of the larger Argmax Open-Source SDK, which also includes SpeakerKit for speaker diarization (based on Pyannote) and TTSKit for text-to-speech (based on Qwen-TTS). The model was presented at ICML 2025.
WhisperKit is designed for efficient, low-latency inference on Apple hardware, requiring macOS 14.0 or later and Xcode 16.0 or later. The SDK is released under the MIT License (Copyright 2024 argmax, inc.). For production use cases requiring real-time transcription with speaker identification and custom vocabulary, the Argmax Pro SDK extends these capabilities with additional models and advanced features.
As a hosted API on gigarouter, WhisperKit offers the same transcription model without the need to manage local infrastructure. It is exposed through OpenAI-compatible endpoints, so existing OpenAI-style clients can be pointed at it with minimal changes.
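To make "OpenAI-compatible" concrete, here is a minimal sketch of how a transcription request could be assembled. It assumes the hosted endpoint mirrors OpenAI's audio transcription route (`POST /audio/transcriptions`, multipart form with `model` and `file` fields, bearer-token auth). The base URL, the model id string, and the helper name `build_transcription_request` are placeholders for illustration, not documented gigarouter values.

```python
import uuid
import urllib.request

# Hypothetical placeholders: the page does not state a base URL or model id.
BASE_URL = "https://api.example.invalid/v1"
MODEL_ID = "argmaxinc/whisperkit-coreml"


def build_transcription_request(audio_bytes, filename, api_key,
                                base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style multipart transcription request."""
    boundary = uuid.uuid4().hex
    # Form fields: the model name, then the audio file itself.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",
        data=head + audio_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Usage (not executed here, since the endpoint above is a placeholder):
#   with open("memo.wav", "rb") as f:
#       req = build_transcription_request(f.read(), "memo.wav", "YOUR_API_KEY")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

The request is built separately from sending so that the URL, headers, and body can be inspected before any network call is made. If the official OpenAI client library is already in use, pointing its base URL at the hosted endpoint would likely be the simpler route.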
best for
- On-device speech-to-text on Apple devices
- Offline transcription of audio files (meetings, podcasts, voice memos)
- Building voice-controlled apps for macOS and iOS
FAQ
It is designed for on-device automatic speech recognition on Apple Silicon, converting speech to text without requiring cloud connectivity.
WhisperKit can be accessed through gigarouter's OpenAI-compatible endpoint with an API key.
The model is released under the MIT License.
It requires macOS 14.0 or later and Xcode 16.0 or later to run on Apple Silicon.
The open-source version provides batch transcription only. Real-time transcription with speaker diarization is available in the Argmax Pro SDK.
We're benchmarking and onboarding WhisperKit as a hosted, OpenAI-compatible API.