By Denys Medvediev

Whisper for Mac

"Whisper for Mac" means one of two things. Either it is the open-source OpenAI Whisper model running on a Mac through Python and the command line, or it is a Mac app that uses Whisper under the hood. Most people want the second one. They just don't know it yet.

Last updated: June 2026

A MacBook and microphone on a desk, evoking Whisper voice dictation on Mac

Whisper for Mac is two different things wearing one name. The model is open source and free, but the official way to run it needs Python and the command line, and it transcribes files rather than your live speech. If you want to press a hotkey and have your words land in any Mac app, you want a dictation app such as Whisper by Remskill, whose entire local pipeline is free for any signed-in user.

Whisper is a model, not a Mac app

Let me clear up the naming, because the search results blur it together.

Whisper is an open-source speech-to-text model from OpenAI, released under the MIT License. The model is free. The code is free. You can download the weights and run them on your own machine, no account required. That part genuinely is "Whisper for Mac" in the literal sense.

The catch is how you run it. The official OpenAI Whisper is a Python and command-line tool. You install it with pip, you also need the ffmpeg command-line tool, and then you point it at an audio file. It transcribes recordings: audio.mp3, audio.wav, that kind of thing. It does not type your live speech into Mail or Slack. It turns a file you already have into text.
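
For a sense of what that looks like in practice, here is a minimal sketch using the openai-whisper Python package (installed with pip install -U openai-whisper, with ffmpeg already on your PATH). The helper names ffmpeg_available and transcribe_file are mine, not part of the package, and the "turbo" default is only an assumption for illustration.

```python
import shutil


def ffmpeg_available() -> bool:
    """Whisper relies on the ffmpeg binary to decode audio, so check PATH first."""
    return shutil.which("ffmpeg") is not None


def transcribe_file(path: str, model_name: str = "turbo") -> str:
    """Transcribe one existing audio file and return the text.

    Hypothetical helper: it only wraps the package's load_model() and
    transcribe() calls. It works on files, not on live microphone input.
    """
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg not found; install it first (e.g. brew install ffmpeg)")

    import whisper  # imported lazily so this file loads without the package

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(path)         # dict with "text", "segments", "language"
    return result["text"].strip()
```

Calling transcribe_file("audio.mp3") returns the transcript as a string. Note what is missing: there is no hotkey, no microphone capture, and no pasting into another app.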

There are six model sizes (tiny, base, small, medium, large, and turbo) that trade speed against accuracy, and four of them (tiny, base, small, and medium) also come in English-only variants. Whisper is multilingual and can translate speech into English with a single flag. Good model. The boring truth is that the model was never the hard part. Wiring it into the way you actually work on a Mac is.
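
To make the naming concrete, here is a small sketch that turns a size and a couple of options into arguments for the official whisper command. The function names model_name and cli_args are hypothetical. The --model and --task translate flags are the real ones, and the turbo restriction reflects a note in the Whisper README that turbo is not trained for translation.

```python
# The six sizes named in the text; only the first four have ".en" variants.
SIZES = ("tiny", "base", "small", "medium", "large", "turbo")
ENGLISH_ONLY_SIZES = ("tiny", "base", "small", "medium")


def model_name(size: str, english_only: bool = False) -> str:
    """Return the model identifier, e.g. "small" or "small.en"."""
    if size not in SIZES:
        raise ValueError(f"unknown size: {size}")
    if english_only:
        if size not in ENGLISH_ONLY_SIZES:
            raise ValueError(f"{size} has no English-only variant")
        return f"{size}.en"
    return size


def cli_args(audio_path: str, size: str = "medium",
             english_only: bool = False, translate: bool = False) -> list:
    """Build the argument list for the `whisper` command (illustrative helper)."""
    if translate and english_only:
        raise ValueError("English-only models cannot translate into English")
    if translate and size == "turbo":
        # Per the Whisper README, turbo returns the original language here.
        raise ValueError("turbo is not trained for translation; pick another size")
    args = ["whisper", audio_path, "--model", model_name(size, english_only)]
    if translate:
        args += ["--task", "translate"]  # the "single flag" for translation
    return args
```

For example, cli_args("talk.mp3", translate=True) builds the equivalent of running whisper talk.mp3 --model medium --task translate in a terminal.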

The Mac tools that wrap Whisper, and what each one is for

Most people searching "Whisper for Mac" don't want to touch pip. They want an app. There are several good ones, and they are not interchangeable. They split into two camps.

Camp one: transcribe files

whisper.cpp is a plain C/C++ port of Whisper, MIT licensed, and it is a first-class citizen on Apple Silicon, optimized with ARM NEON, the Accelerate framework, Metal, and Core ML. It can run CPU-only, you build it from source, and you drive it from the command line. If you are comfortable in a terminal and want raw, fast, local file transcription, it is excellent.

MacWhisper gives you a graphical version of that idea. It transcribes audio and video files on-device using OpenAI Whisper and NVIDIA Parakeet, with no data leaving your machine, and it also has a system-wide dictation feature. If your job is turning recordings into transcripts, that camp is the right one.
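
If you go the whisper.cpp route, the file workflow is roughly two commands: convert the recording to 16 kHz mono WAV with ffmpeg, then hand it to the whisper-cli binary you built. The sketch below only assembles those commands. The paths are assumptions based on a default build and the base.en model (older checkouts named the binary main), and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def to_wav_command(src: str, dst: str) -> list:
    """ffmpeg arguments that produce 16 kHz, mono, 16-bit PCM WAV."""
    return ["ffmpeg", "-i", src, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_cpp_command(wav: str,
                        model: str = "models/ggml-base.en.bin",
                        binary: str = "./build/bin/whisper-cli") -> list:
    """whisper.cpp arguments: -m model file, -f input audio, -otxt text output."""
    return [binary, "-m", model, "-f", wav, "-otxt"]


def plan(src: str) -> list:
    """Return the two commands, in order, for one recording."""
    p = Path(src)
    wav = str(p.with_name(p.stem + "-16k.wav"))
    return [to_wav_command(src, wav), whisper_cpp_command(wav)]


def run(src: str) -> None:
    """Execute the plan; raises CalledProcessError if either step fails."""
    for command in plan(src):
        subprocess.run(command, check=True)
```

Call run("interview.mp3") from the whisper.cpp checkout directory so the relative model and binary paths resolve. Keeping plan separate from run lets you print the commands and inspect them before anything executes.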

Camp two: type your live speech

This is dictation. You press a hotkey, you talk, and the text appears at your cursor in whatever app is focused. VoiceInk lives here. It's open source under GPL-3.0, it runs local models on the Apple Neural Engine including Parakeet v3, and it pastes at the cursor with a push-to-talk shortcut. It requires Apple Silicon and macOS 14.4 or later. superwhisper is here too, with live dictation plus file transcription, local or cloud, on Mac, Windows, and iOS.

Whisper by Remskill, the app I build, is in camp two. Dictation-first. Worth knowing which camp you're in before you download anything.

What Whisper by Remskill does on a Mac

I'll describe the thing I built, then you can judge it against the rest.

The live Whisper by Remskill app: sidebar, transcription panel, and AI instruction cards.

It's a dictation app. You press a hotkey, you speak, and the text lands at your cursor in any app: Mail, Notes, Slack, your code editor, the box where you're typing this year's school permission slip. The default hotkey on a Mac is Command and Option held together, and it's fully remappable. All the transcription happens on your Mac. No file to upload, no recording to manage.
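
If it helps to see the shape of push-to-talk dictation, here is a toy sketch of the loop: hold the key to record, release to transcribe and paste. This is not the app's code (the app is written in Rust). Every name here is hypothetical, and the transcribe and paste functions are stand-ins you would replace with a real engine and a real paste-at-cursor call.

```python
from enum import Enum, auto
from typing import Callable, List


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class PushToTalk:
    """Hold the hotkey to record; release to transcribe and paste."""

    def __init__(self, transcribe: Callable[[bytes], str],
                 paste: Callable[[str], None]) -> None:
        self._transcribe = transcribe
        self._paste = paste
        self._chunks: List[bytes] = []
        self.state = State.IDLE

    def hotkey_down(self) -> None:
        if self.state is State.IDLE:
            self._chunks = []          # start a fresh recording
            self.state = State.RECORDING

    def audio_chunk(self, chunk: bytes) -> None:
        # Audio that arrives while the key is up is dropped.
        if self.state is State.RECORDING:
            self._chunks.append(chunk)

    def hotkey_up(self) -> None:
        if self.state is not State.RECORDING:
            return
        self.state = State.IDLE
        audio = b"".join(self._chunks)
        if audio:
            text = self._transcribe(audio)
            if text:
                self._paste(text)      # lands wherever the cursor is


# Demo with stand-ins: a fake engine and a list in place of the cursor.
typed: List[str] = []
ptt = PushToTalk(transcribe=lambda audio: f"{len(audio)} bytes heard",
                 paste=typed.append)
ptt.audio_chunk(b"xx")                 # key not held, so this is ignored
ptt.hotkey_down()
ptt.audio_chunk(b"abc")
ptt.audio_chunk(b"de")
ptt.hotkey_up()
```

After the demo runs, typed holds one entry, "5 bytes heard". The audio sent before the key went down never reaches the engine.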

You also don't run Python. There's no pip, no ffmpeg, no terminal, no GPU. The whole thing is pure Rust. The Whisper and Parakeet engines run through a library called transcribe-rs, with no Python sidecar bundled in. Local transcription runs on your CPU, no dedicated GPU required, and the app is about 25 MB on disk.

For the model itself, you pick. Local Whisper gives you 8 models, 99 languages, translate-to-English, custom vocabulary, beam-size control, and hotword biasing. Slower, but the most control. Parakeet is the NVIDIA TDT engine, about 600 MB, and it runs 5 to 10 times faster than Whisper on a CPU, covering English plus 24 other European languages, with no translate-to-English. Cloud mode is the third path: you bring your own OpenAI key, and we take no cut. I deliberately don't pick one for you. We lay out the differences in Whisper vs Parakeet if you want the long version.

Local accuracy typically lands between 95% and 99%. The entire local pipeline is free for any signed-in user: Whisper, Parakeet, offline AI cleanup through Ollama, transcription history, presets, hotwords, hardware acceleration, model downloads, and the custom hotkey. No payment method at signup. You can use it on up to 3 devices. The paid tier, Whisper Pro, only adds the cloud surface: OpenAI cloud transcription, cloud AI cleanup, and web search. Pricing lives on the pricing page. I'm not quoting numbers here, because pricing pages move and you should read it straight from the source.

One honest constraint: our Mac build is Apple Silicon only, M1 through M4. If you're on an Intel Mac, this app is not for you, and I'll tell you what is in a minute.

Here is what your first dictation actually looks like. Press the hotkey, a small recording indicator appears, you talk, you release, and the cleaned-up text drops into wherever your cursor was sitting. The overlay below is the real thing the app shows, not a mockup.

The shipped post-dictation "complete" overlay: the real app UI the moment a fully local dictation finishes.

Setup is short. Download the app, sign in, and let it pull down one model: Parakeet, at around 600 MB, or a Whisper model if you want more languages or translation. Pick your hotkey or keep Command and Option. Then open Mail, hold the hotkey, and say a sentence. That's the whole onboarding. My younger daughter did it without asking me a single follow-up question, which is the only usability test I fully trust. If you want the longer, screenshot-by-screenshot walkthrough with all three model paths, I wrote a dedicated guide: voice to text on Mac.

Why I keep it local on a Mac

Here's my one strong opinion for this article: cloud-only dictation is a privacy disaster.

Your manager's salary spreadsheet, the email to your kid's school, the legal brief you're drafting on the train. None of that should pass through a vendor's servers because you wanted to type with your voice. Your Mac already has a microphone and a CPU. For one paragraph of dictation, it does not need a server in the loop. With the local engines, the audio never leaves your machine. That's the default I'd reach for, and it's free.

Cloud mode exists for when you actually want the latest OpenAI models or web answers, on your own key. It's the escape hatch, not the front door.

When MacWhisper, VoiceInk, or the CLI is the better pick

I'd be a bad guide if I pretended one app wins every case. It doesn't. Here is where I'd send you elsewhere.

You mostly transcribe recordings

If your day is feeding podcast episodes, interview recordings, or meeting captures into a transcript, you want a file-transcription tool, not a dictation app. MacWhisper is built for exactly that: drag a file in, get text out, on-device. Use it. We don't do file upload. We type your live speech.

You want raw, scriptable, free, and you live in the terminal

Then whisper.cpp is the answer. It's MIT licensed, Apple Silicon optimized, able to run CPU-only, and you can pipe it into anything. If you're the kind of person who enjoys building from source, you'll be happier there than in any GUI.

You want fully open-source dictation and you're on Apple Silicon

VoiceInk is GPL-3.0, you can read or audit every line, and it pastes at the cursor like we do. It's a solid free option. We're a managed app, with accounts, history, cloud BYOK, and Windows support, and we're not open source. So if open source is a hard requirement, that's your call, and VoiceInk is a good one.

You're on an Intel Mac

Our app won't run. The open-source whisper.cpp can build and run on Intel, and Apple's own built-in Dictation is free for short notes. Either one beats waiting for an Apple Silicon machine you haven't bought yet.

If you only remember one thing

The model is free and open. The decision that matters is what you wrap around it: a terminal, a file-transcription GUI, or a hotkey that types your live speech into whatever you're looking at. Match the wrapper to the job, and on a Mac, ignore Python unless you genuinely enjoy it. There are three kinds of people who go looking for Whisper on a Mac: the ones with a folder of recordings, the ones who never want to type again, and the ones who just liked the name. Two of them are in the wrong camp until they read this far.

I dictated most of this article instead of typing it, which felt appropriate. The one paragraph I typed by hand had more typos.

Try it on your Mac

Download Whisper by Remskill, sign in, and dictate your first sentence on your Mac. The local tier is free, and you can decide later whether you ever need the cloud.

Free local transcription forever. No payment method at signup. Apple Silicon only.

Denys Medvediev

I'm the one who reads our support email, and I most likely dictate the replies.