By Denys Medvediev

Guide

How to add custom words to dictation

Dictation mishears names, jargon, and brands because they aren't in its vocabulary. The fix is to teach it: Windows has a Speech Dictionary you add words to, and a hotword list in Whisper biases a local model toward your terms so they transcribe right.

Last updated: June 2026

To add custom words to dictation, teach the tool your vocabulary. On Windows, the Speech Recognition Speech Dictionary has an "Add a new word" wizard. In Whisper by Remskill, a hotword list biases a local Whisper model toward names, jargon, and brands. macOS Dictation itself has no custom-word setting; that lives in Voice Control.

Every dictation tool I've used gets common English right and then mangles the one word that actually matters. It nails "schedule the review for Thursday" and then turns my colleague Csaba into "chubba," my product into "project alfalfa," and "Kubernetes" into "cooper netties." The words a transcriber struggles with are exactly the ones you can't fix with a louder voice — they're names, jargon, and brands it has never been trained to expect.

So people search for how to add custom words to dictation, expecting a tidy settings page. The honest answer is that it depends on the tool. Windows has a real, editable dictionary you can type words into. macOS keeps that feature somewhere most people never look. And a local Whisper model can be biased toward your terms with a hotword list. I'll walk through all three, set the Whisper one up, and tell you when the built-in is already enough.

Here's the part most pages skip. A transcriber doesn't "spell" a word the way you do — it guesses the most likely words for the sounds it hears. "Csaba" loses to "chubba" because the model has heard "chubba"-shaped sounds a million times and your colleague's name almost never. Adding a custom word doesn't teach the model new letters. It tilts the guess.

That tilt is built differently in each tool. Windows stores it in a Speech Dictionary you edit by hand. Whisper's local models take a list of hotwords and weight toward them during transcription. And — the one caveat that trips people up — that hotword biasing is a local-Whisper feature only. Parakeet doesn't take hotwords, and the cloud path doesn't either. I'll be specific about which is which, because getting it wrong wastes an afternoon.

Why dictation mangles names and jargon

Dictation is a betting machine. It listens to a stretch of sound and picks the words most likely to have made that sound, based on the millions of hours it was trained on. Everyday speech wins those bets easily. The trouble starts with anything rare: a coworker named Csaba, an internal project called Helios, a drug name, a law firm, your own surname if it isn't common in English.

The model has barely heard those, so it reaches for a common word that sounds close. "Helios" becomes "healy us." "Remskill" becomes "rem skill" or "rim skill." You end up correcting the same five words every single time, which is the exact tax that makes people give up on dictation and go back to the keyboard. The fix isn't a better microphone or slower speech. It's telling the tool, ahead of time, that these specific oddball words are on the table.

That's what a custom word does. You're not teaching pronunciation in most tools — you're adding the word to the list of things the transcriber is allowed to expect, so when the sounds are ambiguous, your term wins the bet instead of the common word that's been beating it. The boring truth is that a short list of ten or fifteen terms covers most of the pain for most people. You don't need to feed it a dictionary. You need to feed it the handful of words it keeps fumbling.

The built-in way, on Windows and Mac

Start with what's already on your machine, because for some people that's the whole answer. On Windows there are two separate built-ins, and they handle custom words very differently. Windows Speech Recognition — the older desktop feature — has a genuine, editable Speech Dictionary. You open Speech Recognition, say or click "open Speech Dictionary," choose "Add a new word," and follow the wizard. That word is now something dictation will recognise. The newer Windows 11 Voice Access has its own version: an "Add to Vocabulary" command (and a Help-menu option) that biases recognition toward words you add.

The plain Win+H voice typing bar most people use day to day is the in-between case. It doesn't give you a dictionary to type into directly; it learns from the corrections you make over time and from your typed text, rather than from a list you edit. So if you want a hand-edited custom-word list on Windows today, the Speech Recognition Speech Dictionary or Voice Access vocabulary is where it lives — not the Win+H bar.

The recording overlay: a small capsule that appears while you speak, so you know it's listening.

macOS is where you have to be careful, because the obvious feature doesn't have this. Standard macOS Dictation — the thing you trigger to talk into any text field — has no custom-word or custom-vocabulary setting. None. What does exist is a separate accessibility feature, Voice Control, which has a Vocabulary panel under System Settings, Accessibility, Voice Control, where you can add up to 1000 terms and even record how each is pronounced. It's real and it's good, but it's a different tool from the Dictation most Mac users mean. If a page tells you "just add custom words in macOS Dictation," it's quietly conflating the two.

Set up custom words in Whisper (Windows or Mac)

If you want one consistent way to add custom words that works the same on Windows and Mac, that's where a dedicated tool earns its place. You need a Mac on Apple Silicon or a Windows 10-or-newer PC, a working microphone, and a local Whisper model — hotwords are a local-Whisper feature, so this path needs that model, not Parakeet and not cloud. The whole local pipeline is free for any signed-in account, with no payment method asked for at sign-up. Here's the sequence.

Step 1 — Install Whisper and sign in.

Download from the download page, install, and create a free account. No card. The local transcription pipeline opens right away.

You'll know it worked when the app's tray icon appears and the setup wizard offers to pick a model.

Step 2 — Choose a local Whisper model.

The app presents three paths — Cloud, Local Parakeet, Local Whisper. For custom words, pick Local Whisper, because the hotword list works with Whisper models. Parakeet is faster but takes no hotwords; cloud takes none either.

You'll know it worked when a Whisper model finishes downloading and shows as ready.

Step 3 — Add your terms to the hotword list.

In the Whisper model's settings, add the names, jargon, and brands it keeps missing — one term per entry. Keep the list short and specific: the words it actually fumbles, not your whole glossary.

You'll know it worked when your saved terms appear in the list and stay there between recordings.

Step 4 — Dictate and check the hard words.

Put your cursor in any text field, hold the hotkey, say a sentence that includes one of your terms, and release. The transcript pastes at the cursor with your word spelled the way you saved it.

You'll know it worked when the term that used to come out wrong now comes out right.

The real Whisper desktop app on the settings screen, with the Transcription and AI panels open.

I'd keep the first list deliberately small. Add the five or six words that have annoyed you most this week, dictate for a day, and add more only when something else trips. A hotword list bloated with two hundred terms can start nudging the model toward words you didn't mean. Short and specific beats long and hopeful.
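
If you'd rather pick those first terms from evidence than from memory, a tally of your own corrections does it. Here's a minimal Python sketch, assuming you jot down the right spelling each time you fix a word by hand. The function name, the cap of six, and the two-fix threshold are my own illustrative choices, not something the app does for you.

```python
from collections import Counter


def shortlist(corrections, limit=6, min_count=2):
    """Pick the terms corrected most often, capped at `limit`.

    corrections: the correct spellings you typed by hand, one entry per fix
    limit:       hypothetical cap that keeps the hotword list short
    min_count:   hypothetical threshold that ignores one-off slips
    """
    counts = Counter(term.strip() for term in corrections if term.strip())
    frequent = [term for term, n in counts.most_common() if n >= min_count]
    return frequent[:limit]


# One week of manual fixes, in the order they happened.
week_of_fixes = [
    "Csaba", "Kubernetes", "Csaba", "Helios", "Remskill",
    "Csaba", "Kubernetes", "Helios", "Thursday",
]

print(shortlist(week_of_fixes))  # ['Csaba', 'Kubernetes', 'Helios']
```

Terms that tripped only once stay off the list until they trip again, which keeps the list at the short-and-specific size that works best.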

What a hotword list actually does

A hotword list is a set of terms you hand the model before it transcribes, so it knows to expect them. Under the hood it's the same idea as the Windows Speech Dictionary, just wired differently: instead of an entry in a stored dictionary, the words ride along with each recording as a bias. When the audio is ambiguous between your term and a common look-alike, the bias tips the decision toward your term. "Csaba" stops losing to "chubba" because you've told the model Csaba is a word that belongs here.

Two honest limits are worth stating plainly. First, hotwords nudge, they don't force — a term that sounds nothing like what you said still won't appear, and a very short or very unusual word can still slip. Second, and this is the one people get wrong: hotwords are a local-Whisper feature. Parakeet, the fast local engine, takes no hotword list. The cloud path doesn't expose one either. So if custom words are the reason you're here, the local Whisper model is the path that has them.
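
To make the nudge concrete, here's a toy Python sketch of the idea. It is not how Whisper or any real engine is wired: real recognisers apply the bias inside the decoder, and the scores, the boost value, and the function name below are all invented for illustration.

```python
def pick_transcript(candidates, hotwords, boost=2.0):
    """Return the candidate text with the best score after hotword boosting.

    candidates: dict mapping candidate text -> made-up log-probability score
    hotwords:   terms the user has asked the model to expect
    boost:      hypothetical bonus added once per hotword found in a candidate
    """
    wanted = {word.lower() for word in hotwords}

    def biased_score(item):
        text, score = item
        hits = sum(1 for word in text.lower().split() if word in wanted)
        return score + boost * hits

    return max(candidates.items(), key=biased_score)[0]


# Ambiguous audio: the common word narrowly beats the rare name.
close_call = {"meet chubba at noon": -4.0, "meet csaba at noon": -5.2}
# Audio that clearly says something else: the name is a very poor match.
poor_match = {"meet sarah at noon": -3.0, "meet csaba at noon": -12.0}

print(pick_transcript(close_call, hotwords=[]))         # meet chubba at noon
print(pick_transcript(close_call, hotwords=["Csaba"]))  # meet csaba at noon
print(pick_transcript(poor_match, hotwords=["Csaba"]))  # meet sarah at noon
```

The second call shows the bias tipping a close call. The third shows the first limit: the boost is too small to rescue a term that sounds nothing like what was said.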

The local Whisper models also give you finer control than most built-ins: settings like beam size and custom vocabulary that the average dictation box doesn't expose. You don't need any of that to add a few names. But it's there if you graduate from "fix five words" to "transcribe a medical clinic's terminology all day," which is a real reason some people pick Whisper over the faster Parakeet engine. If you're weighing the local models against each other, the guide on which Whisper model to use walks through the tradeoffs.

Local or cloud, when custom words are the goal

The app makes you pick a path, and for custom words the pick matters more than usual, because only one of the three takes a hotword list. Here's the honest breakdown, so you choose with your eyes open rather than discovering the limit after you've installed the wrong engine.

The three paths, and what each does about your vocabulary:

  • Local Parakeet: NVIDIA's TDT engine, around 600 MB, the fastest local option, 5 to 10 times faster than Whisper on CPU. English plus 24 other European languages, 25 in total. No translate-to-English, and the one that matters here: no hotwords. Great for fast everyday English dictation, wrong pick if custom words are the reason you came.
  • Local Whisper: slower than Parakeet on the same machine, but this is the path with the hotword list and custom-vocabulary control. The multilingual builds cover 99 languages and can translate to English; the English-only builds are English-only. Default English model is around 480 MB. If you need names and jargon transcribed right, this is the one.
  • Cloud (OpenAI, BYOK): best general accuracy and web access, using your own OpenAI key billed straight by OpenAI. Transcription runs on gpt-4o-mini-transcribe by default. It often gets rare words right on raw strength, but it doesn't expose a hotword list. Needs internet. The Cloud surface is part of Whisper Pro.

So the rule of thumb is simple. If custom words are your main problem and you want a list you control, use a local Whisper model. If you mostly speak common English and want raw speed, Parakeet is the better daily driver; just don't expect a hotword box. Cloud is the escape hatch when you want top accuracy on a hard recording and don't mind it leaving your machine. If you're deciding on the local setup overall, the guides on how to run Whisper locally and the Parakeet model breakdown cover both engines in depth.
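
The same rule of thumb fits in a small lookup table. This is a sketch built only from the breakdown above. The dictionary, the field names, and the function are hypothetical, and "needs_internet" refers to transcribing, not to the one-time model download.

```python
# Capabilities as described in this guide; the field names are illustrative.
PATHS = {
    "Local Parakeet": {"hotwords": False, "needs_internet": False},
    "Local Whisper": {"hotwords": True, "needs_internet": False},
    "Cloud": {"hotwords": False, "needs_internet": True},
}


def paths_for(need_hotwords=False, must_stay_offline=False):
    """List the paths that satisfy the stated requirements."""
    matches = []
    for name, caps in PATHS.items():
        if need_hotwords and not caps["hotwords"]:
            continue  # no hotword list on this path
        if must_stay_offline and caps["needs_internet"]:
            continue  # audio would leave the machine
        matches.append(name)
    return matches


print(paths_for(need_hotwords=True))      # ['Local Whisper']
print(paths_for(must_stay_offline=True))  # ['Local Parakeet', 'Local Whisper']
```

Asking for hotwords leaves exactly one option, which is why the path choice matters more than usual here.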

Fixing what slips through, after the fact

No custom-word setup catches everything, and raw dictation always lands as a bit of a run-on. You say "meet csaba about the helios rollout um tag it project alpha," and even with hotwords on, the punctuation and filler are still yours to clean. This is where the two halves of the job split: hotwords fix the spelling of hard words, and a cleanup pass fixes the shape of the sentence.

Windows Voice Typing adds punctuation as you speak, and macOS Dictation handles basic punctuation when you say "comma" or "period." For heavier cleanup — stripping the "ums," fixing run-ons, tidying a spoken paragraph into something you'd actually send — Whisper can run an AI pass. Say the activation phrase "Hey whisper" and the text gets enhanced before it lands. On a local model that runs through Ollama; in cloud mode it's gpt-5-mini by default. The cleanup pass leaves your custom-word spellings intact while fixing everything around them.

Raw

meet csaba about the helios rollout um tag it project alpha before the standup thursday

Cleaned

Meet Csaba about the Helios rollout, tag it Project Alpha, before the standup Thursday.
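
For a feel of how the two halves differ, here's a rule-based stand-in written in Python. It is not the AI pass the app runs. The filler list and function name are assumptions, and it only handles single-word terms.

```python
FILLERS = {"um", "uh", "er"}  # hypothetical filler list


def tidy(raw, custom_terms):
    """Drop filler words and restore the saved spelling of custom terms."""
    saved = {term.lower(): term for term in custom_terms}
    kept = []
    for word in raw.split():
        if word.lower() in FILLERS:
            continue  # strip the filler
        kept.append(saved.get(word.lower(), word))  # keep the saved spelling
    if not kept:
        return ""
    text = " ".join(kept)
    return text[0].upper() + text[1:] + "."


raw = ("meet csaba about the helios rollout um tag it project alpha "
       "before the standup thursday")
print(tidy(raw, ["Csaba", "Helios"]))
# Meet Csaba about the Helios rollout tag it project alpha before the standup thursday.
```

The saved spellings come through and the "um" is gone, but the commas and the capitals on "Project Alpha" and "Thursday" are still missing. That gap is the part that needs a real cleanup pass rather than a word list.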

For the words that still slip past both the hotword list and the cleanup pass, the old standby applies: fix it once, and on Windows add it to the Speech Dictionary so it's not a problem the next time. There's no shame in a manual correction now and then. The goal isn't a tool that's never wrong; it's a tool that's wrong about the same five words once instead of forty times. Custom words get you most of the way; a quick edit covers the tail.

That same speak-then-clean rhythm is worth getting comfortable with everywhere, because once it clicks you can dictate cleanly on Windows into any app you open, not just the one you set out to fix.

When the built-in is enough

Sometimes you don't need a dedicated tool at all, and pretending otherwise would be dishonest. If your custom-word problem is small — a couple of names, on Windows, that you can add once and forget — the Windows Speech Recognition Speech Dictionary already does exactly this for free. Add the words, move on. Installing anything extra for that is overkill.

On Mac the picture is honestly more mixed, and worth being straight about. Standard macOS Dictation has no custom-word list, so if that's all you use, your built-in options for adding terms are genuinely limited. Voice Control's Vocabulary panel does the job and holds up to 1000 terms, but it's an accessibility feature you'd be turning on specifically for this — fine if you're comfortable there, a detour if you're not. So on Mac the trade is real: live with Dictation's misses, learn Voice Control, or run a tool with its own hotword list.

Reach for a dedicated, system-wide tool when the built-ins start hurting: a long list of names and jargon, the same custom words needed on both Windows and Mac, offline privacy, or wanting one hotkey and one vocabulary that behave the same in every app. Below that bar, use what's free. I'm not going to tell you to install software to teach your computer one surname.

The same trade-off shows up if your dictation lives mostly on a Mac — the built-in limits and the honest workarounds in voice to text on Mac are the fuller version of this section.

Adding custom words is the least glamorous dictation feature and the one that decides whether you keep using it. Get the five words it keeps fumbling onto a list — the Speech Dictionary on Windows, a hotword list in Whisper — and the daily friction quietly disappears. I added my own surname to a hotword list two years ago and haven't watched a transcriber butcher it since, which is a low bar and exactly the kind of bar I want cleared before breakfast.

Teach it the words it keeps missing

Add your names, jargon, and brands to a local Whisper model's hotword list, then dictate. The terms it used to mangle land spelled the way you saved them — in every app you open.

Free local mode for any signed-in account. No card required to start.

Denys Medvediev

I'm the one who reads our support email, most probably by dictating the replies.