Guide
Voice to text in the Substack editor
Substack's post editor has no built-in dictation. Its audio features narrate a finished post; they don't help you write one. The fix is a system-wide tool: press a hotkey, talk, and your words land at the cursor inside the editor.
Last updated: June 2026

Voice to text in the Substack editor works through a system-wide tool, not Substack itself. The Substack post editor has no dictation feature; its audio options only narrate finished text. A tool like Whisper fills that gap: press a hotkey, speak, and the transcript pastes at the cursor in the editor. An optional AI pass then cleans up the ramble.
I write a small newsletter, and most of my best paragraphs arrive while I'm walking the dog, not while I'm sitting at the keyboard. So I went looking for a way to talk a draft straight into the Substack editor. I found a lot of pages about Substack's voiceover feature. None of them were about what I actually wanted, which is the reverse.
Here is the confusion almost every search result trips over. Substack will happily turn your finished post into audio. It will not turn your audio into a post. Those are opposite directions, and the second one — dictating the draft — has no button anywhere in the editor. The fix lives outside Substack, takes about two minutes to set up, and works in every other app you write in too.
Here's the thing the voiceover articles bury. The Substack post editor is a browser rich-text box, the same kind as a Gmail compose window or a Google Doc. Dictation that pastes at your cursor doesn't care which box it's typing into.
So the real question isn't "how do I turn on dictation in Substack." There's no switch, and the audio menu you keep finding does the opposite job. The question is "which dictation tool do I run on top of the Substack editor," and the answer depends on whether you want free-and-built-in, or one offline hotkey that behaves the same everywhere. I'll walk through all of it, set one up in two minutes, and tell you when to skip the dedicated route.
Does the Substack editor have dictation?

No. The Substack post editor has no built-in speech-to-text, dictation, or voice-typing feature for writing your draft by voice. There is no microphone button that types for you, no voice command, no hidden preference. If you've been hunting through the editor toolbar for it, you can stop. It isn't there.
What is there — and what every search result keeps handing you instead — is the audio menu behind the headphones icon. That's voiceover and text-to-speech. Voiceover lets you record yourself reading a post you've already written, or upload an audio file, so subscribers can listen. Text-to-speech, available on some publications, has a synthetic voice read your finished post aloud. Both take text and produce audio. Dictation does the exact opposite: it takes your voice and produces text. People conflate the two because both involve a microphone and the word "voice," and that conflation costs an afternoon of searching. I'd rather you skip that afternoon.
The distinction matters because it tells you where to look. You will not find dictation inside Substack, no matter how long you stare at the audio panel, because it was never built there. The editor is a text box that expects typing. To get your voice into it, you need a tool that sits on top of the browser and feeds text to wherever the cursor is. There are two honest routes, and the rest of this guide covers both.
Press a hotkey, talk, the words appear in the editor
This is the whole mechanic, and it's boring in the best way. You press a hotkey, you speak, you release, and the transcript pastes at your cursor, in whatever text field has focus. Whisper keeps a short recording tail after you let go of the key, so your last word doesn't get clipped. Because it pastes at the OS cursor, the Substack editor is just "any text box." It's a web editor with no API to integrate and no plugin slot, and that doesn't matter, because dictation never touches Substack's code. It types where you're already typing.
That's the part the landing pages overcomplicate. There's nothing to install into Substack, no token to paste into your publication settings, no integration to approve. Your cursor is in the editor, you talk, the words appear in the editor. A small capsule shows up while you speak so you know it's listening.
The hotkey is the one thing worth getting right up front. On Windows it's Ctrl+Space; on Mac it's Command+Option, a modifier-only push-to-talk you hold while speaking. Both are changeable in Settings if they clash with something you already use. (My younger daughter once told me a hotkey "didn't work" in her drawing app. It was a conflict, not a bug, which is how I learned the average person has no idea what a hotkey conflict even is. So now every hotkey is customisable.) If you've ever set up dictation on Windows or on Mac, this is the same muscle memory pointed at your newsletter.
Set it up in two minutes (Windows or Mac)
You need a Mac on Apple Silicon or a Windows 10-or-newer PC, a working microphone, and Substack open in your browser with a draft post on screen. The whole local pipeline is free for any signed-in account, with no payment method asked for at sign-up. Here's the sequence.
Step 1 — Install Whisper and sign in.
Download from the download page, install, and create a free account. No card. The whole local transcription pipeline opens right away.
You'll know it worked when the app's tray icon appears and the setup wizard offers to pick a model.
Step 2 — Pick a transcription path.
The app doesn't choose for you. You get three: Cloud (OpenAI, bring your own key), Local Parakeet, or Local Whisper. For drafting posts on your own machine, start local — more on that two sections down.
You'll know it worked when a model finishes downloading and shows as ready.
Step 3 — Confirm your hotkey.
Windows defaults to Ctrl+Space, Mac to Command+Option held as push-to-talk. On Mac, grant the Accessibility permission when prompted; without it, the paste-at-cursor can't reach your browser.
You'll know it worked when a test recording pastes into any text field.
Step 4 — Put your cursor in the Substack editor and talk.
Open a draft, click into the body, hold the hotkey, say a sentence, release. The transcript appears where the cursor is, inside the editor.
You'll know it worked when your spoken sentence is sitting in the Substack draft as text.
The slow part is the model download, not the setup. Everything else is the four steps above. Once it's running, the act of getting a draft onto the page stops being a typing task and starts being a talking task — which, for a newsletter, is most of the job.
Why a browser editor needs a system-wide tool
The Substack editor runs in your browser, and that shapes what's even possible. Most apps you'd want to dictate into have a desktop version with deep hooks; a web rich-text editor has none of that. There's no plugin marketplace, no extension point Substack exposes for writing into a post. So the integration can't come from inside Substack. It has to come from a layer above the browser.
A system-wide hotkey is exactly that layer. It pastes at the OS cursor regardless of which window owns it, so the same key that fills your Substack draft also fills your Gmail compose box, a Slack message, and a commit message. One tool, every text field, on both Windows and Mac. You don't relearn anything when you move from drafting a post to answering a reader's email — it's the same press-talk-release everywhere.
This is also why a browser extension that only works in Substack would be the wrong shape for the problem. Writers don't live in one tab. You draft in Substack, you research in another window, you reply to comments in the app, you jot the next idea wherever it lands. A tool scoped to a single site solves a slice; a tool scoped to the cursor solves the whole thing. I switch apps roughly forty times an hour and don't want forty different dictation buttons to remember.
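To make the "scoped to the cursor" idea concrete, here is a toy model in Python. It is only a sketch: `TextField` and `paste_at_cursor` are hypothetical names invented for illustration, and a real tool would go through the operating system's paste mechanism rather than editing strings. The point it shows is that the paste step needs to know nothing about which app owns the box.

```python
from dataclasses import dataclass


@dataclass
class TextField:
    """Stand-in for whatever box has focus: a Substack draft, an email, a chat."""
    app: str
    text: str = ""
    cursor: int = 0


def paste_at_cursor(field: TextField, transcript: str) -> None:
    """Insert the transcript at the cursor and move the cursor past it."""
    field.text = field.text[:field.cursor] + transcript + field.text[field.cursor:]
    field.cursor += len(transcript)


# Two different "apps"; the paste function treats them identically.
opening = "Dear readers, "
substack = TextField(app="Substack draft", text=opening, cursor=len(opening))
gmail = TextField(app="Gmail compose")

for focused in (substack, gmail):
    paste_at_cursor(focused, "this week I stopped typing.")
    print(f"{focused.app}: {focused.text}")
```

Nothing in `paste_at_cursor` checks the `app` field, which is the whole argument of this section in one function.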
Local or cloud: which mode for drafting posts
For a newsletter draft, try local mode first. A half-finished post is your own raw thinking — opinions you haven't fully formed, a paragraph you might cut, the thing you're not sure you want to publish yet. It would be a strange choice to route every unpolished sentence through someone's cloud just to get it onto your own screen. If your Mac is Apple Silicon or your PC is from the last few years, local handles everyday dictation without complaint, and cloud becomes the escape hatch rather than the default.
Here's how the three paths differ, because the app makes you pick and I'd rather you pick well:
- Local Parakeet: NVIDIA's TDT engine, around 600 MB, and the fastest local option, 5 to 10 times faster than Whisper on CPU. Covers English plus 24 other European languages, 25 in total. No translate-to-English. If you write your newsletter in English or another European language, this is the quick, fully offline pick.
- Local Whisper: slower than Parakeet on the same machine, but the multilingual builds cover 99 languages and can translate to English. The English-only builds handle English alone. Pick this for Chinese, Japanese, Korean, or any translation work, none of which Parakeet can do. The default English model is around 480 MB.
- Cloud (OpenAI, BYOK): the best accuracy plus web access, using your own OpenAI key, billed directly by OpenAI. Transcription runs on gpt-4o-mini-transcribe by default. It needs internet, so it's the one path that leaves your machine. The Cloud surface is part of Whisper Pro.
The boring truth is that for the kind of prose most newsletter writers put on the page, local is plenty. Both local engines run fully on your machine with nothing sent to a server. Cloud earns its place when you want top-tier accuracy on a messy recording, or you want the model to pull a fact off the web while you're drafting. For a regular writing habit, start local and only reach for cloud when local leaves you wanting.
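If it helps to see the choice as a decision rule, the three paths above can be sketched in a few lines of Python. This is my own summary of the trade-offs in this section, not logic taken from the app; `DraftNeeds` and `pick_path` are hypothetical names, and the rule deliberately prefers staying offline when the two goals conflict.

```python
from dataclasses import dataclass


@dataclass
class DraftNeeds:
    european_language: bool = True      # English or another European language
    translate_to_english: bool = False  # Parakeet cannot translate
    must_stay_offline: bool = True      # nothing leaves the machine
    wants_web_access: bool = False      # e.g. pulling a fact off the web


def pick_path(needs: DraftNeeds) -> str:
    """Return one of the three path names used in this guide."""
    # Cloud is the only path with web access, and the only one needing internet.
    # If offline is required, that requirement wins and we stay local.
    if needs.wants_web_access and not needs.must_stay_offline:
        return "Cloud (OpenAI, BYOK)"
    # Translation and non-European languages need a multilingual Whisper build.
    if needs.translate_to_english or not needs.european_language:
        return "Local Whisper"
    # Otherwise the fastest local engine is enough.
    return "Local Parakeet"


print(pick_path(DraftNeeds()))
print(pick_path(DraftNeeds(european_language=False)))
print(pick_path(DraftNeeds(must_stay_offline=False, wants_web_access=True)))
```

The defaults describe the common newsletter case, English prose drafted offline, which is why the first call lands on Local Parakeet.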
Turn a spoken ramble into a clean paragraph
Raw dictation comes out as a run-on. You talk the way you think, in one long unpunctuated breath, and that's the wall of text any speech engine hands you. For a finished email it's annoying. For a post you're going to publish under your own name, it's a problem — nobody wants to ship a paragraph that reads like a transcript. Cleaning it up is where the real value of drafting by voice shows.
Windows Voice Typing adds punctuation as you speak, and macOS Dictation handles basic punctuation when you say "comma" or "period." For heavier cleanup — stripping the "ums," fixing the run-ons, turning a spoken ramble into a paragraph you'd actually publish — Whisper can run an AI pass. Say the activation phrase "Hey whisper" and the text gets enhanced before it lands. On a local model that runs through Ollama; in cloud mode it's gpt-5-mini by default. The before-and-after is the whole pitch:
Before: okay so the thing i wanted to say this week is that most productivity advice is just typing advice in disguise um like you don't need a better app you need to stop typing so much
After: The thing I wanted to say this week is that most productivity advice is just typing advice in disguise. You don't need a better app. You need to stop typing so much.
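For the curious, here is roughly what a cleanup request to a local Ollama server could look like in Python. Treat it as a sketch under assumptions: the prompt wording and the model name `llama3.2` are mine, not the app's, and `build_cleanup_payload` and `clean_up` are hypothetical helpers. The only real interface used is Ollama's `/api/generate` endpoint on its default local port.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3.2"  # assumption: swap in any model you have pulled locally


def build_cleanup_payload(raw: str, model: str = MODEL) -> dict:
    """Wrap raw dictation in a cleanup instruction (the prompt is illustrative)."""
    prompt = (
        "Rewrite this dictated text as clean prose. Remove filler words, "
        "add punctuation and capitalisation, and keep the meaning unchanged. "
        "Return only the rewritten text.\n\n" + raw
    )
    # stream=False asks Ollama for one JSON object instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}


def clean_up(raw: str) -> str:
    """Send the payload to a locally running Ollama server and return its reply."""
    data = json.dumps(build_cleanup_payload(raw)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]
```

Calling `clean_up(...)` only works with Ollama running and the model downloaded, and the wording of the result will vary from model to model, so don't expect the exact "after" text shown above.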
The honest limit is worth stating, because the demos won't. Voice gets you the words; it does not get you the formatting. Substack's headers, bold, block quotes, links, and that little dividing line all come from the editor's own toolbar and shortcuts. Dictate the sentence, then reach for the toolbar to make the H2 or drop the link the way you always do. No dictation tool conjures a Substack pull-quote into existence on command. Get the prose down fast by voice, then shape the post with the editor you already know.
That same speak-then-clean flow pays off well beyond your newsletter — you can also dictate clean prose into any app with the one hotkey, so a long block becomes a few spoken sentences instead of a paragraph you grind out by hand.
When to skip a dictation tool for Substack

Sometimes the right tool is the free one already on your machine, and pretending otherwise would be dishonest. If you only ever drop a short line into the editor — a one-sentence note, a quick reply in the comments — your operating system covers it for nothing.
On Windows, press Windows key + H and the built-in Voice Typing bar opens wherever your cursor is, the Substack editor included. It punctuates on its own and is fine for short bursts. The catch: it routes through Microsoft's servers and needs an internet connection, so it isn't an offline option. On Mac, Dictation lets you speak to enter text anywhere you can type, set up in System Settings under Keyboard, and on Apple Silicon general text can be processed on-device. For a quick sentence into a draft, either one is genuinely fine, and I'm not going to talk you out of free.
Reach for a dedicated, system-wide tool when the built-ins start hurting: full-length posts, the AI cleanup pass that turns a ramble into publishable prose, multilingual writing, offline drafting, or wanting one hotkey that behaves the same in Substack, your email, and your notes app. A newsletter is the long-form case, so most writers cross that line fast — but below it, use what's free. I'm not going to tell you to install an app for a one-line comment reply.
The same trade-off shows up wherever you write — the logic in dictating into Gmail is identical, because both are browser text boxes where the cursor, not a plugin, is the real integration.
Substack will read your finished post aloud, and it does that well. It just won't help you write the thing in the first place, and after writing this I'm fairly sure it never set out to. That's fine, because the cursor is the integration. Talk into the editor, get text, clean it with one AI pass, shape the post with the toolbar you already know. I drafted most of this guide by voice into a box that wasn't Substack, with a tool that doesn't care which box it is, then pasted the lot into a draft. That's the whole trick.
Try it in your next Substack draft
Hold the hotkey, talk, release. The transcript lands in the editor where your cursor is — then one AI pass turns the ramble into a paragraph you'd publish.
Free local mode for any signed-in account. No card required to start.



