Voice to text in Joplin
Joplin's desktop app has no built-in dictation — its voice typing is an Android-only feature. The fix is a system-wide tool: press a hotkey, speak, and the transcript pastes at your cursor in any Joplin note. Your OS dictation works too, for short captures.
Last updated: June 2026

Voice to text in Joplin on desktop works through a system-wide tool, not Joplin itself. Joplin's built-in voice typing is an Android-only feature; the desktop app has none. The fix is a tool like Whisper: press a hotkey, speak, and the transcript pastes at the cursor in any Joplin note. The operating system's dictation works too, for short notes.
I keep my notes in Joplin because I trust a folder of plain markdown that syncs to storage I control more than I trust someone else's cloud. The one thing I kept reaching for was a way to talk into a note instead of typing it. So I went hunting for the setting on my laptop. There is no setting. The Joplin desktop app has no microphone button, and after a fair bit of digging I'm confident it isn't hiding one.
Here's the part that trips people up. Joplin does have built-in voice typing — it's just on Android. Its own developer docs say so plainly. People hear "Joplin has voice typing," go looking on their desktop, find nothing, and assume they missed a toggle. They didn't. The toggle is on the phone. The good news: the desktop fix takes about two minutes, runs fully offline if you want, and works in every other app you open as a bonus.
Here's the thing most pages dancing around this keyword won't say plainly. A Joplin note is just a markdown text box, the same as Gmail or a search bar. Dictation that pastes at your cursor doesn't care which app the cursor is in.
So the real question isn't "how do I turn on voice typing in Joplin on my laptop." There's no switch on the desktop. The question is "which dictation tool do I run on top of Joplin," and the answer depends on whether you want free-and-built-in, OS-level, or one offline hotkey that behaves the same everywhere. I'll walk through all of it, set one up in two minutes, and tell you when to skip the dedicated route.
Does Joplin have built-in dictation?

On desktop, no. The Joplin desktop app for Windows, Mac, and Linux has no built-in speech-to-text, dictation, or voice-typing feature for writing into a note by voice. There is no microphone button, no voice command, no hidden preference. If you've been combing Settings for it, you can stop. It isn't there.
What does exist — and this is where everyone gets turned around — is voice typing on Joplin's Android app. Joplin's own developer docs state it directly: the Android mobile application supports built-in, offline voice typing, by default through Whisper. The team has put real work into it, adding automatic punctuation and a custom glossary. It's a genuinely good feature. It just lives on the phone. Conflating "Joplin has voice typing" with "Joplin has voice typing on my laptop" costs an afternoon, and I'd rather you skip that afternoon.
So the mobile picture is settled: on Android you've got it built in, on iPhone you'd lean on the keyboard's microphone, and either way it's a phone feature. On the desktop note most people actually live in, you need a tool that sits on top of Joplin. There are a couple of honest routes, and the rest of this guide covers them.
Press a hotkey, talk, text lands in the note
This is the whole mechanic, and it's boring in the best way. You press a hotkey, you speak, you release, and the transcript pastes at your cursor, in whatever text field has focus. Whisper holds a short tail after you let go of the key, so your last word doesn't get clipped. Because it pastes at the OS cursor, a Joplin note is just "any text box." Markdown editor, rich-text editor, the search bar — same behaviour.
That's the part the landing pages overcomplicate. There's no plugin to install into Joplin, no API token to paste, no sync job to babysit. Your cursor is in a note, you talk, the words appear in the note. A small capsule shows up while you speak so you know it's listening.
The hotkey is the one thing worth getting right up front. On Windows it's Ctrl+Space; on Mac it's Command+Option, a modifier-only push-to-talk you hold while speaking. Both are changeable in Settings if they clash with something you already use. (My younger daughter once told me a hotkey "didn't work" in her drawing app. It was a conflict, not a bug, which is how I learned the average person has no idea what a hotkey conflict even is. So now every hotkey is customisable.) If you've ever set up dictation on Mac, this is the same muscle memory pointed at a different app.
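If the press, speak, release mechanic is easier to picture as logic, here is a toy Python model of it. It only simulates timing with plain numbers; it does not touch a microphone or a real hotkey. The class name, the method names, and the 0.3-second tail are all assumptions made for this sketch, since the guide only says the tail is "short".

```python
from dataclasses import dataclass, field
from typing import List, Optional

TAIL_SECONDS = 0.3  # assumed value for illustration; the real tail length is not stated


@dataclass
class PushToTalk:
    """Toy model: record while the key is held, plus a short tail after release."""

    tail: float = TAIL_SECONDS
    pressed_at: Optional[float] = None
    released_at: Optional[float] = None
    words: List[str] = field(default_factory=list)

    def press(self, t: float) -> None:
        # A new press starts a fresh capture.
        self.pressed_at, self.released_at = t, None
        self.words.clear()

    def release(self, t: float) -> None:
        self.released_at = t

    def is_recording(self, t: float) -> bool:
        if self.pressed_at is None or t < self.pressed_at:
            return False
        if self.released_at is None:
            return True  # key still held
        # Keep listening briefly after release so the last word is not clipped.
        return t <= self.released_at + self.tail

    def hear(self, t: float, word: str) -> None:
        if self.is_recording(t):
            self.words.append(word)

    def transcript(self) -> str:
        # In the real flow this string is what gets pasted at the cursor.
        return " ".join(self.words)


ptt = PushToTalk()
ptt.press(0.0)
ptt.hear(0.5, "buy")
ptt.release(1.0)
ptt.hear(1.2, "milk")   # inside the tail, so it is kept
ptt.hear(2.0, "later")  # after the tail, so it is dropped
print(ptt.transcript())  # buy milk
```

The tail is the only non-obvious part: without it, a word spoken just as you let go of the key would be lost.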
Set it up in two minutes (Windows or Mac)
You need a Mac on Apple Silicon or a Windows 10-or-newer PC, a working microphone, and Joplin open on your desktop. The whole local pipeline is free for any signed-in account, with no payment method asked for at sign-up. Here's the sequence.
Step 1 — Install Whisper and sign in.
Download from the download page, install, and create a free account. No card. The whole local transcription pipeline opens right away.
You'll know it worked when the app's tray icon appears and the setup wizard offers to pick a model.
Step 2 — Pick a transcription path.
The app doesn't choose for you. You get three: Cloud (OpenAI, bring your own key), Local Parakeet, or Local Whisper. For private notes you keep in plain markdown, start local — more on that two sections down.
You'll know it worked when a model finishes downloading and shows as ready.
Step 3 — Confirm your hotkey.
Windows defaults to Ctrl+Space, Mac to Command+Option held as push-to-talk. On Mac, grant the Accessibility permission when prompted; without it, the paste-at-cursor can't reach other apps.
You'll know it worked when a test recording pastes into any text field.
Step 4 — Put your cursor in a Joplin note and talk.
Open Joplin, click into the body of a note, hold the hotkey, say a sentence, release. The transcript appears where the cursor is, in the note.
You'll know it worked when your spoken sentence is sitting in the Joplin note as text.
The slow part is the model download, not the setup. Everything else is the four steps above. Once it's running, the act of capturing a thought into a note stops being a typing task and starts being a talking task.
Desktop vs. mobile: where Joplin's voice typing actually lives
This is worth pinning down because it's the source of nearly every "why can't I find it" question. Joplin's built-in voice typing is an Android feature. The docs are explicit: the Android app does offline voice typing through Whisper, with punctuation and a glossary. On the desktop app, that feature does not exist. Same product, same notes, two very different capabilities depending on which device you opened.
So if you mostly capture on your phone, you may not need anything extra — the Android voice typing is right there in the note editor. The gap is the laptop, where most longer writing happens and where Joplin gives you nothing. A system-wide hotkey closes that gap. It pastes at the OS cursor regardless of which window owns it, so the same key that fills a Joplin note also fills your Gmail compose box, a Slack message, and a commit message. One tool, every text field, on both Windows and Mac.
There's a tidiness to keeping the same flow across devices, too. On the phone you use Joplin's own voice typing; on the desktop you use the hotkey. Both put words into the same markdown note. You don't relearn anything when you switch machines, and the desktop tool doesn't care that it's Joplin specifically — which means it also covers every other program you write in. I'd reach for the one hotkey because I switch apps roughly forty times an hour and don't want forty different dictation buttons to remember.
Local or cloud: which mode for a private vault
For Joplin, try local mode first. The whole reason a lot of us chose Joplin is that it's local-first plain markdown that you sync to storage you control — a meeting recap, a half-formed idea, a journal entry you'd never want on someone's server. It would be a strange choice to keep your notes in a folder you own and then route your voice through a cloud to get there. If your Mac is Apple Silicon or your PC is from the last few years, local handles everyday dictation without complaint, and cloud becomes the escape hatch rather than the default.
Here's how the three paths differ, because the app makes you pick and I'd rather you pick well:
- Local Parakeet: NVIDIA's TDT engine, around 600 MB, and the fastest local option, 5 to 10 times faster than Whisper on CPU. Covers English plus 24 other European languages, 25 in total. No translate-to-English. If you write notes in English or another European language, this is the quick, fully offline pick.
- Local Whisper: slower than Parakeet on the same machine, but the multilingual builds cover 99 languages and can translate to English. The English-only builds handle English alone. Pick this for Chinese, Japanese, Korean, or any translation work, which Parakeet can't do. The default English model is around 480 MB.
- Cloud (OpenAI, BYOK): best accuracy and web access, using your own OpenAI key, billed directly by OpenAI. Transcription runs on gpt-4o-mini-transcribe by default. Needs internet, so it's the one path where your audio leaves your machine. The Cloud surface is part of Whisper Pro.
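If it helps to see that choice as a rule, here is a small Python sketch of the decision logic in the three bullets. The enum, the function, and its parameter names are hypothetical, invented for this illustration; they are not a setting or API in the app, and the rules are only the rough ones stated above.

```python
from enum import Enum


class TranscriptionPath(Enum):
    PARAKEET = "Local Parakeet"
    WHISPER = "Local Whisper"
    CLOUD = "Cloud (OpenAI, BYOK)"


def pick_path(
    *,
    language_in_parakeet_set: bool,
    translate_to_english: bool = False,
    need_web_access: bool = False,
    internet_available: bool = True,
) -> TranscriptionPath:
    """Hypothetical helper mirroring the guide's rules of thumb."""
    # Web access only exists on the cloud path, and cloud needs internet.
    if need_web_access:
        if not internet_available:
            raise ValueError("web access needs the cloud path, which needs internet")
        return TranscriptionPath.CLOUD
    # Parakeet: English plus 24 other European languages, no translation.
    if language_in_parakeet_set and not translate_to_english:
        return TranscriptionPath.PARAKEET
    # Anything else (Chinese, Japanese, Korean, or translation work) goes to Whisper.
    return TranscriptionPath.WHISPER


print(pick_path(language_in_parakeet_set=True).value)
print(pick_path(language_in_parakeet_set=False).value)
```

The sketch defaults to a local path on purpose: cloud is only chosen when something explicitly requires it.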
The boring truth is that for the kind of text most people put in Joplin, local is plenty. Both local engines run fully on your machine with nothing sent to a server, which matches the spirit of a notes app you chose precisely because the data stays yours. It also lines up with Joplin's own Android voice typing, which the team kept entirely offline for the same reason. Cloud earns its place when you want top-tier accuracy on a hard recording or you need the model to pull a fact off the web mid-sentence. For a daily-notes habit, start local and only reach for cloud when local leaves you wanting.
Punctuation, markdown, and cleanup by voice
Raw dictation comes out as a run-on. You say "okay so write up the architecture review note tag it project alpha and remind me Thursday," and that's the unpunctuated wall any speech engine hands you. Cleaning it up is where the paths diverge.
Windows Voice Typing adds punctuation as you speak, and macOS Dictation handles basic punctuation when you say "comma" or "period." For heavier cleanup — stripping the "ums," fixing the run-ons, turning a spoken paragraph into something you'd actually keep in a note — Whisper can run an AI pass. Say the activation phrase "Hey whisper" and the text gets enhanced before it lands. On a local model that runs through Ollama; in cloud mode it's gpt-5-mini by default.
Before: okay so write up the architecture review note tag it project alpha and remind me thursday um before the standup
After: Okay, so write up the architecture review note, tag it Project Alpha, and remind me Thursday before the standup.
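The app's cleanup pass is a language model, not a regex, so the following is only a crude rule-based stand-in to show what "stripping the ums" means mechanically. The filler list and the function name are assumptions for this sketch, and it deliberately does far less than the AI pass: it removes fillers, capitalises the first letter, and adds a closing period, but it cannot add commas or fix proper nouns.

```python
import re

# Assumed filler list for illustration; not the app's actual behaviour.
FILLERS = ("um", "uh", "er", "erm")

# Match a filler as a whole word, plus an optional comma and trailing spaces.
_FILLER_RE = re.compile(r"\b(?:%s)\b,?\s*" % "|".join(FILLERS), re.IGNORECASE)


def strip_fillers(raw: str) -> str:
    """Remove filler words, tidy spacing, capitalise, and end with a period."""
    text = _FILLER_RE.sub("", raw)
    text = re.sub(r"\s{2,}", " ", text).strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


raw = ("okay so write up the architecture review note tag it project alpha "
       "and remind me thursday um before the standup")
print(strip_fillers(raw))
```

Comparing its output with the cleaned line above shows the gap: the commas, "Project Alpha", and "Thursday" are the parts that need a model rather than a pattern.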
For Joplin's own markdown (headings, bullet lists, checkboxes, the [title](:/note-id) internal links), the honest answer is that voice gets you the text and Joplin's markdown shortcuts get you the structure. Dictate the sentence, then type the # for a heading, the - for a bullet, or - [ ] for a checkbox the way you always do. No dictation tool conjures markdown syntax into existence on command; anyone promising "say heading project alpha and watch it format" is selling you a demo, not a Tuesday. Get the words down fast by voice, shape the markdown with the keys you already know.
That same speak-then-clean flow pays off well beyond your notes — you can also dictate clean prose into any app with the one hotkey, so a long note becomes a few spoken sentences instead of a paragraph you type out.
When to skip a dictation tool for Joplin

Sometimes the right tool is the free one already on your machine, and pretending otherwise would be dishonest. If you only drop short captures into Joplin — a quick line, a two-word reminder — and you're on your phone, Joplin's own Android voice typing already covers it for nothing. On the desktop, your operating system does the same.
On Windows, press Windows key + H and the built-in Voice Typing bar opens wherever your cursor is, a Joplin note included. It punctuates on its own and is fine for short bursts. The catch: it routes through Microsoft's servers and needs an internet connection, so it isn't an offline option, which matters more than usual when the whole point of your notes is staying local. On Mac, Dictation lets you speak to enter text anywhere you can type, set up in System Settings under Keyboard, and on Apple Silicon general text can be processed on-device.
Reach for a dedicated, system-wide tool when the built-ins start hurting: long notes, multilingual work, offline privacy on Windows, or wanting one hotkey that behaves the same in Joplin, your email, and your editor. Below that bar, use what's free — the OS on desktop, Joplin's own voice typing on Android. I'm not going to tell you to install an app for a one-line reminder.
The same trade-off shows up if you also keep notes elsewhere — the logic in dictating into Obsidian is identical, because both are local-first markdown apps where the cursor, not a plugin, is the real integration.
Joplin shipped a microphone button — on Android, not on my laptop, and after writing this I doubt the desktop one is coming soon. It doesn't need to, because on the desktop the cursor is the integration. Talk into the note, get text, shape it with the markdown shortcuts you already know. I dictated most of this guide into a text box that wasn't Joplin, with a tool that doesn't care which box it is, then pasted the lot into my own note. That's the whole trick.
Try it in your next Joplin note
Hold the hotkey, talk, release. The transcript lands in whatever note your cursor is in — and in every other app too.
Free local mode for any signed-in account. No card required to start.



