Tutorial
Voice to text in Airtable
Airtable has no built-in dictation in the desktop app, the browser, or the mobile app. The fix is a system-wide tool: press a hotkey, speak, and the transcript pastes at your cursor in any Airtable cell, long-text field, or comment. Your OS dictation works too, for short captures.
Last updated: June 2026

Voice to text in Airtable works through a system-wide tool, not Airtable itself. Airtable has no built-in dictation in the browser, the desktop app, or mobile. The fix is a tool like Whisper: press a hotkey, speak, and the transcript pastes at the cursor in any cell, long-text field, or comment. The operating system's dictation works too, for short entries.
I run a base that tracks every release of the app — a row per version, a long-text field for the changelog, a comment thread where I argue with myself about scope. For the longest time I typed every word of it. Then I went looking for a microphone button on the long-text field, because surely a tool this good at structuring data lets you talk into it. There is no microphone button. After a fair bit of digging, I'm confident Airtable isn't hiding one.
People search for "voice to text in Airtable," find nothing in the app, and assume they missed a toggle. They didn't. The toggle was never built. The good news is the fix takes about two minutes, runs fully offline if you want it to, and works in every other app you open as a bonus.
Here's the thing most pages dancing around this keyword won't say plainly. An Airtable cell is just a text box, the same as Gmail or a search bar. A long-text field and a record comment are bigger text boxes. Dictation that pastes at your cursor doesn't care which box the cursor is in.
So the real question isn't "how do I turn on voice typing in Airtable." There's no switch. The question is "which dictation tool do I run on top of Airtable," and the answer depends on whether you want free-and-built-in, browser-only, or one offline hotkey that behaves the same everywhere. I'll walk through all of it, set one up in two minutes, and tell you when to skip the dedicated route.
Does Airtable have built-in dictation?

No. Airtable has no built-in speech-to-text, dictation, or voice-typing feature for entering text into a cell, a long-text field, or a comment by voice. Not in the browser, not in the desktop app, not on mobile. There is no microphone button on a field, no voice command, no hidden preference. If you've been combing Settings for it, you can stop. It isn't there.
What does exist is audio transcription, and this is where people get turned around. You can upload an audio file to an attachment field and run it through Airtable AI or an automation to get a transcript back. That's useful, but it's not live dictation. You can't put your cursor in a cell, talk, and watch the words appear. Those workflows process a recorded file after the fact; they don't type for you while you think. Conflating the two costs an afternoon wiring up an automation that solves a different problem, and I'd rather you skip that afternoon.
Even the third-party tools built around this admit it plainly. The browser extensions and dictation apps that rank for "Airtable voice typing" open by saying Airtable has no native voice input, then offer to bolt one on from outside. They're right about the diagnosis. On the desktop app and the browser grid most people actually live in, you need a tool that sits on top of Airtable. There are three honest categories, and the rest of this guide covers them.
Press a hotkey, talk, text lands in the cell
This is the whole mechanic, and it's boring in the best way. You press a hotkey, you speak, you release, and the transcript pastes at your cursor, in whatever text field has focus. Whisper holds a short tail after you let go of the key, so your last word doesn't get clipped. Because it pastes at the OS cursor, an Airtable cell is just "any text box." Desktop app or the browser version, same behaviour.
That's the part the landing pages overcomplicate. There's no Airtable integration to authorise, no API key to paste, no automation to babysit. Your cursor is in a cell, you talk, the words appear in the cell. A small capsule shows up while you speak so you know it's listening.
The hotkey is the one thing worth getting right up front. On Windows it's Ctrl+Space; on Mac it's Command+Option, a modifier-only push-to-talk you hold while speaking. Both are changeable in Settings if they clash with something you already use. (My younger daughter once told me a hotkey "didn't work" in her drawing app. It was a conflict, not a bug, which is how I learned the average person has no idea what a hotkey conflict even is. So now every hotkey is customisable.) If you've ever set up dictation on Windows or on Mac, this is the same muscle memory pointed at a different app.
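If it helps to see the press, speak, release cycle as logic, here is a toy Python model of it. This is a sketch, not how Whisper is implemented: the `PushToTalk` class, its method names, and the 0.3-second tail are all assumptions made up for illustration. The only thing taken from the description above is that capture keeps going for a short tail after you let go of the key. Timestamped words stand in for real audio.

```python
from dataclasses import dataclass, field
from typing import List, Optional

TAIL_SECONDS = 0.3  # assumed value; the article only says "a short tail"


@dataclass
class PushToTalk:
    """Toy model of hold-to-talk capture with a short tail after release."""
    tail: float = TAIL_SECONDS
    pressed_at: Optional[float] = None
    released_at: Optional[float] = None
    words: List[str] = field(default_factory=list)

    def press(self, now: float) -> None:
        # Start a fresh capture when the hotkey goes down.
        self.pressed_at, self.released_at = now, None
        self.words.clear()

    def release(self, now: float) -> None:
        # Remember when the key came up; capture continues for `tail` seconds.
        self.released_at = now

    def is_capturing(self, now: float) -> bool:
        if self.pressed_at is None or now < self.pressed_at:
            return False
        if self.released_at is None:
            return True
        return now <= self.released_at + self.tail

    def hear(self, now: float, word: str) -> None:
        # Keep a word only if it arrives while capture is active.
        if self.is_capturing(now):
            self.words.append(word)

    def transcript(self) -> str:
        # This string is what would be pasted at the cursor.
        return " ".join(self.words)


ptt = PushToTalk(tail=0.3)
ptt.press(0.0)
ptt.hear(0.2, "budget")
ptt.hear(0.9, "is")
ptt.release(1.0)
ptt.hear(1.2, "over")     # inside the tail, so it is kept
ptt.hear(2.0, "ignored")  # after the tail, so it is dropped
print(ptt.transcript())   # budget is over
```

The tail is the interesting design choice: without it, a word that finishes a fraction of a second after the key comes up would be lost.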
Set it up in two minutes (Windows or Mac)
You need a Mac on Apple Silicon or a Windows 10-or-newer PC, a working microphone, and Airtable open in either the desktop app or the browser. The whole local pipeline is free for any signed-in account, with no payment method asked for at sign-up. Here's the sequence.
Step 1 — Install Whisper and sign in.
Download from the download page, install, and create a free account. No card. The whole local transcription pipeline opens right away.
You'll know it worked when the app's tray icon appears and the setup wizard offers to pick a model.
Step 2 — Pick a transcription path.
The app doesn't choose for you. You get three: Cloud (OpenAI, bring your own key), Local Parakeet, or Local Whisper. For everyday base entries, start local — more on that two sections down.
You'll know it worked when a model finishes downloading and shows as ready.
Step 3 — Confirm your hotkey.
Windows defaults to Ctrl+Space, Mac to Command+Option held as push-to-talk. On Mac, grant the Accessibility permission when prompted; without it, the paste-at-cursor can't reach other apps.
You'll know it worked when a test recording pastes into any text field.
Step 4 — Put your cursor in an Airtable field and talk.
Open your base, click into a cell, long-text field, or comment box, hold the hotkey, say a sentence, release. The transcript appears where the cursor is.
You'll know it worked when your spoken sentence is sitting in the Airtable field as text.
The slow part is the model download, not the setup. Everything else is the four steps above. Once it's running, filling a long-text field stops being a typing task and starts being a talking task.
Short cells, long-text fields, and comments
Airtable isn't one kind of text box, it's three, and dictation handles all of them the same way because the cursor is the cursor. A single-line cell takes a quick spoken value — a name, a status, a title. A long-text field takes a paragraph, which is exactly where voice earns its keep, because a paragraph is where typing slows you down. A record comment takes a sentence you'd otherwise tap out one-handed while reading the row. Same hotkey, three different boxes.
Most pages ranking for this keyword point you at a browser extension — Voice In, Voicy, and the like — that adds dictation to any text field on a web page, Airtable included. Extensions are a fine answer if you live inside a browser tab. They have one structural limit: they only work where the browser reaches. The Airtable desktop app is not a browser tab, so a Chrome extension can't see it. A system-wide hotkey can, because it pastes at the OS cursor regardless of which window owns it.
That's the real split. An extension is browser-scoped; a hotkey is everything-scoped. The same key that fills an Airtable long-text field also fills your Gmail compose box, a Slack message, and a commit message. One tool, every text field, on both Windows and Mac. If you only ever touch Airtable in a Chrome tab, an extension is enough, and several are free. The moment you open the desktop app, or want the same flow across every program, the system-wide route wins. I'd reach for the one hotkey because I switch apps roughly forty times an hour and don't want forty different dictation buttons to remember.
Local or cloud: which mode for your base
For Airtable, try local mode first. A lot of what goes into a base is the kind of thing you'd rather not route through a vendor's logs — client notes, a pricing column, an internal roadmap, a comment about a teammate's idea. If your Mac is Apple Silicon or your PC is from the last few years, local handles everyday dictation without complaint, and cloud becomes the escape hatch rather than the default.
Here's how the three paths differ, because the app makes you pick and I'd rather you pick well:
- Local Parakeet — NVIDIA's TDT engine, around 600 MB, and the fastest local option — 5 to 10 times faster than Whisper on CPU. Covers English plus 24 other European languages, 25 in total. No translate-to-English. If you fill your base in English or another European language, this is the quick, fully offline pick.
- Local Whisper — slower than Parakeet on the same machine, but the multilingual builds cover 99 languages and can translate to English. The English-only builds are English-only, not 99. Pick this for Chinese, Japanese, Korean, or any translation work, which Parakeet can't do. The default English model is around 480 MB.
- Cloud (OpenAI, BYOK) — best accuracy and web access, using your own OpenAI key billed straight by OpenAI. Transcription runs on gpt-4o-mini-transcribe by default. Needs internet, so it's the one path that leaves your machine. The Cloud surface is part of Whisper Pro.
The boring truth is that for the kind of text most people put in a base, local is plenty. Both local engines run fully on your machine with nothing sent to a server. Cloud earns its place when you want top-tier accuracy on a hard recording or you need the model to pull a fact off the web mid-sentence. For daily data entry, start local and only reach for cloud when local leaves you wanting.
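The choice between the three paths can be written down as a small decision function. This is a sketch of the reasoning in this section, not anything the app exposes: `pick_path`, its parameters, and the returned labels are hypothetical names. Because the list of Parakeet's 25 languages isn't given here, the caller passes in whether their language is covered.

```python
def pick_path(parakeet_covers_language: bool,
              needs_translation: bool = False,
              needs_web_lookup: bool = False,
              internet_ok: bool = False) -> str:
    """Return a suggested transcription path for one kind of base entry."""
    if needs_web_lookup:
        # Only the cloud path reaches the web, and it needs a connection.
        if not internet_ok:
            raise ValueError("web lookup needs the cloud path, which needs internet")
        return "cloud"
    if needs_translation or not parakeet_covers_language:
        # Parakeet has no translate-to-English and a narrower language list.
        return "local-whisper"
    # Fastest local option when the language is covered.
    return "local-parakeet"


print(pick_path(parakeet_covers_language=True))                           # local-parakeet
print(pick_path(parakeet_covers_language=False))                          # local-whisper
print(pick_path(parakeet_covers_language=True, needs_translation=True))   # local-whisper
```

The function defaults to a local path and only returns "cloud" when the caller asks for something local can't do, which mirrors the start-local advice above.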
Punctuation, cleanup, and Airtable structure by voice
Raw dictation comes out as a run-on. You say "okay so set the status to in review assign it to maria and note the budget is over by about twelve percent," and that's the unpunctuated wall any speech engine hands you. Cleaning it up is where the paths diverge.
Windows Voice Typing adds punctuation as you speak, and macOS Dictation handles basic punctuation when you say "comma" or "period." For heavier cleanup — stripping the "ums," fixing the run-ons, turning a spoken paragraph into something you'd actually keep in a long-text field — Whisper can run an AI pass. Say the activation phrase "Hey whisper" and the text gets enhanced before it lands. On a local model that runs through Ollama; in cloud mode it's gpt-5-mini by default.
Before: okay so set the status to in review assign it to maria and note the budget is over by about twelve percent um before the sprint ends
After: Okay, so set the status to In Review, assign it to Maria, and note the budget is over by about twelve percent before the sprint ends.
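To show why the cleanup pass is an AI job rather than a find-and-replace, here is a deliberately naive rule-based version in Python. It is not what Whisper runs; `tidy` and the filler list are assumptions for illustration. It strips a few filler words, capitalises the first letter, and adds a closing period, and that's all.

```python
import re

# A tiny, assumed filler list; real speech has many more.
FILLERS = re.compile(r"\b(?:um+|uh+|erm)\b,?\s*", re.IGNORECASE)


def tidy(raw: str) -> str:
    """Naive cleanup: drop fillers, fix spacing, capitalise, end with a period."""
    text = FILLERS.sub("", raw)
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


raw = ("okay so set the status to in review assign it to maria and note the "
       "budget is over by about twelve percent um before the sprint ends")
print(tidy(raw))
# Okay so set the status to in review assign it to maria and note the budget
# is over by about twelve percent before the sprint ends.
```

The "um" is gone, but the commas and the capitals on "In Review" and "Maria" are still missing. Those need a model that understands the sentence, which is the gap the enhancement pass fills.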
For Airtable's own structure — picking a value from a single-select, linking a record, setting a date field — the honest answer is that voice gets you the text and Airtable's own interface gets you the structure. Dictate the long-text field, then click the dropdown for the single-select or type the linked-record name the way you always do. No dictation tool conjures Airtable's field types into existence on command; anyone promising "say status in review and watch it pick the option" is selling you a demo, not a Tuesday. Get the words down fast by voice, shape the record with the controls you already know.
That same speak-then-clean flow pays off well beyond your base — you can also dictate clean prose into any app with the one hotkey, so a long comment becomes a few spoken sentences instead of a paragraph you type out.
When to skip a dictation tool for Airtable

Sometimes the right tool is the free one already on your machine, and pretending otherwise would be dishonest. If you only drop short values into a base — a status, a name, a two-word tag — your operating system covers it for nothing.
On Windows, press Windows key + H and the built-in Voice Typing bar opens wherever your cursor is, an Airtable cell included. It punctuates on its own and is fine for short bursts. The catch: it routes through Microsoft's servers and needs an internet connection, so it isn't an offline option, which matters when a column holds anything you'd rather keep private. On Mac, Dictation lets you speak anywhere you can type. You turn it on in System Settings under Keyboard, and on Apple Silicon general text can be processed on-device. For a quick single-line cell, either built-in is the sensible call.
Reach for a dedicated, system-wide tool when the built-ins start hurting: long-text fields, multilingual entries, offline privacy on Windows, or wanting one hotkey that behaves the same in Airtable, your email, and your editor. Below that bar, use what's free. I'm not going to tell you to install an app to dictate one status field.
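The rule of thumb in this section fits in a few lines of Python. Treat it as a sketch of my own reasoning, not a feature of any tool: `dictation_route` and its parameter names are hypothetical.

```python
def dictation_route(os_name: str,
                    long_text: bool = False,
                    multilingual: bool = False,
                    must_stay_offline: bool = False,
                    same_hotkey_everywhere: bool = False) -> str:
    """Suggest built-in OS dictation or a dedicated system-wide tool."""
    if long_text or multilingual or same_hotkey_everywhere:
        return "dedicated tool"
    # Windows Voice Typing needs an internet connection, so an
    # offline requirement rules it out on that platform.
    if must_stay_offline and os_name.lower() == "windows":
        return "dedicated tool"
    return "built-in dictation"


print(dictation_route("windows"))                          # built-in dictation
print(dictation_route("windows", must_stay_offline=True))  # dedicated tool
print(dictation_route("mac", long_text=True))              # dedicated tool
```

A short status cell on either platform falls through to the free built-in, which is the point: the dedicated tool has to earn its install.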
The same trade-off shows up if your work also lives in a tracker — the logic in dictating into Jira is identical, because both are field-and-comment tools where the cursor, not an integration, is the real connection.
Airtable never shipped a microphone button, and after writing this I'm fairly sure it never will. It doesn't need to, because the cursor is the integration. Talk into the cell, get text, shape the record with the controls you already know. I dictated most of this guide into a text box that wasn't Airtable, with a tool that doesn't care which box it is, then pasted the lot into the long-text field where I keep my drafts. That's the whole trick.
Try it in your next Airtable field
Hold the hotkey, talk, release. The transcript lands in whatever cell, long-text field, or comment your cursor is in — and in every other app too.
Free local mode for any signed-in account. No card required to start.



