Guide
AI meeting notes tools, honestly
The category covers two different products, and the marketing pages blur the line. One sends a bot into your call. The other never touches it. Here's how to tell which one you actually need.
Last updated: June 2026

An AI meeting notes tool records a conversation and turns it into a transcript, summary, and action items. Most are auto-join bots that enter a Zoom, Teams, or Meet call and write the notes for the whole room. A second kind is dictation: you speak the notes yourself, and nothing joins or records the call. Pick a whole-meeting note taker (Otter, Fireflies, Granola, Read AI, tl;dv) when you want the call attended and recapped for you; pick a dictation tool like Whisper when you want your own clean notes by voice without recording the room.
A one-hour meeting runs to roughly 9,000 spoken words, as a back-of-the-envelope rule of thumb. Nobody types those by hand, which is why a whole industry of tools now joins your calls uninvited and writes the notes for you. The boring truth: "AI meeting notes tool" covers two different products, and the marketing pages do their best to blur the line. One sends a bot into the room. The other never touches the call. Pick the wrong one and you either pay for a feature you do not want, or send a recording of your whole meeting somewhere you did not mean to.
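The 9,000-word figure is easy to reproduce. Here is a minimal sketch of the arithmetic, assuming a conversational pace of about 150 words per minute and a typing speed of about 40 words per minute. Both rates are round-number assumptions for illustration, not measurements.

```python
# Back-of-the-envelope check on the "roughly 9,000 words" figure.
# Both rates are assumed round numbers, not measured values.
SPEAKING_WPM = 150  # assumed conversational speaking pace
TYPING_WPM = 40     # assumed typing speed


def spoken_words(meeting_minutes: int, speaking_wpm: int = SPEAKING_WPM) -> int:
    """Words spoken in a meeting if someone is talking the whole time."""
    return meeting_minutes * speaking_wpm


def minutes_to_type(words: int, typing_wpm: int = TYPING_WPM) -> float:
    """Minutes needed to type that many words at a steady pace."""
    return words / typing_wpm


words = spoken_words(60)
print(words)                   # 9000
print(minutes_to_type(words))  # 225.0
```

Under those assumptions, typing out one hour of talk would take close to four hours, which is the gap these tools exist to close.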
Here is the verdict up front, since this is a comparison. Want a robot to attend your Zoom and hand everyone a recap afterward? The whole-meeting note takers (Otter, Fireflies, Granola, Read AI, tl;dv) are built for that, and several have free tiers. Want to capture your own clean notes by voice, during or right after the call, without a third party recording the room? That is a different tool, and it is where Whisper fits. This article sorts the category into those two kinds, names the bots worth knowing, and tells you which one I would reach for in each case. Most of the support email I read comes from people who bought the wrong kind on day one, so I have a stake in getting the distinction right.
I make Whisper, so let me be straight about its place before we go further. Whisper is not a meeting bot. It does not join your call. It is a dictation tool: you press a hotkey, speak, and the text lands wherever your cursor is. That makes it the wrong tool for "transcribe a four-person standup I am only half-attending," and the right tool for "write my own notes and action items by voice without uploading the whole meeting." Both jobs are real. Most articles pretend there is only one.
An AI meeting notes tool turns talk into notes you can act on

Strip away the marketing and every tool in this category does three things. It captures audio. It transcribes that audio into text. Then a language model compresses the text into a summary, a list of decisions, and a set of action items. The AI Overview Google shows for this search makes the same point in more words.
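The three steps above can be sketched as a small pipeline. This is an illustration under loud assumptions, not any vendor's implementation: `capture_audio`, `transcribe`, and `summarize` are hypothetical names, the first two return canned data, and simple prefix matching stands in for the language model.

```python
from dataclasses import dataclass, field


@dataclass
class MeetingNotes:
    summary: str
    decisions: list = field(default_factory=list)
    action_items: list = field(default_factory=list)


def capture_audio() -> bytes:
    """Step 1 (placeholder): a real tool records a call or a microphone."""
    return b"\x00" * 16  # dummy bytes standing in for audio


def transcribe(audio: bytes) -> str:
    """Step 2 (placeholder): a real tool runs a speech-to-text model here."""
    return (
        "We reviewed the launch plan.\n"
        "Decision: ship on Friday.\n"
        "Action: Dana updates the changelog.\n"
        "Action: Sam books the retro."
    )


def summarize(transcript: str) -> MeetingNotes:
    """Step 3 (placeholder): prefix matching stands in for a language model."""
    lines = [line.strip() for line in transcript.splitlines() if line.strip()]
    decisions, actions, other = [], [], []
    for line in lines:
        if line.startswith("Decision:"):
            decisions.append(line.split(":", 1)[1].strip())
        elif line.startswith("Action:"):
            actions.append(line.split(":", 1)[1].strip())
        else:
            other.append(line)
    # Use the first untagged line as a crude one-line summary.
    summary = other[0] if other else ""
    return MeetingNotes(summary=summary, decisions=decisions, action_items=actions)


notes = summarize(transcribe(capture_audio()))
print(notes.summary)       # We reviewed the launch plan.
print(notes.decisions)     # ['ship on Friday.']
print(notes.action_items)  # ['Dana updates the changelog.', 'Sam books the retro.']
```

The shape is the useful part: swap what sits behind step 1 and you get a different product, even when steps 2 and 3 look identical.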
What separates the tools is the capture step, and that is the part the homepages gloss over. An auto-join bot captures by sending a participant into your video call. You have seen it: the extra attendee named "Otter.ai" or "Fireflies Notetaker" sitting in the grid. A bot-free desktop tool captures the audio playing through your computer instead, so no extra guest appears in the room. A dictation tool like Whisper captures only what you say into the mic when you hold a hotkey. Same category on paper. Three different things in the room.
The reason this matters is consent and privacy, not features. When a bot joins, a third party records everyone on the call, often without a clear heads-up. When you dictate your own notes, the only voice captured is yours, and in local mode nothing leaves your machine. We will come back to that. First, how the bots work.
How the auto-join note takers work
The auto-join tools live in your calendar. You connect Google Calendar or Outlook, and a few minutes before each meeting starts, the tool dispatches a bot that asks to join the call as a participant. Fireflies, for example, can "autojoin your calendar meetings," or you can invite its bot to a live meeting on the fly. Read AI's pitch is that it "joins your meetings, records, and delivers a recap" across Zoom, Google Meet, and Microsoft Teams.
Once inside, the bot records the call, transcribes every speaker, and after the meeting writes a summary with action items that it emails around or drops into a workspace. Some of these tools then let you search across every past meeting and "ask" questions about what was said.
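The calendar-driven dispatch described above boils down to a time-window check. A minimal sketch, assuming a two-minute lead time and a hypothetical event shape (`title`, `start`, `end`, `has_call_link`); real products will differ in both, and none of these names come from a vendor's API.

```python
from datetime import datetime, timedelta

JOIN_LEAD = timedelta(minutes=2)  # assumption: "a few minutes before"


def bots_to_dispatch(events, now, lead=JOIN_LEAD):
    """Return titles of meetings whose join window is open and not yet over."""
    due = []
    for event in events:
        if not event["has_call_link"]:
            continue  # nothing to join without a Zoom/Meet/Teams link
        if event["start"] - lead <= now < event["end"]:
            due.append(event["title"])
    return due


# Hypothetical calendar for one morning.
events = [
    {"title": "Standup", "start": datetime(2026, 6, 1, 10, 0),
     "end": datetime(2026, 6, 1, 10, 15), "has_call_link": True},
    {"title": "Design review", "start": datetime(2026, 6, 1, 11, 0),
     "end": datetime(2026, 6, 1, 12, 0), "has_call_link": True},
    {"title": "Lunch", "start": datetime(2026, 6, 1, 12, 0),
     "end": datetime(2026, 6, 1, 13, 0), "has_call_link": False},
]

print(bots_to_dispatch(events, now=datetime(2026, 6, 1, 9, 58)))  # ['Standup']
```

Note what the check does not ask: whether the other attendees agreed to a recording. That question stays with whoever connected the calendar.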
A quieter second method has grown up over the last year: bot-free capture. Instead of a bot joining the call, a desktop app records the audio coming out of your computer's own speakers. Granola "transcribes your computer's audio directly, with no meeting bots joining your call." Fathom now offers both, bot or no bot, so you can "stay focused on the meeting" either way. tl;dv markets a "NO BOT REQUIRED" flow on its free plan. The notes still cover the whole meeting; the difference is whether a visible guest shows up in the grid. The recording still happens. It just starts on your laptop instead of with a bot in the call.
The two kinds of tool nobody tells you apart
Here is the distinction the category pages refuse to draw. There is the tool that listens to the room, and there is the tool that listens to you.
The room tools, bot or bot-free, capture the whole conversation, every speaker, on their own. You sit back and the notes appear. That earns its keep when you are in a meeting you cannot fully attend, or when the whole team needs a shared record. It is also a recording of other people, which carries consent and storage questions you now own.
The "you" tool captures only what you choose to say. You hold a hotkey, dictate the three decisions that mattered, and the cleaned-up text lands in your notes app or your email. Nobody else is recorded. Nothing of the meeting exists except the summary you spoke on purpose. The work is slower in the sense that you have to do the thinking, but the thinking is the point. A 600-word summary you dictated is worth more than a 9,000-word transcript nobody reads.
Most people searching "AI meeting notes tool" assume they want the room tool. My guess is that about half of them, once they think about it, want the "you" tool. They never knew it was a category. That is the entire reason this article exists.
How I picked the tools in this comparison
A quick note on method, because the honest version matters here. I did not run a lab. I have not sat seven of these bots side by side in the same Zoom call and timed them, and any article that claims it did, without showing the recording, is guessing. So I am not going to invent accuracy percentages or speed numbers for tools I do not build. What follows is built from two things: each tool's own documented capabilities (the claims on their pricing and product pages, cited inline), plus hands-on use of the one app I do build and run every day.
The qualities I weighed, in the order they tend to matter for this category:
- What it captures. The whole room (auto-join or bot-free) versus only your own voice. This is the fork everything else hangs off.
- Who gets recorded. A bot in the call records every participant; dictation records you alone. That decides the consent and privacy story.
- Where the audio goes. Cloud service versus on your own machine. Most bots are cloud-only; Whisper's local mode is the exception.
- Platform reach. Which call apps it joins (Zoom, Meet, Teams) and which operating systems run it.
- Language coverage. Primary-sourced from each vendor's page. Where a vendor states no number, I say so rather than guess.
- Cost shape. Free tier, per-seat subscription, or freemium. Real dollar figures live on each tool's own pricing page, Whisper's included; I am not quoting them mid-sentence here.
Those are selection criteria, not a verdict dressed up as one. With them stated, here is the category in one table.
The tools at a glance
Every cell below comes from each tool's own documented claims (cited inline in the next section) or, for Whisper, from how the app ships. No accuracy or speed numbers appear, because no vendor here publishes verified benchmarks and I will not invent them.
| Tool | Platforms it joins | Local or cloud | Works offline | Pricing shape | Languages (stated) | Best for |
|---|---|---|---|---|---|---|
| Otter.ai | Zoom, Teams, Meet | Cloud | No | Free tier + per-seat | 6 | Zoom-heavy teams in one of its languages |
| Fireflies.ai | Zoom, Meet, Teams, +more | Cloud | No | Free-forever + per-seat | 100+ | A searchable archive of every call |
| Granola | Zoom, Meet, Webex, Slack, Teams | Cloud (bot-free local capture) | No | Freemium | Not stated | Whole-meeting notes with no visible bot |
| Read AI | Zoom, Meet, Teams | Cloud | No | Free tier + paid | 20+ | Trying the idea on a free no-card tier |
| tl;dv | Meet, Zoom, Teams | Cloud (no-bot option) | No | Free-forever + paid | 30+ | The most generous free plan |
| Notion AI Meeting Notes | Not stated | Cloud | No | Bundled into Notion paid plans | Not stated | Teams already living in Notion |
| Whisper by Remskill | Joins nothing (you dictate) | Local or cloud (your choice) | Yes (local mode) | Free local tier + Pro for cloud | 99 (multilingual local) | Writing your own notes by voice, privately |
Read the table as a sorting tool, not a scoreboard. The first six rows are the room. The last row is you. Pick your row and the rest of this article tells you which name on it to reach for.
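To make "sorting tool" concrete, here is the table expressed as data with a small filter on top. This is a sketch, not a recommendation engine: the `kind` field and the `shortlist` function are my own labels for illustration, "N+" language counts are stored as their stated floor, and "Not stated" becomes `None`.

```python
# The comparison table as data. "100+" is stored as 100, "20+" as 20, and so on.
# languages=None means the vendor's page states no number.
TOOLS = [
    {"name": "Otter.ai", "kind": "room", "offline": False, "languages": 6},
    {"name": "Fireflies.ai", "kind": "room", "offline": False, "languages": 100},
    {"name": "Granola", "kind": "room", "offline": False, "languages": None},
    {"name": "Read AI", "kind": "room", "offline": False, "languages": 20},
    {"name": "tl;dv", "kind": "room", "offline": False, "languages": 30},
    {"name": "Notion AI Meeting Notes", "kind": "room", "offline": False, "languages": None},
    {"name": "Whisper by Remskill", "kind": "you", "offline": True, "languages": 99},
]


def shortlist(kind, min_languages=0, needs_offline=False):
    """Filter the table by the first fork (room vs. you), then by hard requirements."""
    names = []
    for tool in TOOLS:
        if tool["kind"] != kind:
            continue
        if needs_offline and not tool["offline"]:
            continue
        stated = tool["languages"]
        # An unstated language count cannot satisfy a minimum, so skip it.
        if min_languages and (stated is None or stated < min_languages):
            continue
        names.append(tool["name"])
    return names


print(shortlist("room", min_languages=30))   # ['Fireflies.ai', 'tl;dv']
print(shortlist("you", needs_offline=True))  # ['Whisper by Remskill']
```

The order of the checks mirrors the order of the criteria: decide room or you first, and only then let language coverage or offline use narrow the list.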
The meeting bots worth knowing
If the room tool is what you need, here are the five worth your time. All claims below come from each tool's own pages.
Otter.ai is the default name in this space. It joins Zoom, Microsoft Teams, and Google Meet to write and share notes, with live transcription and captions. Its free Basic plan gives you 300 transcription minutes a month. The catch worth knowing: its pricing page lists six languages, namely English, Spanish, French, German, Japanese, and Chinese. Outside those, look elsewhere. If you are weighing Otter against dictation, the Otter.ai alternative writeup goes deeper on the meeting-versus-writing split.
Fireflies.ai auto-joins calendar meetings on Zoom, Meet, Teams and more, and advertises transcription in 100+ languages. It has a free-forever tier with 800 minutes of storage per seat. It leans toward teams that want a searchable archive of every call.
Granola is the bot-free one. It records your computer's audio, no bot in the grid, and works alongside Zoom, Meet, Webex, Slack, and Teams with an iPhone app. If the visible-bot awkwardness is your objection, Granola removes it while still capturing the whole meeting.
Read AI ranked first in the search results I looked at. It joins, records, and delivers a recap across Zoom, Meet, and Teams, with apps on Windows, macOS, Android, iPhone, and Chrome, plus 20+ languages. Its free tier offers 5 meetings a month with no credit card.
tl;dv is the aggressive free option. Its Free Forever plan advertises unlimited recordings and transcriptions in 30+ languages with AI summaries, and integrates with Meet, Zoom, and Teams. If "free" is your only hard requirement, start here.
One more worth a mention: Notion AI Meeting Notes transcribes and summarizes inside Notion without a separate bot, bundled into Notion's Business and Enterprise plans. Worth knowing if you already live in Notion, though its page does not state which call platforms it joins or how many languages it covers.
Where Whisper fits: you dictate, it types, nothing joins the call
Whisper is the "you" tool. You press a hotkey (Ctrl+Space on Windows, or hold Command+Option on macOS as a push-to-talk chord), then speak, and your words paste in as text wherever the cursor is, in any app. No bot joins your call. No extra guest appears in the Zoom grid. The only audio captured is what you say into your own mic, on purpose.
That changes the workflow. Instead of a 9,000-word transcript of the whole meeting, you dictate the part that matters (the three decisions, the two owners, the one deadline) straight into your notes doc while the call is still fresh. In cloud mode the AI assistant can clean it up, summarize a paragraph, extract action items from what you dictated, or draft the follow-up email, pasted at the cursor. Say "Hey whisper" before your instruction to trigger the AI step.
You pick the engine:
- Local Whisper. Runs eight models on your machine and covers 99 languages on its multilingual variants; the English-only .en builds handle English alone.
- NVIDIA Parakeet. The fastest local option, 5–10× faster than Whisper on CPU, covering 25 languages (English plus 24 European), all on-device.
- Cloud mode. Uses your own OpenAI key: gpt-4o-mini-transcribe or gpt-4o-transcribe for transcription, gpt-5-mini for enhancement.
In local mode the whole thing runs offline after a one-time model download; nothing is sent anywhere during transcription. If running everything on your own machine is the part that matters to you, the offline speech-to-text guide walks through the local engines in more depth. Whisper ships on Windows and macOS (Apple Silicon); Linux is not supported.
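One way to reason about the engine choice, as a sketch rather than a description of how the app decides anything: stay local unless you opt in to cloud, and prefer the faster local engine when it covers your language. The `pick_engine` function, the returned labels, and the three-language sample set are all hypothetical; the text above gives the count of Parakeet languages, not the list.

```python
# Illustrative subset only. Parakeet is described above as covering 25 languages
# (English plus 24 European), but the list itself is not given here, so these
# three codes are an assumption for the example.
PARAKEET_SAMPLE_LANGUAGES = {"en", "de", "fr"}


def pick_engine(language, allow_cloud, parakeet_languages=PARAKEET_SAMPLE_LANGUAGES):
    """Hypothetical policy: stay on-device unless the user opts in to cloud."""
    if allow_cloud:
        # Cloud mode runs on the user's own OpenAI key.
        return "cloud: gpt-4o-mini-transcribe"
    if language in parakeet_languages:
        # Fastest local option when the language is covered.
        return "local: parakeet"
    # Broadest local coverage: the multilingual Whisper models.
    return "local: whisper-multilingual"


print(pick_engine("de", allow_cloud=False))  # local: parakeet
print(pick_engine("ja", allow_cloud=False))  # local: whisper-multilingual
print(pick_engine("en", allow_cloud=True))   # cloud: gpt-4o-mini-transcribe
```

The only branch that sends audio off the machine is the one you explicitly allow, which is the property the local mode is built around.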
The honest tradeoff: the room tools save you attention, and Whisper saves you a recording you did not need. If you mostly write things (emails, docs, notes after a call), the dictation workflow earns its place in your day far beyond meetings. That is the same case I make in the broader voice-to-text app writeup, and it is why Whisper is built around dictation first and meetings second.
Now the one opinion I hold strongly here, said plainly: cloud-only dictation is a privacy disaster waiting to be transcribed. A team I worked with once let a contractor build an internal "AI dictation" prototype that called a cloud model on every utterance and ran on every laptop. The manager opened the cost dashboard at quarter-end and found a five-figure bill, most of it from one team transcribing standup recordings four times over because the "smart retry" logic was too aggressive. The contractor said "we should optimize the prompt." The CFO said "or we should not pay to upload meetings that already have notes." The room got very quiet. Your boss's salary numbers, the email to your kid's school, the legal draft you are dictating: none of that needs to live in a vendor's logs because you wanted to type with your voice. In Whisper's local mode, your audio is processed on your computer and nothing is sent to any server, ever.
When a meeting bot beats Whisper
Now the honest part. If your actual job is "capture a four-person call I can barely attend, and email everyone the recap," Whisper is the wrong tool and I would not sell it to you. You want a room tool. Reach for Otter if you live in Zoom and speak one of its six languages, or tl;dv if you want unlimited recordings on a free plan in 30+ languages, or Granola if you want the whole meeting captured without a visible bot in the grid. Read AI's free 5-meetings-a-month tier is a fine way to test the idea with no card.
There is one more case worth naming: if the value you want is the bot joining and summarizing on its own while you focus elsewhere, a dedicated meeting-notes bot beats dictation outright, because auto-join plus auto-summarize is exactly the thing it does and the thing Whisper deliberately does not. Whisper does not join calls, does not transcribe other speakers, and does not produce a multi-speaker recap. It captures what you say. If you need the room, use the room tool. I would rather lose the sale than read the refund email.
Pricing
Whisper is free for everyone for the entire local pipeline: local transcription, AI enhancement through Ollama, history, presets, and a custom hotkey, with no payment method at signup. Whisper Pro adds the cloud surface: OpenAI cloud transcription, cloud AI enhancement, and voice web search. The full numbers live on the pricing page. Most of the bots pair a free tier with paid plans: Otter's free Basic gives 300 minutes a month, Fireflies is free-forever with 800 minutes of storage per seat, tl;dv's Free Forever plan advertises unlimited recordings, and Read AI's free tier is 5 meetings a month. If "free for personal use" is the bar, almost everything here clears it.
If you take one thing from all this: decide whether you need the room or just your own voice before you sign up for anything. The bots are good at being the room, so let them. But the next time you finish a call and reach for the keyboard to write the three things that mattered, try holding a hotkey and saying them instead. The notes get written in the time it takes the bot to email its recap, and the only person on the recording is you. My younger daughter worked that out faster than I did. She does not have any meetings yet.
Need the room, or just your own voice?
If it's your own notes you're after, download Whisper, hold the hotkey, and dictate the three things that mattered. The local pipeline is free, no card at signup.
Free local dictation for every signed-in user. Pro adds the cloud features on a separate trial.



