Granola and Pocket, the AI meeting tools that run our agency

TLDR

Granola and Pocket aren't competitors. Granola handles the meetings we take from a desk, Pocket handles every conversation that happens off the laptop, and together they cover almost every conversation context an agency operator runs into.

  • Granola joins Zoom, Google Meet, and Microsoft Teams calls through your computer's audio. No bot in the room.
  • Pocket is a hardware microphone that captures phone calls, in-person meetings, and voice memos.
  • Both ship MCP servers, so Claude Code can search every transcript without us opening either app.

Disclosure. The Pocket links in this post are affiliate links. If you buy through them we earn a small commission at no extra cost to you. Granola links are not affiliated and we get nothing from those.

Most agency tooling feels like it was built for a different decade. Notetakers that don't actually take notes. Recording apps that ask for a second person to manage them. Transcription services that want a Zoom meeting and nothing else, even though half the conversations a working agency runs on happen on a phone or in a car.

That gap closed when we started running Granola and Pocket together. Granola handles the scheduled meetings we take from a desk. Pocket handles everything else. They cover different ground, the overlap is small, and the result is a working memory across every conversation we have at Market Correct, with full transcripts and AI summaries waiting in two apps by the time the laptop closes.

This isn't a feature list rewritten from a marketing site. It's how the two tools are actually used in a working agency day, the spots where each one earns its place, and the spots where each one doesn't.

Why we run both

The simplest framing is that an agency day has two kinds of conversations, and neither tool covers both. There are calendar conversations: the Zoom call at 10, the Google Meet at 2, the Microsoft Teams check-in on Thursday. Then there are everything-else conversations: the phone call from the car, the prospect coffee, the in-person creative review, the on-set conversation with a producer.

Granola is built around the first category. It listens to your computer's audio when a calendar event fires. The notes appear inside the app a moment after the meeting ends. Pocket is built around the second. The hardware microphone captures audio in the world around you, regardless of whether a calendar event exists or a video call is happening. Two tools, two contexts, almost no overlap.

If we only ran one, the working memory would be selective. If only Granola, every phone call and in-person meeting would live in nobody's notes. If only Pocket, every Zoom call would either need to be recorded another way or live in someone's head. Together, the gap closes.

Worth saying out loud, these tools are essentially doing the same job, just from two different angles. Both record. Both transcribe. Both produce AI summaries. The difference is mostly about how they capture audio and where they sit in the day. The other reason we keep both running, even when contexts overlap, is that two tools mean two slightly different reads on the same conversation. Different transcript styles. Different summary structures. Different model behavior on the AI write-up. When the conversation matters, having a second read on what was said is its own kind of insurance.

What Granola does

Granola is an AI notetaker that joins your scheduled meetings without joining your scheduled meetings. That's not a typo. The tool runs on your laptop and listens to the audio your computer is already producing during the call. Nothing shows up in the participant list. There's no bot icon staring at the host. The conversation happens, and the notes appear.

The product is sold and supported at granola.ai. It supports the meeting platforms most agencies actually use (Zoom, Google Meet, and Microsoft Teams) plus a handful of others. We've used it across all three.

Practically, the workflow looks like this. A meeting starts. Granola is open in the background. The conversation runs as it normally would. The moment the call ends, an AI-generated summary appears, structured according to the template you've set up. Action items go in one section, decisions in another, open questions in a third. The transcript sits behind the summary, searchable, ready to copy.

For an agency that runs Google Ads reviews, programmatic strategy calls, and paid social kickoffs, that's the difference between leaving a meeting with what's in your head and leaving a meeting with what was actually said. We've moved enough scope changes from a verbal aside in a Zoom to a written, time-stamped, transcript-backed record that the tool has paid for itself many times over just by preventing pointless arguments.

One thing worth being explicit about. Granola is not Mac-only. It runs on Mac, Windows, and iPhone today. The Windows desktop app shipped after the original Mac launch, so an agency-wide rollout doesn't depend on what laptop someone is on. The iPhone app is the more interesting one for our workflow, because it captures phone calls, which is one of the few places Granola overlaps with what Pocket is built for. The current platform list lives on granola.ai. Anyone calling Granola Mac-only is reading from a launch-era write-up that's been outdated for a while.

What Pocket does

Pocket is what Granola can't be, because it's a piece of hardware. The product page lives at heypocket.com. The device is a small wearable mic that captures audio from the world around you, then transcribes and summarizes it the same way Granola transcribes a Zoom call.

That hardware-first approach matters. A laptop microphone is bound by what the laptop can hear. The lid is closed in a car. The machine is in another room during a kitchen-table meeting with a contractor. The Zoom client isn't running when a prospect calls back unexpectedly. None of that matters to Pocket. The mic is on the body or on the table, and it captures whatever is being said wherever you happen to be.

The use cases that actually filled up our Pocket capture history in the first month were the ones a software-only tool can't reach. Phone calls with clients who wanted to talk through a contract live. A creative review with a director on a film set. A walk-and-talk with a freelancer about scope changes on a campaign. A long drive with two voice memos that turned into a brief for a new performance marketing campaign. None of those would have made it into a notes app on a laptop.

The transcription quality, in our experience, is best on clean one-on-one audio and degrades in noisy rooms, the same as any speech-to-text product. The summaries are usable as-is for fast review, and the underlying transcript is there when an AI-generated paragraph compresses something we want the exact wording on.

The detail about Pocket that doesn't get talked about enough is that you get to point it at the model you want. We have ours running on Claude Opus 4.7 right now. Every transcript, every summary, every mind map that comes out of a long conversation is being written by the model we picked, not whatever default a product team locked us into. That choice matters more than it sounds like it should. The summary structure is different. The way action items get extracted is different. The mind maps that surface from a sprawling strategy call are noticeably more useful with the model we'd pick on our own time. If you've ever wanted your meeting notes to read the way a specific AI writes, Pocket is the only one of the two tools that lets you make that call.

Worth a quick note on the hardware itself. The device is small, weighs around 52 grams, runs about four days on a charge, and either mounts to a phone with MagSafe or clips to a shirt. None of that is the point of the tool, but it's worth saying that nothing about the form factor gets in the way of using it daily.

The Hardware

If you want to try Pocket

Same device we run, same affiliate link our purchase went through. Order takes a couple of minutes.

Get Pocket

The real difference between the two

The cleanest way to describe the difference is this. Granola is a software listener for digital audio. Pocket is a hardware listener for analog audio. Everything else flows from that.

Granola hears the audio your computer is producing or playing. That's why the desktop app works on Zoom calls and not on a phone call that never touches the laptop (the iPhone app is the exception). Pocket hears whatever is happening in the room or on the line. That's why it works on phone calls and is a poor fit for a Zoom call on its own, where capturing the laptop's system audio is the better route anyway.

If we map that against an agency day, the split looks like this.

| Conversation type | Right tool | Why |
|---|---|---|
| Zoom client review | Granola | Calendar event, system audio, structured template fits the cadence |
| Google Meet kickoff | Granola | Same as above, joined automatically off the calendar |
| Microsoft Teams standup | Granola | System audio capture works the same as Zoom or Meet |
| Phone call with a client | Pocket | No video meeting, no system audio, hardware mic catches it |
| In-person prospect coffee | Pocket | Real-world audio, no laptop in the room |
| On-set creative review | Pocket | Hands free, multiple speakers, location-flexible |
| Walk-and-talk with a freelancer | Pocket | Wearable mic, no need to manage anything mid-conversation |
| Voice memo to yourself | Pocket | Push and talk, transcript and summary appear after |

The pattern is obvious once you stare at it for a minute. The split isn't about features. It's about whether the audio is digital or analog at the moment it's produced.
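
For anyone who thinks better in code, the routing rule behind the table can be sketched in a few lines of Python. This is an illustration only. The `Conversation` class, its `through_computer_audio` flag, and `pick_tool` are hypothetical names made up for the sketch, not part of either product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    """Hypothetical description of a conversation, for illustration only."""
    label: str
    through_computer_audio: bool  # True when the call plays through the laptop


def pick_tool(conv: Conversation) -> str:
    """Route by where the audio originates, mirroring the split in the table."""
    return "Granola" if conv.through_computer_audio else "Pocket"


# A few rows from the table, expressed as data.
day = [
    Conversation("Zoom client review", True),
    Conversation("Phone call with a client", False),
    Conversation("In-person prospect coffee", False),
]

for conv in day:
    print(f"{conv.label}: {pick_tool(conv)}")
```

The whole decision hangs on one boolean, which is the point: the split is about where the audio comes from, not about features.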

What a full workday looks like with both

The reason we keep recommending both tools to other agency operators is that the daily flow stops feeling like a flow and starts feeling like furniture. There's no decision to record. There's no app to open. The capture happens on its own, and we read or query the result later.

A typical day at Market Correct, told through the two tools.

Morning drive. Pocket is on. We take a phone call with a client who wants to talk through a budget shift before the standup. The recording is captured. By the time we're at the desk, a summary is sitting in the app. The phone call became notes without anyone writing notes.

Mid-morning standup. Granola joins automatically off the calendar. The team talks through the day's priorities, who's running which campaign, what's blocking the agency tech review. The call ends. The summary writes itself, broken into action items, decisions, and open questions, the way the template asks for it.

Lunch meeting in person. Pocket clips to a shirt or sits on the table. The conversation is captured passively. Whatever was decided over a sandwich about a creative direction is now retrievable, instead of being a thing one of us tries to remember an hour later.

Afternoon Google Ads review with a client. Granola joins. Numbers move on screen. Decisions get made on bid strategy, on negative keyword lists, on a new ad group. The summary is in the app moments after the call ends, before anyone has typed a single follow-up.

Late-day phone call with a contractor. Pocket again. The terms of the engagement get hashed out, including a piece of scope that wouldn't have made it onto a written record any other way.

End of day. Two apps, full of structured summaries from a day of conversations. Nothing was actively recorded, the way you'd actively record on a phone or actively run a transcription bot in a Zoom call. The capture was ambient, and the working memory is complete.

That ambient capture is the real product. Not the AI summary, not the transcript by itself, but the fact that nothing has to be remembered to be captured. The cognitive cost of running a service business drops noticeably when conversations stop disappearing.

How the MCP integrations change the workflow

This is the part most reviews miss, because most reviews of these tools come from people who haven't connected them to anything else. Both Granola and Pocket ship MCP servers. That changes how the captured data feels in practice.

MCP, the Model Context Protocol, is an open standard published by Anthropic. It lets AI tools like Claude Code connect to external data sources through small server programs. If a tool ships an MCP server, any compliant client can query it. Granola has one. Pocket has one. Claude Code reads both.

The practical difference is that the transcripts stop being something to scroll through and become something to query. We don't open the Granola app to find out what a client said in a meeting two weeks ago. We ask Claude Code. The answer comes back with the source line attached. The same is true for a phone call captured by Pocket. Ask, search, answer, with the transcript right there if we want to see the original.
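
To make "ask, search, answer" concrete, here is a minimal Python sketch of a search that keeps the source line attached to every hit. It is not how either MCP server works internally, and in practice Claude Code does the querying for us. The `Hit` class, the `search` function, and the sample transcripts are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str      # which tool captured the conversation
    transcript: str  # hypothetical transcript identifier
    line_no: int     # 1-based line number within the transcript
    text: str        # the matching line, verbatim


def search(transcripts: dict[tuple[str, str], list[str]], phrase: str) -> list[Hit]:
    """Case-insensitive substring search that keeps the source line with each hit."""
    needle = phrase.lower()
    hits = []
    for (source, name), lines in transcripts.items():
        for i, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append(Hit(source, name, i, line))
    return hits


# Made-up sample data standing in for what the two tools would hold.
sample = {
    ("granola", "client-review"): [
        "Client: Let's revisit the budget next month.",
        "Client: We're fine moving to a flat fee.",
    ],
    ("pocket", "follow-up-call"): [
        "Client: Just confirming the flat fee we discussed.",
    ],
}

for hit in search(sample, "flat fee"):
    print(f"[{hit.source}/{hit.transcript}:{hit.line_no}] {hit.text}")
```

Each result carries the tool, the transcript, and the line number, which is what lets us jump from an answer back to the original wording.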

A few examples of the kinds of queries that have become routine for us.

  • Find the call last week where the client agreed to a flat fee instead of a percentage of spend
  • Pull every mention of a specific creative direction across the last month of meetings
  • Surface the call where we discussed the timeline change on the programmatic launch
  • List every action item from any meeting tagged to a specific team member
  • Cross-reference what a client said in a Zoom call against what they said in a follow-up phone call captured by Pocket

That last one is the move that makes both tools feel like a single system. The Zoom transcript is in Granola. The phone follow-up is in Pocket. Claude Code reads both, and it doesn't care which one a piece of information came from. The query is about the conversation, not about the tool.
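
A rough Python sketch of that cross-reference, under stated assumptions: the `Statement` class, the `client_timeline` function, the client name, and the dates are all hypothetical, and the real records would come from the two MCP servers rather than from hard-coded lists.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    source: str  # "granola" or "pocket"
    client: str  # hypothetical client name
    day: date    # when the conversation happened
    text: str    # what was said, verbatim


def client_timeline(granola, pocket, client):
    """Merge both sources and return one client's statements in date order."""
    merged = [s for s in [*granola, *pocket] if s.client == client]
    return sorted(merged, key=lambda s: s.day)


# Made-up records standing in for transcripts pulled from each tool.
granola_records = [
    Statement("granola", "Acme", date(2025, 3, 3), "We can launch on the 15th."),
    Statement("granola", "Other Co", date(2025, 3, 4), "Budget is unchanged."),
]
pocket_records = [
    Statement("pocket", "Acme", date(2025, 3, 5), "Actually, push launch to the 22nd."),
]

for s in client_timeline(granola_records, pocket_records, "Acme"):
    print(f"{s.day} [{s.source}] {s.text}")
```

Once the records sit in one list, the tool they came from is just another field, and the change of position between the Zoom call and the phone follow-up is easy to spot.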

If you're running an agency on Claude Code already (we lean on it for everything from SEO content to ad copy reviews), this MCP plumbing is what makes the meeting tools feel like part of the same system rather than two more apps to manage.

Want to see how a working AI meeting stack fits into a real agency engagement?

Talk to us

Where each one falls short

Both tools have honest limitations. We've used both long enough to find them.

Granola limitations

  • If the meeting isn't running through a device with Granola on it, Granola is out. A hallway conversation, a meeting where the laptop didn't make it into the room, a spontaneous phone call taken without the iPhone app, all invisible.
  • The AI summary occasionally compresses a nuanced discussion into a clean bullet that loses the actual point. The transcript is the source of truth when something matters.
  • Multi-speaker tagging is good but not perfect, especially when several people on the same call have similar voices or there's background overlap.

Pocket limitations

  • It's a physical device. You have to remember to carry it. You have to remember to charge it. A great hardware tool you forgot at home is the same as no tool at all.
  • Noisy environments hurt transcription quality. Coffee shops with espresso machines, restaurants at peak hours, anywhere with strong ambient sound, all push accuracy down.
  • Multi-person in-person meetings with five or six speakers can blur in the transcript, since speaker separation in a real room is harder than speaker separation on a Zoom call.

The shared limitation

Both tools produce AI summaries. AI summaries are useful for fast review and dangerous for high-stakes recall. We treat the summary as a first read and the transcript as the actual record. Anyone using either tool as a substitute for paying attention is missing the point. The tools work because we are still fully present in the conversations. The capture is the safety net, not the meeting itself.

Pocket

The piece of the stack you can pick up today

Granola is a download. Pocket is the hardware part of the workflow, so the order has to come first.

Get Pocket

The bottom line

If most of your conversations happen on a laptop, start with Granola and see if that alone closes the gap. For some operators it will. If you spend any meaningful share of your week on phone calls, in-person meetings, or on the move, you're going to want Pocket too. The combined cost of both tools is small compared to the cost of losing one client conversation per quarter to bad memory.

The MCP integration is what graduates the stack from useful to compounding. Once Claude Code can search across both tools, every meeting becomes a queryable record, not a folder of notes nobody opens. That's where the real productivity comes from. The capture is table stakes. The retrieval is the actual upgrade.

For an agency that runs on conversations, running two tools that together cover nearly every conversation is a low-friction, high-return move. We use both daily. We're not going back to a world where half of our calls live only in someone's head.

Tools and Workflow

Want to see what an AI-native agency actually runs on?

We use Granola, Pocket, and Claude Code as the spine of how we run client work. If you want a paid program built and operated with the same discipline, talk to us.

Talk to us about your campaigns

FAQ

Questions about Granola, Pocket, and the AI meeting stack

What is Granola?

Granola is an AI notetaker that joins your scheduled meetings on Zoom, Google Meet, and Microsoft Teams by listening to your computer's audio. It transcribes the conversation and generates structured notes using your own templates. There is no bot in the call, no extra participant on the attendee list. It runs on your laptop and writes the summary the moment the meeting ends. Pricing and availability are listed on granola.ai.

What is Pocket?

Pocket is a wearable hardware microphone that records phone calls, in-person meetings, and walk-and-talk conversations. The device captures real-world audio, sends it for transcription, and produces an AI summary you can read or query later. Because it's a physical mic, it's not limited to whatever audio your laptop can hear. The product page is at heypocket.com.

What is the difference between Granola and Pocket?

Granola is a software listener. It captures the digital audio playing through your computer during a video call. Pocket is a hardware listener. It captures the analog audio of the world around you through a physical microphone. That single distinction explains everything else. Granola owns scheduled, calendar-based meetings on a laptop. Pocket owns phone calls, in-person meetings, and any conversation that doesn't run through a Zoom window.

Why run both instead of picking one?

Because the agency day isn't only Zoom calls. We take client calls on a phone in the car. We sit in coffee shop meetings with prospects. We review video shoots in person with creative leads. A laptop-only tool covers maybe half of an agency operator's working conversations. A hardware mic alone misses the value of automatic Zoom join, calendar sync, and structured templates that Granola is built around. Running both means the working memory is complete, not selective.

Is Granola Mac-only?

No. Granola runs on Mac, Windows, and iPhone today. The Windows desktop app shipped after the original Mac launch, and the iPhone app captures phone calls. Anyone telling you Granola is Mac-only is reading outdated information. Check granola.ai for the current platform list on the day you evaluate it.

Can you choose which AI model Pocket uses?

Yes. Pocket lets you point captures at the model you want, so the transcripts, summaries, and mind maps are written by the model you picked rather than a fixed default. We're running ours on Claude Opus 4.7 right now. The model choice changes the structure of the summary, the way action items get extracted, and the quality of the mind maps that surface from long strategy calls. Granola does not expose this kind of model selection to users. That's one of the meaningful differences between the two tools in day-to-day use.

Can Pocket replace a phone call recording app?

For most working purposes, yes. Pocket captures the phone call through its hardware mic, transcribes it, and summarizes it the same way it handles in-person meetings. That removes the need to install a third-party call recording app, manage app permissions on a phone, or move audio files around manually. The recording lives in the same place as every other Pocket capture, so it's searchable next to face-to-face meeting notes.

How accurate are the transcripts?

In our experience, both tools produce transcripts that are accurate enough to act on. Granola tends to do well on clean digital audio from Zoom or Meet because the source signal is clean. Pocket does well on quiet rooms and one-on-one phone calls. Neither is perfect on noisy environments, heavily accented multi-speaker audio, or low-volume side conversations. The AI summaries are generally more useful than the raw transcript for fast review, with the transcript serving as the source of truth when something matters.

Is it legal to record meetings and calls?

Recording laws vary by jurisdiction. Some states and countries are one-party consent, where the person recording can record without notifying others. Others are two-party or all-party consent, where everyone in the conversation must be informed. The Reporters Committee for Freedom of the Press maintains a state-by-state guide that's worth reading. Our practice at the agency is to disclose recording at the top of the meeting regardless of jurisdiction, because it's good practice and clients appreciate the transparency.

What is MCP and why does it matter here?

MCP, the Model Context Protocol, is an open standard that lets AI tools like Claude Code connect to external data sources through a small server program. Anthropic published the protocol so any compliant tool can plug into any compliant client. Both Granola and Pocket ship MCP servers. That means we can ask Claude Code questions like, find the call last week where the client agreed to a flat fee, and Claude can search across every transcript without us opening either app. The transcripts stop being a thing you scroll through and become a queryable knowledge base.

How do you connect Granola and Pocket to Claude Code?

Both products publish MCP servers that Claude Code can register through its standard MCP configuration. Once the server is registered, Claude Code has access to the conversation history that lives inside the tool. The official setup steps live on each tool's documentation, granola.ai for Granola and heypocket.com for Pocket. From a workflow standpoint, the moment the connection is live, every transcript and summary the tool has captured becomes searchable from inside Claude Code, the same way Claude can read a local file.
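
As a rough idea of what that registration can look like, the Python sketch below prints a config in the `mcpServers` shape that project-level `.mcp.json` files use. Treat it as a sketch under assumptions: the server names are our own labels, the URLs are placeholders rather than real endpoints, and whether each vendor exposes an HTTP endpoint or a local command is something to confirm in their documentation.

```python
import json

# Placeholder endpoints, NOT real URLs. Take the actual values (and the
# transport each vendor supports) from the official setup docs.
servers = {
    "granola": {"type": "http", "url": "https://example.com/granola-mcp"},
    "pocket": {"type": "http", "url": "https://example.com/pocket-mcp"},
}

# Assumed layout: a top-level "mcpServers" object keyed by server name.
config = {"mcpServers": servers}

config_text = json.dumps(config, indent=2)
print(config_text)
```

Saving that output as a config file is deliberately left out here, since the right location and values depend on your setup and on each tool's current instructions.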

Which tool is better for in-person meetings?

Pocket is built for in-person meetings. The hardware mic sits on a table or clips to a shirt and captures the room directly. Granola can in theory pick up an in-person meeting if you put a laptop in the middle of the table and let the system mic do the work, but it's not what the tool is built for. For a real in-person meeting with multiple voices, Pocket is the right pick. For a meeting that runs through a video call, Granola is the right pick.

What if someone doesn't want to be recorded?

Then the meeting isn't recorded. We disclose recording at the top of every call we run. If a participant objects, the recording stops. If we're attending an external meeting, we follow the host's policy. The Federal Trade Commission has guidance on transparency in business communications, and the practical version of that guidance is, if anyone in the room is uncomfortable being recorded, you don't record. Trust matters more than transcripts.

Do these tools work for agency client work?

They're built for it. Most agency client conversations are either video calls (Granola territory) or phone calls (Pocket territory). Having every conversation transcribed and summarized means we don't lose a decision in someone's notebook, don't have to ask a client to repeat what they said two weeks ago, and can pull the exact wording of a scope change when a question comes up later. For a service business that runs on conversations, the working memory is real. Read more about how we run a Google Ads agency day if you want the broader context.

What does a typical day look like with both tools?

A typical day looks something like this. Morning drive, Pocket runs while we make a couple of calls from the car. Mid-morning, Granola joins a scheduled Zoom standup and writes the summary to our template. Lunch, Pocket on a phone call with a prospect about a paid social engagement. Afternoon, Granola joins a Google Meet with a client to review a Google Ads account. End of day, Pocket on a casual in-person meeting with a contractor. By the time the laptop is closed, every conversation has a summary and a transcript without any manual work.

What are the limitations of running both?

The biggest limitation is hardware discipline. Pocket only captures conversations if you have it with you and charged. The second is cost, two paid tools instead of one. The third is that AI summaries are AI summaries. They miss nuance, occasionally invent structure that wasn't there, and they aren't a substitute for the moments when you should actually be paying attention. Used as a working memory layer, both tools earn their place. Used as a replacement for being present, neither one is enough.