AI Tool Corner

Superwhisper

Productivity

Dictate polished text in any app with custom AI modes, offline voice models, file transcription, and privacy-focused controls.

AI dictation, voice to text, transcription, custom modes, offline mode

Overview

Superwhisper is an AI voice-to-text app for turning spoken thoughts into clean text across macOS, Windows, and iOS. It works in any app where you can type, supports custom modes for emails, notes, coding prompts, highlighted-text questions, and other workflows, and lets you choose from local, cloud, or custom AI models depending on your needs. It is especially useful for people who want more control and privacy than simple dictation tools provide, though the offline and local-LLM experience is currently stronger on Mac than on Windows.

Platforms

  • Windows
  • macOS
  • iOS

Video transcript

AI voice dictation has doubled my productivity, but most of these dictation apps require a constant internet connection and they don't give you much control over how your dictation is processed. Superwhisper is the most versatile of the AI dictation apps I've tried. You can set up custom modes to exactly control how your dictation is processed. You can choose from a ton of different AI models or provide your own model via an API key. And you can use it offline when you have no internet connection or for privacy reasons. Superwhisper is available for Windows, macOS, and iOS. And I will walk you through all its features to show you how to use it to its fullest potential. And what's more, I will even give you my custom modes that I use for Cursor coding, asking questions, formatting emails perfectly, and more. And if you're not using AI voice dictation, you're wasting a ton of time because it's so much faster than typing. My name is Florian Walther and this is the AI tool corner where I review the latest AI software to find out which ones can actually improve our lives and businesses. I will put a link to Superwhisper into the video description below. You can download it from there. They have a free tier so you can try this out without any risk. Superwhisper is available for Mac, Windows, and iPhone. We will take a look at the mobile app later. For now, go ahead and download this and then just follow the installation instructions. The app looks like this. We want to open the settings and you can always open this window here in the menu bar. Right click on the Superwhisper icon and on settings. We need some minimal configuration before we can start dictating. Here on the configuration tab, we have to set the keyboard shortcuts. When you press the toggle recording shortcut, you can start dictating and then you press it again to finish your dictation. I prefer push-to-talk, which is another shortcut that I have to hold down while I'm dictating. 
You can change the shortcut by clicking here and then pressing the combination on your keyboard that you want to use. One important caveat for Windows: you shouldn't use the Alt key alone because then you will get a bug with window focus. If you use the Alt key, add another modifier like Ctrl. For example, I use Ctrl, Alt, and the plus key on the numpad. With this shortcut, you can cancel the dictation and throw it away. And you can also set a mouse shortcut like the middle mouse button, but I don't use it. I use the keyboard. Up here, we can select the microphone that we want to use. I use this one here for recording, but in day-to-day life, I actually use this cheap $20 Bluetooth headset, and the quality is good enough. Doesn't have to be a high-quality mic. And then with this shortcut here, Ctrl Alt in my case, we can change the mode. We can create and configure these modes here on the mode tab by clicking on create mode. We will take a closer look at that later. For now, I just want to select the Super mode. This one here, the Super mode, is the most versatile because it automatically adapts to the context and app you're using. This should be your default choice in day-to-day life. Let's give it a try with a simple dictation. And I will on purpose add some mistakes, uhms, and stutters because the Super mode should clean them up automatically. We will later learn how we can customize this, but let's give it a try. So, I press the push-to-talk shortcut and keep it held down. Hey everyone. Um, my na name is Florian Walther and this is the uh AI Tool Corner YouTube channel. Okay, I spoke this like a total idiot, but the result is perfect. Grammar is perfect. It spelled my name correctly and the YouTube channel. It removed all the mistakes, uhms, and stutters. If anything is spelled wrong, we can fix this with the vocabulary, which we will take a look at in a moment. Our dictation is automatically inserted wherever our cursor is.
So this works in any app where you can put text. And if your cursor was in the wrong position and it didn't insert the text, you can always add it again with Ctrl+V. Just paste it. Or in the app, we have this history where we can copy our last dictation right here. But dictating a single sentence is easy. Let's try dictating a whole email and see how it performs. Again, I will add some mistakes and stutters on purpose. Hey, Steven. Uh yes, I can record a quick uh Loom video for you and talk about my experience with Zarif. Um here are a few points I can give you from the top of my head. Uh one, the bot often malfunctioned and got stuck. Two, uh the standard plan is too expensive for me. And three, there were too many confusing settings. Um, I hope this helps and I I wish you good luck with your product. Regards, Florian Bal. Well, this result is not great. The formatting is completely bad. It spelled Steven wrong. It should be spelled with a PH. It spelled Zarif wrong as well. And even my surname. Now, in this regard, Wispr Flow, which is also a voice dictation app, is better. It has better out-of-the-box context awareness. And it usually spells everything correctly without doing any additional steps. But Superwhisper has more features than Wispr Flow. So, let's see how we can fix this and still get our well-formatted email. So, Super mode was not sufficient for writing emails. I'm going to switch to my email custom mode. I will show you how to set up this mode later. And another tip, I can copy this whole email together with the name of the sender. I press Ctrl+C to copy this because this email mode will read whatever I have in the clipboard. So, it will have access to this email when I copy it. Let's try dictating this exact email again. Hey, Steven. Um, yes, I can record a quick video for you and talk about my experience with Zarif. Here are a few points I can give you from the top of my head. One, the bot malfunctioned and got stuck.
Uh, two, the standard plan is too expensive for me. And, um, three, there were too many confusing settings. I hope this helps and I wish you good wish you good luck with your product. Regards, FL. Let's see the output this time. Looks much better. Well, it still spelled Steven wrong and it forgot the H in my name, but we can fix this pretty easily. It spelled Zarif correctly this time, so I assume it read it from my clipboard context. It removed the uhms and stutters. It formatted the list correctly and it looks pretty good except for these two little mistakes. I think they are acceptable. I will show you how to set up this email custom mode in a moment. If Superwhisper spells anything wrong like your name, you can add it to the vocabulary. Just insert it here, like my name, Florian Walther, and add it to the vocabulary. It's already inside it. I also added AI Tool Corner. And another cool feature is you can also add text snippets by using replace with instead of add to vocabulary. I've done this down here. My YouTube link gets replaced with my actual YouTube link. Let's give this a try. Hey everyone, subscribe to AI Tool Corner at my YouTube link. And it replaced my YouTube link with the actual link. Now, I'm still in email mode, that's why it added this empty line here. I'll show you how to automatically switch between different modes depending on an app or a URL later. And you can also insert much bigger text snippets. For example, let's trigger this with insert email template, replace with, and here I insert this large template. And then when I want to send this email template, I just say insert email template. And there it is with the same formatting and placeholders. This can save us a lot of time. The most powerful feature of Superwhisper is its custom modes. With modes, we can decide exactly how the LLM should process our dictation. We can choose which model we want to use and a lot of additional configuration.
When we click on create mode, we have a bunch of predefined modes that we can pick from, and they all have a short description. For example, the note mode organizes your dictated content into a clear note format with headers, bullet points, and structured sections. Let's select this and give this a try. For each mode, you can pick the language that you want to use. And there's also an auto option somewhere in here, automatic, which automatically detects the language, so you can use it with different languages. For me, that's English and German. And the coolest thing is you can even pick the language model and the voice model to process your dictation. That's something other AI voice dictation apps usually don't offer. The voice model takes care of the raw voice input and the language model is the LLM that does the post-processing like removing mistakes, uhms, and stutters. As you can see, we can pick from all the popular models: GPT, Claude, Gemini. We can even add a custom model. I will show you how to do that later. And there is always a trade-off between speed and accuracy. The more accurate a model is, the slower it usually is. Which one you pick depends on the mode and what you want to dictate. Let's keep the default for now. Then you can set an app that automatically activates this mode. This is extremely useful. For example, I can set this to Notepad. And then when I dictate in Notepad, it will automatically switch to this note mode. We have some additional options. What do we want to do with the system audio? I usually keep this on mute, which makes it easier to dictate. There are some additional settings that we will take a look at later. Let's try out the note mode and this should automatically switch to the correct mode when I start dictating. So I have Super selected but it should automatically switch to the note mode as soon as I press the keyboard shortcut. I need to buy the following things in the grocery store. Bread, butter, and chocolate.
And then I also need a few things from the butcher. lamb chops, beef steaks, and fat trimmings. So, it automatically switched to the note mode, which processes our input a bit differently. Instead of transcribing our dictation verbatim, it turns it into a structured note. And this is really useful. But instead of these predefined modes, you can also create your own custom mode by selecting blank here under create mode. And here you can insert your own instructions. And this is how I have configured this email custom mode that I showed you earlier. I added my own instructions. Format the dictation like an email without changing its content or adding words. You can remove unnecessary filler words conservatively. Apply empty lines liberally to add breathing room. See the provided example. There is also a predefined email mode, but I didn't like this one because it changed too much about my email. That's why I set up my custom mode. And I will put a link with all my custom modes, the instructions, and the configuration into the video description below. You can copy them and use them yourself. For the LLM, I picked a slower one with high accuracy because emails are usually long, and I don't care if the transcription takes 2 or 3 seconds longer, but I want to make sure it's correct. This is why I pick a slower model with high accuracy. In this case, I used Sonnet 4. The only feature that has not been working on Windows for me is this custom domains feature. Theoretically, this should work like the app feature. When we start dictating on this domain, it should automatically pick this mode. I added the Gmail domain, but when I start dictating on Gmail, it doesn't switch the mode automatically. Instead, we are still in the note mode. This means we still have to toggle manually via our shortcut. Select the correct mode. And this is annoying. And it's especially annoying when you notice it after you've already finished a long dictation. 
However, we can reprocess our dictation in a different mode. I will show you how later. As we have seen earlier, it works for the app feature but not the URL. In custom modes, we can also select which context should be included. Information about the current application, the text in our clipboard that we copied, and the text that we have selected. So, highlighted like this. You should only include the parts that you actually need to not pollute the context with unnecessary information. For emails, I included everything. And then below, you can also provide examples to make the output more exact. So the input looks like this, completely unformatted. And then as an example, I provided the same text as a formatted email. This shows the AI what I expect. You have seen the output of this email mode earlier. Again, you can copy it from the link in the video description. You can also change the icon of your mode, rename it, and you can reorder these modes or of course delete them here in the settings. Let me show you some more cool examples for programming. I have this coding cursor mode. The instructions look like this. Again, you can get them from the video description. This is a custom mode that automatically activates when I use the cursor IDE. It has everything in the context. And here I provided some examples of the output I expect. And then inside the IDE, I can dictate stuff like this. Open the drop zone.tsx file and fix the use drop zone function. And Superwhisper spells file names and function names correctly. Now, it doesn't automatically take the file like Wispr Flow does, but just entering the file name is usually enough for the agent to find it by itself. Then I have this instructions mode. Again, I will put a link to all my modes into the video description. This one treats my input as instructions and not as verbatim dictation. For example, I can say something like this. 
Uh, I want to fix my diet, so please write down a grocery list with only healthy items on it. And now it will interpret my dictation as instructions and create a grocery list. You can use this for a ton of useful things. Then I've created another custom mode called query highlighted text. For this one, I only added the selection and the application context, but not the copied text because it should ignore the copied text and I don't want to pollute the context. Also, I turned auto paste off because I don't want to insert the answer. I just want to read it. And then I can select some text, for example, our grocery list. And I can ask it what's the total cost of this grocery list roughly? Only give me the final amount. And it will not insert any text. Instead, it will just give me this output. So, it estimated $75 for this whole list. You can also do some goofy stuff with custom modes, like this alternating case mode. Let's give this a try. I select alternating case and then I can dictate something. We need to stop global warming right now. And it inserts it as alternating case like this. You can get really creative with these custom modes. Again, all my custom modes with the instructions and the configuration are in the video description below. You can copy them from there. I think I've said this like five times already. And in a second, I will show you how to use Superwhisper offline, but first I want to take a closer look at the history here. We already learned earlier that we can copy our previous dictation. We can play back our microphone input. And super useful, we can reprocess the dictation. So let's say I used the wrong mode. I didn't mean to use alternating case mode, but instead of dictating this again, I can just reprocess it in a different mode, for example, Super mode, and then it gives me the correct output and I can paste it here. This is especially useful for longer dictations where you used the wrong mode. So you don't have to say everything again.
You can also find more details about this dictation and how it was processed. And when you open the file location, you find this meta.json file. You can open this in a text editor. And here you can see everything the AI saw when you processed this dictation like the exact prompt and instructions, the available context and what was inside it, the model it used and so on. You can use this to debug your dictations if something doesn't work properly. Now the default mode here at the bottom is special because this one doesn't have a language model only a voice model and you can even download a voice model onto your computer and then use it offline. Let's try this out. I will cut off the internet connection and with the default mode I can still dictate. Hey everyone, um my name is uh Florian Walther and this is the AI Tool Corner YouTube channel. The input is very fast, but the default mode doesn't do any LLM post-processing. That's why it doesn't remove the uhms. It also doesn't use the vocabulary, so it spelled my name wrong, but it got AI tool corner right by chance. This is useful if you are on a plane and you don't have an internet connection or you need more privacy for example in a medical job where you don't want to send information to some remote LLM server and most competitors like Wispr Flow don't have this at all they only work online. However, Superwhisper is much better on macOS in this regard because on macOS you can also download LLMs locally on your computer. On Windows you can only download a voice model and have zero LLM postprocessing when you're offline. However, if privacy is important to you, you can still use a language model, but pick one of Superwhisper's own models like S1 language, which is hosted by Superwhisper. This way, you don't send your input to some third-party API like these other providers. 
And in Superwhisper's privacy policy, they explicitly state that they don't use your input to train models and they don't retain anything on their servers. Alternatively, you can also use a model that your company hosts on their own servers. I will show you how in a second. The enterprise mode is also SOC 2 Type II certified if this is relevant for your company and the admin of your company can also restrict what models can be used by employees. So you have safeguards in place if privacy is important to you. If you want to provide your own custom model, you can do this here under create custom. You can provide an API key for one of the big providers or your own custom model. If your company hosts its own model on some server, you can insert the data here and access it directly. You don't have to rely on any of these big providers like OpenAI, Anthropic or Google. Superwhisper has some other cool features that its competitors don't have. For example, we can transcribe a file. I can select an audio file from my computer. Superwhisper transcribes the file and gives me the text output here. This also works with long audio files. It just takes a bit longer. You can also transcribe meetings with the meeting mode. You can for example activate this when you open Zoom or Teams. Identify speakers is activated by default. So it will transcribe each person separately and tag the correct speaker. If you want to transcribe a live meeting, you can do that with the mobile app. We will take a look at it in a moment. And if you enable record system audio, you can get really creative and for example transcribe a YouTube video or summarize it with a custom mode. You can get very creative. On macOS, you also have integrations for Superwhisper with Raycast and Alfred. For Windows, there isn't as much support. And there is also an iPhone app. 
I haven't tried this out myself because I'm poor and I only have an Android, but it has 4.4 stars and it seems to be pretty good and have the same features as the desktop version. I will give you my final verdict and who should use Superwhisper in a second, but first let's take a quick look at the pricing. So, Superwhisper is cheaper than Wispr Flow and most of the other alternatives. And what's more, students even get another 40% discount. Also, there is a lifetime license available, which I have because I prefer paying a larger fee once and then be done with it. It's also pretty cheap. I've seen other voice dictation apps that cost twice as much. There's also a completely free tier available, but for long-term use, this is not great because it uses worse AI models. However, when you download Superwhisper, you also get 15 minutes of pro usage, so you can try it out without paying anything. Again, the link is in the video description below. Now, important, the Pro version works on all devices, so you only have to pay once, and you can use it on desktop and iOS, and you also get a 30-day no questions asked refund if you don't like it. So, what's my final verdict? I'm going to be completely honest. In day-to-day life, I use Wispr Flow and not Superwhisper. If you want to know why I like it more, I have a full comparison video between Superwhisper, Wispr Flow, and another alternative. I will put the link into the video description and also in the end card at the end of this video. However, Superwhisper can be the right choice if you need flexible custom modes with your own instructions—Wispr Flow doesn't offer that yet—or if you need offline support, since Superwhisper's default mode works offline and Wispr Flow does not. However, it's important to note that on Windows offline you only have a voice model and no LLM postprocessing. On macOS, you can download an LLM and have both. So, this is vastly better on macOS. You can transcribe files and meetings. 
You can buy a lifetime subscription. Whereas for Wispr Flow, you have to pay every month. And as you can see from the change log, it's in active development and gets regular updates. Again, I will put the link to Wispr Flow into the video description below. You can try it out for free. And if you want to see why I prefer Wispr Flow, check out the full comparison video here on the end card. Subscribe to the channel if you haven't yet. And then I hope I see you in the next video. Take care.
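
The transcript mentions that each dictation's folder contains a meta.json file you can open to debug how a dictation was processed. The transcript does not list the fields inside that file, so the following is only a minimal sketch that assumes the file is a JSON object and nothing more. The function name summarize_meta is hypothetical, and the path you pass in is whatever the app's "open file location" action shows you.

```python
import json
from pathlib import Path


def summarize_meta(path):
    """Return (key, type name, short preview) for each top-level entry.

    No field names are assumed: the function simply lists whatever
    top-level keys the JSON object happens to contain.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    rows = []
    for key, value in data.items():
        # Serialize the value so nested objects get a readable preview too.
        preview = json.dumps(value, ensure_ascii=False)
        if len(preview) > 60:
            preview = preview[:57] + "..."
        rows.append((key, type(value).__name__, preview))
    return rows
```

Calling summarize_meta on a dictation's meta.json and printing each row gives a quick overview of what was recorded (prompt, context, model, and so on, according to the transcript) before you open the full file in a text editor.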

Standout features

System-wide voice dictation
Use keyboard shortcuts or push-to-talk to dictate into emails, documents, chat apps, browsers, coding tools, and any other place where text can be entered.
Custom AI modes
Create modes with your own instructions, examples, model choices, context settings, and automatic app activation so different workflows produce different output styles.
Model selection and custom models
Choose from built-in voice and language models, connect external providers with your own API keys, or point Superwhisper at company-hosted models when you need more control.
Offline and privacy-focused options
Use local voice models for offline dictation and keep sensitive audio on the device, with enterprise controls for model access and official SOC 2 Type II and HIPAA-oriented offerings.
Vocabulary and text snippets
Add names, brand terms, abbreviations, and replacement snippets so repeated phrases, links, templates, and specialized vocabulary come out correctly.
History and reprocessing
Review previous dictations, copy or paste them again, play back the original audio, inspect processing details, and reprocess a recording with a different mode.
File and meeting transcription
Transcribe audio or video files, record meetings, separate speakers, and use system audio capture for more flexible transcription and summarization workflows.

What it's great for

  • Write emails, messages, notes, and documents faster than typing
  • Dictate longer prompts into Cursor, Claude Code, or other AI coding tools
  • Create custom modes for emails, structured notes, instructions, or selected-text questions
  • Transcribe audio files, video files, meetings, or system audio
  • Use local dictation when offline or when sensitive audio should stay on the device
  • Standardize names, product terms, links, and reusable templates with vocabulary replacements
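
To make the vocabulary replacement idea concrete, here is a rough Python sketch of how a "replace with" rule could behave: a spoken trigger phrase is swapped for a stored snippet. This is an illustration of the concept only, not Superwhisper's actual implementation. The function name apply_replacements and the example URL are assumptions.

```python
import re


def apply_replacements(text, replacements):
    """Swap each trigger phrase for its snippet in a single pass.

    Matching is case-insensitive and limited to whole words, and longer
    triggers win over shorter ones that overlap with them.
    """
    lookup = {trigger.lower(): snippet for trigger, snippet in replacements.items()}
    if not lookup:
        return text

    # Longest triggers first so the alternation prefers the longest match.
    triggers = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in triggers) + r")(?!\w)",
        re.IGNORECASE,
    )
    # A function replacement inserts the snippet literally (no backslash escapes).
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


rules = {
    "my YouTube link": "https://example.com/my-channel",  # placeholder URL
    "insert email template": "Hi [NAME],\n\nThanks for reaching out.\n\nBest,\n[ME]",
}
result = apply_replacements("Subscribe at my YouTube link.", rules)
```

Doing the substitution in one pass means text that was just inserted is never scanned again, so a snippet that happens to contain another trigger phrase is left untouched.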

Pros & cons

Pros
What works especially well
  • Custom modes give much more control over tone, formatting, context, and post-processing than basic dictation apps
  • Works on Mac, Windows, and iOS, with one Pro license covering all personal devices
  • Local voice models make offline dictation possible and reduce privacy concerns for sensitive work
  • Vocabulary and snippet replacement are useful for names, brands, templates, and repeated links
  • History and reprocessing help recover from choosing the wrong mode without redictating everything
  • File transcription, meeting recording, speaker separation, and custom model support make it more flexible than a simple voice typing tool
Cons
Trade-offs to know upfront
  • Out-of-the-box context awareness and formatting can be weaker than some competitors unless you configure custom modes
  • Automatic domain-based mode switching may be unreliable, so some web workflows still require manual mode changes
  • Offline support is stronger on macOS; on Windows, offline dictation can lack LLM post-processing
  • More accurate language models are usually slower, so high-accuracy modes can add a few seconds of processing time
  • Modes, vocabulary, and examples take setup time before the app reaches its full potential
  • The free tier is useful for trying the app, but frequent advanced use depends on Pro features

Best for

  • Writers, founders, creators, and knowledge workers who dictate a lot of text every day
  • Developers who want to speak detailed prompts into AI coding assistants
  • Power users who want custom formatting rules for emails, notes, instructions, and queries
  • Privacy-conscious users who need local or offline transcription options
  • Teams that need model access controls, SSO, centralized billing, or enterprise security assurances

Verdict

Superwhisper is best for people who want a configurable dictation system instead of a simple speech-to-text box. It takes more setup than the most automatic alternatives, but custom modes, model control, offline dictation, vocabulary snippets, and reprocessing make it a powerful choice for power users, developers, and privacy-sensitive workflows.

FAQ

What is Superwhisper used for?

Superwhisper is used to dictate polished text into any app, including emails, notes, chat tools, browsers, documents, and AI coding assistants. It can also transcribe files and meetings.

Does Superwhisper work offline?

Yes. Superwhisper supports local voice models for offline dictation. The offline experience is strongest on macOS, while Windows offline use can be limited to voice transcription without the same LLM post-processing.

What are Superwhisper custom modes?

Custom modes let you define how dictation should be processed for a specific workflow. You can write instructions, provide examples, choose models, include clipboard or selected-text context, and activate modes automatically in selected apps.
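
As a mental model, a custom mode can be thought of as a small bundle of settings plus a rule for when it activates. The Python sketch below is hypothetical: the Mode class, its field names, and pick_mode are illustrative assumptions, not Superwhisper's real configuration format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mode:
    """Hypothetical stand-in for the settings a custom mode bundles together."""
    name: str
    instructions: str = ""
    language_model: Optional[str] = None  # None means voice model only, no LLM step
    use_clipboard: bool = False           # include copied text as context
    use_selection: bool = False           # include highlighted text as context
    use_app_info: bool = False            # include info about the active app
    auto_paste: bool = True               # insert the result at the cursor
    activate_in_apps: List[str] = field(default_factory=list)


def pick_mode(modes, active_app, fallback):
    """Return the first mode bound to the active app, else the fallback mode."""
    wanted = active_app.lower()
    for mode in modes:
        if any(app.lower() == wanted for app in mode.activate_in_apps):
            return mode
    return fallback


email = Mode(
    name="Email",
    instructions="Format the dictation like an email without changing its content or adding words.",
    language_model="Sonnet 4",
    use_clipboard=True,
    use_selection=True,
    use_app_info=True,
)
note = Mode(name="Note", activate_in_apps=["Notepad"])
default = Mode(name="Super")

chosen = pick_mode([email, note], "notepad", default)
```

Here the note mode is picked because the active app matches its activation list, and any app without a binding falls back to the general-purpose mode, which mirrors the app-based switching described in the review.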

Can Superwhisper be used for coding?

Yes. Superwhisper works in coding tools such as Cursor and VS Code because it types wherever your cursor is. Custom modes can help preserve file names, function names, and technical instructions for AI coding assistants.

Does Superwhisper keep my audio private?

Superwhisper offers local processing options and states in its privacy policy that it does not use user data for model training or retain data on its servers. Cloud and third-party model workflows depend on the selected model configuration, so privacy-sensitive users should choose local or approved enterprise models.

Is Superwhisper free?

Superwhisper has a free tier for core dictation and limited advanced usage. Pro unlocks the deeper workflow features such as custom modes, AI-powered modes, more model access, custom vocabulary, file transcription, speaker separation, and priority support.
