The problem with macOS dictation
The built-in dictation on macOS is a transcription engine, not a dictation engine. It transcribes word by word as you speak — which means it misses the complete context of your sentence. Say "their" and it has no idea if you meant "there" or "they're" until it's too late. It guesses, and it guesses wrong constantly.
Worse, it only works in one language at a time. If you speak Spanish, French, and English — which is my case — you have to manually switch the dictation language every time you change languages. Mid-sentence code-switching? Forget it.
I'd been living with this for years. Then I started texting more on my phone and using dictation heavily on mobile, and it was equally bad there. I knew there had to be a better way.
Finding the right model
I started looking at what was available on Hugging Face and found that NVIDIA had released Parakeet, a family of speech-to-text models published there. They come in two flavors, English-only and multilingual, and they weigh only around 500 MB. Small enough to run on virtually anything with a CPU and an ML core.
Unlike word-by-word dictation, these models process your entire utterance at once. They understand context, handle punctuation naturally, and get the right homophone almost every time. The multilingual variant handles 100+ languages without switching — you can speak English, switch to Spanish mid-sentence, and it just works.
$28 million for something I could build myself
While researching, I came across WhisperFlow — an app doing essentially this same thing. Then I found out the founders had raised $28 million for it. I also found MacWhisper, another app in the same space. I'd actually been using both for about six months the previous year.
But I knew I could build something better and more tailored to how I actually use dictation. So I took on the task. I first built it for my phone, where I was texting heavily and the stock dictation was especially painful. Then I realized the same approach worked even better on Mac — especially for talking to AI, writing emails, and drafting long messages.
$28M raised by WhisperFlow
$0 cost to build lowercase
Free forever, open source
How lowercase works
Press the dictation key (F5), speak naturally, press it again. The app captures your audio, runs it through the Parakeet model entirely on-device — no internet connection required — and pastes the transcribed text into whatever app you're using. Your voice data never leaves your Mac.
100% offline: No cloud, no API calls. Everything runs on your Apple Neural Engine.
Multilingual: 100+ languages, no switching. Speak English, then Spanish; it handles both.
Context-aware: Processes full sentences, not individual words. Gets homophones right.
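To make the press, speak, press flow concrete, here is a minimal sketch of it as a small state machine in TypeScript. It is not lowercase's actual source: the state names, event names, and effect names are all hypothetical, and the audio capture, model inference, and paste steps are represented only as effects that a real app would have to carry out.

```typescript
// Hypothetical sketch of the toggle flow; every name here is an assumption.
type DictationState =
  | { phase: "idle" }
  | { phase: "recording" }
  | { phase: "transcribing" };

type DictationEvent =
  | { type: "keyPressed" } // the dictation key (F5 in this article)
  | { type: "transcriptReady"; text: string };

// Side effects the host app would perform after each transition.
type Effect =
  | { kind: "startCapture" }
  | { kind: "stopCaptureAndTranscribe" }
  | { kind: "pasteText"; text: string }
  | { kind: "none" };

interface Transition {
  state: DictationState;
  effect: Effect;
}

// Pure transition function: given a state and an event, return the next
// state plus the effect to run. Events that make no sense in the current
// state (for example a key press while transcribing) are ignored.
function step(state: DictationState, event: DictationEvent): Transition {
  switch (state.phase) {
    case "idle":
      if (event.type === "keyPressed") {
        return { state: { phase: "recording" }, effect: { kind: "startCapture" } };
      }
      break;
    case "recording":
      if (event.type === "keyPressed") {
        return {
          state: { phase: "transcribing" },
          effect: { kind: "stopCaptureAndTranscribe" },
        };
      }
      break;
    case "transcribing":
      if (event.type === "transcriptReady") {
        return {
          state: { phase: "idle" },
          effect: { kind: "pasteText", text: event.text },
        };
      }
      break;
  }
  return { state, effect: { kind: "none" } };
}

// Feed a sequence of events through the machine and collect the effects.
function run(events: DictationEvent[]): Effect[] {
  let state: DictationState = { phase: "idle" };
  const effects: Effect[] = [];
  for (const event of events) {
    const transition = step(state, event);
    state = transition.state;
    if (transition.effect.kind !== "none") effects.push(transition.effect);
  }
  return effects;
}
```

Keeping the transition logic pure means the key handling can be tested without a microphone or a model. Two key presses followed by a finished transcript produce a start, a stop, and a paste, in that order.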
Bringing it to Android
After building lowercase for macOS, the next logical step was Android. I text constantly on my phone, and Android's stock dictation has the same problems — wrong homophones, no multilingual support, and it sends everything to the cloud.
The Android version takes a different approach to interaction. Instead of a keyboard shortcut, it uses a floating bubble overlay that sits on top of any app. Tap the bubble to expand it into a recording pill with a live waveform visualizer, speak, and tap the checkmark. The transcribed text gets injected directly into whatever text field you're using via the Accessibility Service — or copied to your clipboard if you prefer.
It uses Android's built-in SpeechRecognizer for on-device transcription, so there are no extra downloads or API keys needed. The entire APK is about 15 MB.
~15 MB APK size
0 API keys needed
On-device speech recognition
Now on the App Store
The iOS version brings everything full circle. It runs the same NVIDIA Parakeet model on Apple's Neural Engine — the same accuracy as the Mac version, but on your iPhone. No cloud, no API keys, no data leaving your device.
The killer feature is the custom keyboard extension. You install it once, enable the lowercase keyboard, and from then on you can dictate directly into any text field in any app: Messages, WhatsApp, Notes, email, anything. Switch to the lowercase keyboard, tap the mic, speak, done. It also supports push-to-talk for quick bursts, file transcription for audio and video files, and keeps a full history with words-per-minute stats.
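The article does not say how the speaking-rate stat is computed. One plausible definition, sketched here in TypeScript with hypothetical function names, is the number of whitespace-separated words divided by the recording length in minutes.

```typescript
// Hypothetical helper: counts whitespace-separated words in a transcript.
// Assumption: this only makes sense for languages that separate words
// with spaces; Chinese or Japanese text would need a real segmenter.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical helper: speaking rate for one dictation session.
// Returns 0 for empty or zero-length recordings instead of dividing by zero.
function wordsPerMinute(text: string, durationSeconds: number): number {
  if (durationSeconds <= 0) return 0;
  return (countWords(text) / durationSeconds) * 60;
}
```

For example, five words dictated in two seconds works out to 5 / 2 × 60 = 150 words per minute.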
Getting it on the App Store was a milestone. It means anyone with an iPhone can install it in seconds — no sideloading, no TestFlight, just search "lowercase" and download.
App Store: publicly available
Parakeet TDT: on Neural Engine
Custom Keyboard: dictate in any app
Now on Windows — via the web
The most requested feature was Windows support. Instead of building a native Windows app from scratch, I took a different approach: bring lowercase to the browser. The Web Speech API in Chrome and Edge provides real-time speech recognition that works on any platform — Windows, Linux, or macOS.
Visit lowercase.click on any Windows PC, click the mic button (or press F5, just like the native app), speak, and get your transcription. No install, no setup, no accounts. The same push-to-talk workflow, the same multi-language support, the same lowercase-by-default output — all running in your browser tab.
It's not a watered-down version either. The web app includes a live audio volume visualizer, 20+ language options, a word counter, and one-click copy to clipboard. For Windows users who've been waiting, this is the fastest way to start using lowercase.
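For readers curious what the browser side can look like, below is a minimal TypeScript sketch built on the Web Speech API's `SpeechRecognition` interface (exposed as `webkitSpeechRecognition` in Chrome). It is an illustration under assumptions, not the code behind lowercase.click: the helper names `finalTranscript` and `createRecognizer` are hypothetical, and the small interfaces are declared locally only so the snippet type-checks without relying on a particular DOM typings version.

```typescript
// Local structural types covering just the parts of the Web Speech API
// used below (lang, continuous, interimResults, onresult, start, stop).
interface ResultAlternative { transcript: string }
interface RecognitionResult { isFinal: boolean; 0: ResultAlternative }
interface RecognitionResultList {
  length: number;
  [index: number]: RecognitionResult;
}
interface RecognitionEvent { results: RecognitionResultList }
interface Recognizer {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
  onresult: ((event: RecognitionEvent) => void) | null;
  start(): void;
  stop(): void;
}

// Hypothetical helper: join the top alternative of every final result
// and lowercase it, mirroring the lowercase-by-default output.
function finalTranscript(results: RecognitionResultList): string {
  const parts: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result && result.isFinal) parts.push(result[0].transcript.trim());
  }
  return parts.join(" ").toLowerCase();
}

// Hypothetical helper: returns a configured recognizer, or null when the
// runtime has no speech recognition (unsupported browser, Node, etc.).
function createRecognizer(
  lang: string,
  onText: (text: string) => void,
): Recognizer | null {
  const scope = globalThis as unknown as {
    SpeechRecognition?: new () => Recognizer;
    webkitSpeechRecognition?: new () => Recognizer;
  };
  const Ctor = scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
  if (!Ctor) return null;

  const recognizer = new Ctor();
  recognizer.lang = lang; // a BCP 47 tag such as "en-US" or "es-ES"
  recognizer.continuous = true; // keep listening until stop() is called
  recognizer.interimResults = false; // deliver final results only
  // In continuous mode event.results holds every result of the session,
  // so onText receives the full transcript so far on each update.
  recognizer.onresult = (event) => onText(finalTranscript(event.results));
  return recognizer;
}
```

A page would call `createRecognizer("en-US", showText)`, then `start()` on mic click and `stop()` on the second click. If F5 is bound as the shortcut in a browser tab, the key handler also needs `event.preventDefault()`, since F5 otherwise reloads the page.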
0 MB download size
20+ languages
Instant: no install needed
What's next
NVIDIA made the Parakeet model publicly available, but only for CUDA — meaning it's optimized for NVIDIA GPUs. Macs and iPhones use a converted version of the model to run on Apple's Neural Engine. I'm currently working on my own version of that conversion to minimize the model size and increase inference speed specifically for Apple Silicon.
On the Android side, I'm exploring running the Parakeet model directly on-device using NNAPI acceleration, which would bring the same multilingual accuracy from the macOS and iOS versions to Android phones — without depending on Google's SpeechRecognizer.
The engine now runs on Mac, iPhone, Android, and the web. The next step is making it smaller, faster, and more deeply integrated into each platform — better keyboard experiences, Live Activities on iOS, smarter text injection across the board, and eventually a native Windows app with the same Parakeet model running on NVIDIA GPUs.
Try lowercase
Free, open source, and runs on your Mac, iPhone, Android, or right in your browser on Windows. Replace your dictation workflow in 30 seconds.
