I wanted to listen to a book on my Mac the way Speechify does it: a voice reading aloud while the words highlight as it goes. I did not want to pay for yet another subscription to get it. From other project work I had recently learned that I could run genuinely good-sounding text-to-speech locally with Kokoro, a small open neural model. So I built my own reader around that.
Most read-aloud products assume your books live in someone else's cloud and that listening to them takes a subscription. Aoede takes the opposite approach. Import a document, synthesize the speech locally, follow along with synchronized word highlighting, and keep your library on your own device. No account, no subscription, no cloud.
Aoede reading Accelerando, with EPUB structure preserved, synchronized word highlighting, and floating playback controls.
The name is the idea. Aoede is the Muse of voice and song. I do not need a celebrity reading to me, just a few simple, good-sounding voices that I own. Under the hood it uses both Apple's built-in voices and Kokoro running locally through MLX. The goal is not hundreds of voices. It is a small set of genuinely good ones that work offline, with the door left open to a cloud option like ElevenLabs later if I ever actually want it.
It needs a modern Apple silicon device running macOS 26 (or iOS 26, if the mobile version lands). For now it is a personal project. I am still deciding whether to build the iOS version and, if there turns out to be interest, put it in front of a small group through TestFlight.
The updates below are the build log: the foundation wired end to end, the move from a Core ML Kokoro runtime to MLX, keeping bold and italic from the source files, and a long night spent debugging a robotic voice that turned out not to be my code at all.