A short while after I localized the interface, Aoede now reads Japanese aloud. Open a Japanese book and it detects the language, picks a Japanese Kokoro voice, and reads with the same synchronized word highlighting English already had. Settings has a Reading Language picker (Automatic, English, Japanese) for when detection guesses wrong, and the app remembers your chosen voice per language.
The interesting problem was that Japanese has no spaces between words. The reader's highlight tokenizer split on whitespace, so a whole Japanese sentence arrived as one enormous "word." It overflowed the line, and the highlight covered the entire thing instead of tracking word by word. The fix was to tokenize Japanese with CFStringTokenizer, the same engine the speech grapheme-to-phoneme (G2P) step uses, so the on-screen word boundaries line up exactly with the per-word timing the highlight follows. Japanese paragraphs are laid out with no spacing between words, so the characters flow naturally and still wrap from line to line.
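As a rough illustration of that fix, here is a minimal sketch of word tokenization with CFStringTokenizer. It is not Aoede's actual code: the `HighlightToken` type and the `japaneseHighlightTokens(in:)` function are hypothetical names, and the choice of `kCFStringTokenizerUnitWord` with a `ja_JP` locale is an assumption about how such a tokenizer might be configured.

```swift
import Foundation

// Hypothetical token type: one word plus its UTF-16 range in the source string.
struct HighlightToken {
    let text: String
    let range: NSRange
}

// Splits a sentence into word tokens using CFStringTokenizer instead of whitespace.
func japaneseHighlightTokens(in sentence: String) -> [HighlightToken] {
    let source = sentence as NSString
    // Assumption: a Japanese locale is passed explicitly rather than detected.
    let locale = Locale(identifier: "ja_JP") as CFLocale
    guard let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault,
        source as CFString,
        CFRangeMake(0, source.length),
        kCFStringTokenizerUnitWord,
        locale
    ) else { return [] }

    var tokens: [HighlightToken] = []
    // Advancing returns an empty token type once the string is exhausted.
    while CFStringTokenizerAdvanceToNextToken(tokenizer).rawValue != 0 {
        let cfRange = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        let range = NSRange(location: cfRange.location, length: cfRange.length)
        tokens.append(HighlightToken(text: source.substring(with: range), range: range))
    }
    return tokens
}

let tokens = japaneseHighlightTokens(in: "私は本を読みます")
print(tokens.map { $0.text })
```

Keeping the range alongside the text is what would let a highlight map a timed word back to its position on screen. The exact segmentation depends on the system's Japanese dictionary, so the printed tokens may vary between OS versions.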
Getting the word boundaries right was only half the problem. Pronunciation is the other half, and a single constraint shaped everything about it: all of this runs on Apple frameworks alone, with no GPL dependencies. That rules out the usual open-source pronunciation engines but keeps the app clean to ship, and it has a cost I'm paying down now. The current readings come from the dictionary's first guess, so 私 comes out watakushi instead of watashi, and there is no pitch accent, which in Japanese is the difference between 箸 (chopsticks) and 橋 (bridge), both read hashi. I'm working on a second pass built on OpenJTalk, which is BSD licensed rather than GPL, to get accurate readings and real pitch. For now it reads Japanese well enough to actually listen to, which a day ago it could not do at all.
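To make the "first guess" limitation concrete, here is a hedged sketch of how per-word readings can be pulled from CFStringTokenizer. It assumes the readings come from the `kCFStringTokenizerAttributeLatinTranscription` attribute, which this post does not confirm; `firstGuessReadings(for:)` is a hypothetical name. The tokenizer returns a single transcription per token with no alternatives and no accent information, which is consistent with the behavior described above.

```swift
import Foundation

// Returns each word with the tokenizer's single Latin transcription, if it has one.
func firstGuessReadings(for sentence: String) -> [(word: String, reading: String?)] {
    let source = sentence as NSString
    let locale = Locale(identifier: "ja_JP") as CFLocale
    guard let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault,
        source as CFString,
        CFRangeMake(0, source.length),
        kCFStringTokenizerUnitWord,
        locale
    ) else { return [] }

    var results: [(word: String, reading: String?)] = []
    while CFStringTokenizerAdvanceToNextToken(tokenizer).rawValue != 0 {
        let cfRange = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        let range = NSRange(location: cfRange.location, length: cfRange.length)
        // One transcription per token: no list of candidates, no pitch accent.
        let attribute = CFStringTokenizerCopyCurrentTokenAttribute(
            tokenizer,
            kCFStringTokenizerAttributeLatinTranscription
        )
        results.append((word: source.substring(with: range), reading: attribute as? String))
    }
    return results
}

let readings = firstGuessReadings(for: "私は箸で食べます")
for entry in readings {
    print(entry.word, entry.reading ?? "(no reading)")
}
```

Because the API offers only one reading per token, choosing watashi over watakushi or recovering pitch would need a separate source of information, which is the gap a second pass is meant to fill.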