I don’t know about you, but I talk to my iPhone all the time: “Hurry up, iPhone”; “How come I can’t press this button!?”; “Why, iPhone, whyyyyyyyy?” Most of the time it ends in tears—for me, anyway; I suspect the iPhone is silently laughing at me. However, rumors suggest that the iPhone may soon be more receptive to my pleas: Ars Technica reports that software frameworks uncovered in the iPhone 3.0 beta may point to speech recognition and synthesis systems.
Of course, this isn’t exactly out of left field. Both the latest iPod nano and iPod shuffle include speech synthesis capabilities, dubbed Spoken Menus in the 4G nano and VoiceOver in the iPod shuffle, that allow users to navigate the devices without having to look at the screen. In both cases the feature helps visually impaired people use the devices, but on the shuffle it’s also a necessity, since that device lacks a screen and has a potentially confusing control scheme.
The iPhone and iPod touch are extremely difficult for visually impaired users to interact with, as they have little in the way of tactile cues or feedback. Voice recognition in particular has been a heavily requested feature, as it could improve not just accessibility, but everyday tasks such as dialing a number without having to look at the phone’s screen—handy for when you’re driving, for example.
The Mac OS has long had both speech recognition and synthesis features, so it seems natural that Apple would leverage the research, if not the technology itself, from those features to beef up the iPhone’s capabilities. And, unlike the shuffle and nano, which rely on sound files generated on the user’s computer and then synced to the iPod, the iPhone and iPod touch have enough processing power that they could conceivably do the heavy lifting themselves.
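To give a sense of what on-device synthesis looks like in practice, here’s a minimal sketch in modern Swift using AVSpeechSynthesizer, the public API Apple eventually shipped for exactly this purpose. This is purely illustrative; nothing in the 3.0 beta frameworks has been confirmed, and the phrase being spoken is made up.

```swift
import AVFoundation

// Minimal sketch of on-device speech synthesis: no pre-rendered sound
// files synced from a computer; the device generates the audio itself.
let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "Calling home, mobile.")
utterance.rate = AVSpeechUtteranceDefaultSpeechRate
synthesizer.speak(utterance)
```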
You’d still need a way to trigger the voice-recognition system, but there are a couple of solutions. For voice dialing, the most obvious is to squeeze the iPhone’s headphone control to trigger the feature, à la the current ability to answer or hang up a call (you should also be able to trigger it from compatible Bluetooth headsets, as most cell phones already allow). But I also think Apple should take a cue from the voice recognition in the Google Mobile App, which triggers when you hold the phone up to your ear. If Apple built that functionality into the iPhone’s Phone app, you could switch to the Phone app with the Home button double-press shortcut, then hold the phone up to your ear and tell it what to dial. Presto.
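Google’s hold-to-your-ear trick presumably piggybacks on the iPhone’s proximity sensor, the same one that blanks the screen during calls. As a rough sketch (in modern Swift, purely for illustration), here’s how an app could watch that sensor through the public UIDevice API and hand off to a voice-command listener; the `startListeningForCommand` closure is a hypothetical placeholder for whatever recognition engine Apple might expose.

```swift
import UIKit

// Rough sketch: use the proximity sensor to notice when the phone is
// raised to the ear, then hand off to a voice-command listener.
// `startListeningForCommand` is a hypothetical placeholder.
final class RaiseToEarTrigger {
    var startListeningForCommand: () -> Void = {}

    func activate() {
        UIDevice.current.isProximityMonitoringEnabled = true
        NotificationCenter.default.addObserver(
            forName: UIDevice.proximityStateDidChangeNotification,
            object: nil,
            queue: .main
        ) { [weak self] _ in
            // proximityState is true while something (an ear, say) is close.
            if UIDevice.current.proximityState {
                self?.startListeningForCommand()
            }
        }
    }
}
```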
Of course, the frameworks uncovered by Ars Technica—reputedly code-named “Jibbler”—have not been formally announced by Apple, so there’s no telling whether these features will make it into the final shipping version of 3.0, whether Apple is laying the groundwork for a future update, or whether this is all just a pipe dream. Certainly, the idea that my iPhone might know I’m talking to it is comforting and terrifying at the same time.