Review: MacSpeech Dictate 1.2.1
MacSpeech Dictate is the first speech recognition program for the Mac that offers recognition as good as that of similar Windows software, such as Nuance’s Dragon NaturallySpeaking. In fact, MacSpeech has licensed Nuance’s acclaimed NaturallySpeaking technology; rather than port that software, the company rewrote it from the ground up to make a true Mac program.
After you install the program and connect its included noise-canceling USB headset, you set up a profile: this is user-specific, and depends on your voice, microphone, computer, and dialect (you can choose from US, British, Australian, and Singaporean English, as well as something called “US Teens”). You next have the program adjust your microphone’s volume, and then you read a brief training story. This takes about ten minutes, at which point you’re ready to start talking to your Mac. The program’s accuracy is impressive right away, as long as you don’t throw too many uncommon words at it.
When dictating, you need to speak clearly and naturally. You must be consistent and avoid words like “um” and “ah.” In addition, you have to speak every punctuation mark and you must speak certain control words to, for example, capitalize letters or insert paragraph breaks. You also need to learn a series of commands to control the program and edit text, some of which are provided on a quick reference card. It would be useful to have a printable document that groups all available commands; they are scattered across a number of pages of the manual.
While MacSpeech Dictate is remarkably accurate with commonly used words and has a training feature that lets you teach the program new words, getting it to work with unusual words can be frustrating. For example, although I tried a dozen times to train the program to recognize my name, it consistently failed. MacSpeech technical support recommended that I create a Text Macro for my name; while this works, it requires that I say my name in isolation. (“My name is [pause] Kirk McElhearn.”) If I say my name as part of a longer sentence without pausing, the program fails. This can be a serious drawback for anyone who wants to dictate professional vocabulary (if you’re a doctor or a lawyer, say) or for someone whose company name, product names, or other unique terms are not recognized. It can be difficult to create Text Macros for each of these words and to dictate them in isolation so that the macros will function. In addition, while you can use Vocabulary Training to add words from your documents to your Dictate profile, you aren’t given the option to pronounce those words to help the program get used to them. This means that even after you add specific words that the program doesn’t know, you still need to train the program to recognize them when you say them in sentences.
When the program misinterprets what you’ve said, you can train it to learn the correct words. But editing can be laborious. The commands you use to move around in text, to select words, and to edit them work most of the time, but the program seems to get confused after you’ve dictated for a long time. Attempting to go back to the beginning of a text to make edits can be fraught with problems. In some cases, when you attempt to train words, the program selects more text than you want. In other cases, even when you train words correctly and choose a version from the Recognition palette that you have typed manually, the program inserts the words with incorrect capitalization. The most efficient way to dictate would be to talk for a while (a paragraph or two, or even a page) and then edit, but this is more trouble than it’s worth. I found myself editing phrase by phrase, as this was the best way to correct mistakes, but it requires dictating in fragments. And you can’t mix dictation with keyboard editing; that throws the program off, and the results will look like Klingon.
There is also a Spelling Mode, which you can use for the most recalcitrant words, and for abbreviations and URLs. But this, too, is fraught with difficulty; it’s very slow to use, and suffers from poor accuracy.
Dictate worked well with every program I tried. I dictated into text editors, word processors, e-mail programs, and iChat, and the results were equivalent. However, Dictate uses three floating palettes to provide information and to help you edit text and control your Mac. With a large screen, this is no problem, but if you’re working on a laptop you may find the amount of screen real estate taken up by these palettes frustrating. You can resize them, of course, but even at their narrowest they take up a fair amount of space. The Recognition palette needs to be large enough to show enough phrases for you to correctly train and edit the text that you speak.
Dictate can also let you control applications. This works well, though I only tested it with the most common applications, such as Mail, Safari, BBEdit, and Word, as well as launching applications and switching among them. Needless to say, for those unable to control their Macs in other ways, this is a godsend; while some quirks exist, this function is, for the most part, reliable. However, there can be problems. While dictating into a word processor, I tried to use the Save This Document command to save the file I was working on. Each time, the program displayed an Open window. It took me a while to realize that the problem was not the command itself, but the Dvorak keyboard layout that I use. For some reason, when Dictate sends a Save This Document command, telling Mac OS X to send an S keypress, the system sends an O. This seems to affect Mac OS X 10.5.6, and not earlier versions of the OS. MacSpeech was trying to get to the bottom of this at press time, but if you use a non-standard keyboard layout, you may have problems with some of the commands that are used to control applications.
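The cause of the S-to-O swap was not confirmed at press time. One plausible explanation, offered here purely as an assumption, is that the command is sent as a physical key position rather than as a character, so the active keyboard layout decides which letter results. The short Python sketch below models only the home row of each layout; the function name and the approach are hypothetical illustrations, not anything taken from Dictate or Mac OS X.

```python
# Illustration only: a toy model of "send a key position, let the layout pick the letter".
# Nothing here reflects how Dictate or Mac OS X is actually implemented.

# Home-row characters, left to right, for the two layouts.
QWERTY_HOME_ROW = "asdfghjkl;"
DVORAK_HOME_ROW = "aoeuidhtns"


def simulate_shortcut(intended_char, active_row, assumed_row=QWERTY_HOME_ROW):
    """Return the character the active layout produces when the sender
    looks up the key position using an assumed (QWERTY) layout."""
    position = assumed_row.index(intended_char)  # physical key position
    return active_row[position]                  # letter the active layout yields


if __name__ == "__main__":
    # With a QWERTY layout active, the intended letter comes through unchanged.
    print(simulate_shortcut("s", QWERTY_HOME_ROW))
    # With Dvorak active, the same key position yields a different letter.
    print(simulate_shortcut("s", DVORAK_HOME_ROW))
```

Under this assumption, a command meant as Command-S would arrive as Command-O on a Dvorak layout, which matches the Open window described above.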
My testing was done on a Mac Pro with a pair of 2.66GHz Xeon 5100 processors (four cores total) and 4GB of RAM. I tested in my quiet home office; your results might differ if you use the application someplace with a lot of background noise, though the included headset is noise-canceling. However, I tested the program with music playing softly from my computer speakers, and recognition was just as good as in silence.
Macworld’s buying advice
If you can’t type, or have RSI problems, this program is a must-have; if you type slowly, you’ll find it to be a boon to your productivity. If, like me, you can touch-type relatively quickly, it will take a lot of work to train the program so you can dictate faster than you type. However, I find it much more relaxing to lean back in my chair and dictate than to be in a static position and type. Over time, as the program’s accuracy improves, it can make entering text into your Mac much more relaxing and efficient.
[Senior contributor Kirk McElhearn writes about more than just Macs on his blog Kirkville.]