These days, our lives are littered with the half-built scaffoldings of intelligent assistants and virtual agents. Voice-based interfaces are on a level with technologies like home automation and virtual reality: popular enough to have seeped into our lives, but not yet refined enough to have become fixtures for most of us.
Apple’s version of Siri turns five years old this month, but as I’ve discussed before, it doesn’t seem to have progressed as much as one might have hoped. This week, veteran tech journalist Walt Mossberg penned a scathing indictment of Apple’s voice-based assistant, in which he posed the question that most of us have asked at one time or another: “why does Siri seem so dumb?”
He’s not wrong. While I’ve had better luck than Mossberg in some of my interactions with the feature, I run up against rough edges pretty much every single time I try to use Siri for anything. Most of my iPhone-using friends tend to view Siri as more of a curiosity than a useful tool. Last year I put forth some ideas about what a Siri 2.0 should include, but let’s take a step back and look at the bigger issues here.
Putting out fires
Mossberg’s article brings up a number of specific issues: Siri’s inability to identify the presidential candidates, its cluelessness about the date of the presidential debates, and even a weather report for the town of Crete, Illinois, instead of Crete, the island. But it’s less the problem than the cure that bugs me here. As Mossberg says:
If you try most of these broken examples right now, they’ll work properly, because Apple fixed them after I tweeted screenshots of most of them in exasperation and asked the company about them.
Here’s the thing: if that’s what it takes to get new functionality in Siri, then—permit me to wax only slightly hyperbolic—it’s doomed. This isn’t the kind of feature you can address on a case-by-case basis. It’s a bit like the trouble that Apple had when it first transitioned away from Google Maps to its own system: if people ask once and get the wrong answer—or, in the case of Siri, no answer at all—they’re going to be dubious from there on out. Even more so with a feature that’s less tried and true, such as voice-based assistants.
Apple did tell Mossberg that it’s constantly working behind the scenes to try to make Siri better at the tasks people use it for the most:
Apple stressed to me that it’s constantly improving Siri, and also stressed that it focuses its Siri efforts on the kinds of tasks that it says millions of people ask every day: placing phone calls, sending texts, and finding places. It puts much less emphasis on what it calls “long tail” questions, like the ones I’ve cited above, which in some cases, Apple says, number in only the hundreds each day.
To my mind, that just raises more questions. Do people use it less for those “long tail” questions because it doesn’t tend to give helpful answers? And if Apple doesn’t improve the handling of those kinds of queries, will people just give up on asking them, and so end up using Siri less overall? It’s somewhere between a self-fulfilling prophecy and a chicken-and-egg problem.
Right now, Siri and most of the other voice-based assistants on the market are still far from that “uncanny valley” moment where they’re so close to humans that people get creeped out by their interactions. It’s still abundantly clear that you’re talking to a machine, because they have roughly the same level of intelligence as, say, your dog. (No disrespect meant to our canine companions.) A dog, like Alexa or Siri, knows his name, and can perk up when he hears it. You can give a dog a specific command and have him understand you, but his ability to interpolate new instructions based on what he already knows is generally pretty low.
For example, if I could point to one thing that I find holds back Siri and its competing intelligent agents, it’d be pretty simple. Just a little word that we use probably dozens if not hundreds of times a day (25 times in this article alone!): “and.”
The idea of “and” is simple to our human brains, and yet incredibly complex to a virtual agent. Tell your significant other to put milk and orange juice and bananas on the shopping list and chances are they’ll do what you’re expecting. Tell Siri and, well, you end up with something pretty different—and far less useful.
Similarly, try joining two commands together. If you tell your kids to turn off the lights and the television, they can probably handle that (at least past a certain age). When I do that with my Amazon Echo, it means issuing a command, waiting until it’s resolved, and then issuing a second command. It’s a bit eyeroll-inducing.
If you’re so smart, why aren’t you rich?
Apple’s put a lot of energy into machine learning and AI. Back in August, veteran tech journalist Steven Levy did a deep dive into Apple’s efforts in those arenas. Perhaps the most interesting revelation was just how small a part of it Siri seems to be—maybe, given Siri’s current powers, a bit too small.
Right now it may be okay that our virtual assistants aren’t smarter than the average fifth grader; we probably wouldn’t want fifth graders overthrowing our society either. (Although cake at every meal sounds pretty great.) But in order to get these intelligent agents to the point where they’re as useful as they’ve been envisioned, it’ll take a significant leap from where we are today. And, frankly, they’re probably going to need to know the meaning of the word “and.”