Apple buys a lot of companies over the course of a year, with only a couple of them rising to the level of intriguing news. Last year’s purchase of Intel’s smartphone modem business certainly qualifies, as does
the 2018 acquisition of Shazam, but for the most part, Apple scoops up companies we’ve never heard of for reasons we’ll never know.
Its most recent acquisition might be different. The company, Xnor.ai, might not be one you’ve ever heard of, but it’s hardly unknown. Since last summer, the Seattle-based startup’s tech has been the brains behind the popular Wyze cam’s marquee feature: people detection. Simply put, it allowed the $20 camera to distinguish between faces, pets, and dust, vastly improving its abilities and putting it on a somewhat level playing field with the far-more-expensive Ring and Nest cams of the world.
But it’s not just that Xnor.ai’s engine worked on a budget cam, it’s how it worked. Not only did it vastly improve the capability of the pint-sized recorder, but it also did it with privacy in mind. Using something called Edge AI, Xnor.ai was able to run its detection algorithms on the camera itself, meaning it didn’t need to transmit images to a faraway cloud.
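To make the Edge AI idea concrete, here’s a toy sketch (my own invention, not Xnor.ai’s actual API or model): inference happens where the data lives, and only the result ever leaves the device.

```python
# Hypothetical sketch of on-device ("edge") inference. The "model" here is
# a stand-in threshold check, not a real neural network.

def detect_person(frame: list[float]) -> bool:
    """Stand-in for an on-device model: a tiny threshold 'classifier'."""
    return sum(frame) / len(frame) > 0.5  # pretend this is a neural net

def on_new_frame(frame: list[float]) -> dict:
    """Everything pixel-shaped stays local; only a boolean event is shared."""
    event = {"person_detected": detect_person(frame)}
    # Note: `frame` is never serialized or uploaded -- that's the privacy win.
    return event

print(on_new_frame([0.9, 0.8, 0.7]))  # {'person_detected': True}
```

The design point is what *doesn’t* happen: no image ever crosses the network, so there’s nothing for a cloud provider to store, mine, or leak.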
That cuts to the heart of Apple’s privacy argument. We’ve long suspected that the reason Siri lags Google Assistant and Amazon Alexa is that Apple doesn’t collect the same kind of information those companies do and is thus at a disadvantage. Google and Amazon may offer the ability to toggle privacy settings, but their core business model relies on data collection. It’s easier to improve AI processing when you have a mountain of data to work with, especially when you’re dealing with millions of users. But maybe it doesn’t have to be that way.
Siri safe and sound
That’s where Xnor.ai comes in, and likely why Apple deemed it worthy of several million dollars. I don’t think Siri’s development (or lack thereof) is the result of malaise or a lack of focus from Apple, but rather the capabilities of the AI engine. Apple wants to process as much as it can on the device, but the reality is that it’s just not possible on Siri’s scale, at least not without a little help.
It’s not for a lack of processing power. Apple’s A-series Bionic chips are certainly capable. The A13 on the iPhone 11 has both a faster-than-ever Neural Engine and a set of machine learning accelerators that handle more tasks than ever, but it’s still limited to practical, device-specific applications, like battery and power management and graphics acceleration.
But Xnor.ai’s Edge AI engine could be the thing that brings everything together. It’s unlikely that we’ll see any fruits from Apple’s purchase in the iPhone 12 or even the iPhone 13, but Apple’s incredible silicon advancements, coupled with the kind of on-device AI processing that Xnor.ai brings, could boost Siri’s capabilities in a big way. By embedding Edge AI into Apple’s own chip via the Neural Engine or a new co-processor, Siri could be faster and far more capable, learning from what you do and prioritizing tasks in kind. And it could all work offline, tapping into the tremendous power of Apple’s system-on-chip and doing the work of a powerful cloud right on the device.
Xnor.ai estimates that Edge AI runs 10 times as fast while using 15 times less memory than cloud-based systems, and a responsive assistant dedicated to each specific phone could finally let Apple build a voice recognition system with near-perfect accuracy. We’ve been waiting years for Siri to be able to do basic things like distinguishing between different users’ voices, and Edge AI could bring a greater understanding of each user’s particular cadence. After all, we already know that Xnor.ai’s Wyze AI was able to distinguish people from pets, so using it to differentiate voices shouldn’t be all that difficult. That alone would go a long way toward closing the gap between Siri and Google Assistant and Amazon Alexa. By building a powerful AI engine directly on our phones, Apple could do the kinds of things with Siri we want it to do without compromising our privacy.
That same engine could be applied to speech patterns. Siri dictation isn’t bad at all, but saying “period” and “comma” gets tedious. Edge AI could recognize our vocal patterns, so when we pause a certain way it adds a period, or if we change our inflection it adds a question mark.
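In spirit, that punctuation logic could look something like this toy heuristic (entirely hypothetical, not anything Apple or Xnor.ai has shipped): each dictated phrase arrives with a trailing pause length and a flag for whether pitch rose at the end.

```python
# Hypothetical pause/inflection-based punctuation heuristic.

def punctuate(phrase: str, pause_ms: int, pitch_rose: bool) -> str:
    if pitch_rose:
        return phrase + "?"   # rising inflection -> question mark
    if pause_ms >= 600:
        return phrase + "."   # long pause -> end of sentence
    if pause_ms >= 250:
        return phrase + ","   # short pause -> clause break
    return phrase             # no meaningful pause -> keep dictating

print(punctuate("want to grab lunch", 300, True))   # want to grab lunch?
print(punctuate("see you at noon", 700, False))     # see you at noon.
```

A real system would learn those thresholds per user instead of hard-coding them, which is exactly the kind of personalization that’s only palatable if the learning stays on the device.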
Let’s go a step further. Say you’re texting with a friend and they ask about seeing a movie. That could trigger Siri to quietly suggest upcoming showtimes inside your conversation. Or if you copy a link in Safari, a smart suggestion could instantly present a series of apps before you even press the share button. These are the kinds of things Apple would never do in the cloud—that whole
what-happens-on-your-iPhone-stays-on-your-iPhone thing—but by using Edge AI, Apple could bring those kinds of interactions to the iPhone itself, which opens up Siri to a new world of capabilities.
And if it’s learning on the device, then it could know what app we’re in and respond accordingly. So if we’re in Photos, we could say, “Share this with my wife” and it wouldn’t need any extra clarification. Or if we’re reading a news article in Safari, we could say, “Tell me more about this,” and it would make the appropriate search. Or maybe an automatic routine could be suggested based on our app habits.
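Conceptually, that’s an intent router keyed on the foreground app. A minimal sketch, with hypothetical names and commands of my own choosing, might look like this:

```python
# Toy context-aware intent router: the same vague command resolves
# differently depending on which app is in the foreground.

def handle(command: str, foreground_app: str) -> str:
    if command == "share this":
        if foreground_app == "Photos":
            return "share current photo"
        if foreground_app == "Safari":
            return "share current URL"
    if command == "tell me more" and foreground_app == "Safari":
        return "search web for current article topic"
    return "ask user to clarify"

print(handle("share this", "Photos"))  # share current photo
```

Because the app context never has to leave the phone, the assistant gets smarter without the usage data going anywhere.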
Bottom line: there’s a lot Siri doesn’t quite know how to do, and Xnor.ai’s Edge AI engine could help teach it, all while keeping Apple’s privacy promise intact. Unfortunately for Wyze cam owners,
Xnor’s Edge AI system has been yanked from the camera, but their loss could be Siri’s gain.
Michael Simon has been covering Apple since the iPod was the iWalk. His obsession with technology goes back to his first PC—the IBM ThinkPad with the lift-up keyboard for swapping out the drive. He's still waiting for that to come back in style.