Intelligent assistants like Alexa, Siri, and Google Assistant perform tiny miracles like hands-free Google searches, turning on the lights, checking our calendars and the weather, setting timers in the kitchen, and even making calls. Parents love that kids can play music and find information on their own without touching mom or dad’s phone. But until intelligent assistants can go beyond these narrow tasks, their value will remain limited.

Voice-driven user interfaces offer people great convenience, but three major barriers keep intelligent assistants from achieving maturity in the market: inadequate discovery methods, inconsistent experiences across devices and apps, and an absence of tailored recommendations.


Inadequate discovery

Discovery means two things: finding out which skills and services are available through the intelligent assistant, and finding out how to use them.

A voice-driven interface means that you can’t see a list of services the assistant can access. This is a big change – people are accustomed to using devices with graphical interfaces where they can scroll to access their most used apps and discover new ones.

Most assistants attempt to solve this problem by pairing the audio experience with an app or website that lists available skills. In our opinion, this is a half-baked solution – it puts more cognitive load on the customer and limits the usability and capacity of the assistant. The problem is especially acute with Alexa, which has the largest catalog of offerings, at over 15K Skills and growing rapidly. So many choices make it difficult for customers to learn about important new Skills, such as iCloud Calendar and Calling, when they become available.

The Echo Show attempts to solve the discovery problem by highlighting new Skills on its screen – though that real estate could easily turn into ad space. It’s a tough problem without a silver-bullet solution, one that app stores have struggled with for years.

Once you know a Skill exists, how do you learn to use it? By now, most of us have been trained by Siri to talk like a robot (“set timer four minutes”) because we learned early on that talking to it as if it were intelligent would result in, “I don’t know how to help with that.” Making a phone call or sending a text is now an easy ask, but many intelligent assistants still struggle with natural language syntax. People are required to adapt to the input needs of the assistants rather than the other way around.

This is highly evident with Alexa, which needs its commands in a standardized form. For example, even if Fitbit is the only activity Skill you enable on the platform, you can’t simply ask, “Alexa – how many steps have I taken today?” Instead, you must first invoke the app itself and then make the query, “Alexa ask Fitbit how many steps I’ve taken.” This setup puts the onus on customers to know and remember the required syntax of each query for each app within the device.
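
To make the difference concrete, here is a minimal sketch in Python of the two behaviors described above. It is not Alexa's actual routing logic or API: the `ENABLED_SKILLS` registry, the sample phrase, and both function names are assumptions invented for illustration. The strict router only accepts the "ask <skill> ..." form, while the lenient one falls back to the single enabled Skill that can plausibly answer.

```python
import re

# Hypothetical registry (an assumption, not Alexa's data model):
# invocation name -> phrases the Skill is able to answer.
ENABLED_SKILLS = {
    "fitbit": ["how many steps"],
}

# Matches utterances of the form "Alexa ask <skill> <query>".
EXPLICIT = re.compile(r"^alexa[\s,]*ask (?P<skill>\w+) (?P<query>.+)$", re.IGNORECASE)


def route_strict(utterance):
    """Return (skill, query) only when the utterance names the Skill explicitly."""
    match = EXPLICIT.match(utterance.strip())
    if match is None:
        return None
    skill = match.group("skill").lower()
    if skill not in ENABLED_SKILLS:
        return None
    return skill, match.group("query")


def route_lenient(utterance):
    """Try the explicit form first, then fall back to the one Skill that fits."""
    explicit = route_strict(utterance)
    if explicit is not None:
        return explicit
    text = utterance.lower()
    candidates = [
        skill
        for skill, phrases in ENABLED_SKILLS.items()
        if any(phrase in text for phrase in phrases)
    ]
    # Only guess when exactly one enabled Skill matches; otherwise stay silent.
    if len(candidates) == 1:
        return candidates[0], utterance
    return None
```

With this sketch, `route_strict` returns nothing for “Alexa, how many steps have I taken today?”, while `route_lenient` hands the same sentence to the hypothetical Fitbit entry because it is the only enabled Skill that matches. The fallback shifts the burden of remembering syntax from the customer to the assistant.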

Experience inconsistency

Another perplexing issue is that intelligent assistants do not have the same capabilities across devices. When we ask Google Assistant to send an email on the Pixel phone, it works. But not so on Google Home. Not only is this dead end a confusing and frustrating experience, it adds to the cognitive load required of users who now need to keep track of which skills work on which devices.

Each of the major personal assistants is strongly associated with a name (Alexa, Siri, Cortana), a gender, and even a personality: they tell jokes, answer questions with a cheeky tone, and have a “party mode.” We quickly get into the habit of talking to a person, and a person would give the same answer to a basic question whether we ask them in the living room or on a hiking trail. So it’s maddening to realize that Siri on the iPhone can suggest the best route to get to work, but Siri on Apple TV has no idea. For intelligent assistants to live up to their promise, they need to offer the same skills across devices.

Lack of tailored recommendations

Google collects information about us every time we use it – restaurants we frequent, reviews we write online, and photos we take and post on social media. It doesn’t seem like much of a stretch to expect tailored suggestions from Google Assistant. In our opinion, for intelligent assistants to be truly revolutionary, they need to augment facts with assumptions to present us with tailored recommendations.

If we visit a new city, we want Google Assistant not only to show us a list of coffee shops, but to curate that list based on reviews and photos of the cafes at home that we visit frequently or review favorably. These “probabilistic assertions” generated by the intelligent assistant may not be perfect, but we think the risk is better than sticking with bare facts untethered from our preferences.
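
As a rough illustration of what such a “probabilistic assertion” could look like, the Python sketch below ranks cafes in a new city by how closely they resemble the ones we frequent at home. Everything in it is an assumption made for the example: the tag-based data format, the visits-times-rating weighting, and the function names. It says nothing about how Google Assistant actually works.

```python
from collections import Counter


def preference_profile(home_cafes):
    """Weight each tag by how often we visit, and how well we rate, cafes that have it."""
    profile = Counter()
    for cafe in home_cafes:
        weight = cafe["visits"] * cafe["rating"]
        for tag in cafe["tags"]:
            profile[tag] += weight
    return profile


def rank_candidates(candidates, profile):
    """Sort cafes in the new city by their overlap with the home profile."""
    total = sum(profile.values()) or 1  # avoid dividing by zero for an empty profile

    def score(cafe):
        # Counter returns 0 for tags we have never encountered at home.
        return sum(profile[tag] for tag in cafe["tags"]) / total

    return sorted(candidates, key=score, reverse=True)


# Hypothetical data: what the assistant might already know about us...
home = [
    {"name": "Corner Roasters", "tags": ["pour-over", "quiet"], "visits": 10, "rating": 5},
    {"name": "Bakehouse", "tags": ["quiet", "pastries"], "visits": 2, "rating": 4},
]
# ...and what it finds in the city we are visiting.
new_city = [
    {"name": "A", "tags": ["drive-through"]},
    {"name": "B", "tags": ["quiet", "pour-over"]},
    {"name": "C", "tags": ["pastries"]},
]

ranked = rank_candidates(new_city, preference_profile(home))
print([cafe["name"] for cafe in ranked])  # ['B', 'C', 'A']
```

The ranking is a guess rather than a fact, and it can be wrong. Even so, a list ordered by a simple heuristic like this one uses data the assistant already holds instead of ignoring it.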

This is a huge gap in intelligent assistants today, made worse by the fact that Google and Amazon already use machine learning techniques to show us targeted ads “inspired by browsing history.” If Amazon knows that we like blue suede shoes and we ask Alexa about footwear, the results should at least take our preferences into consideration.

It’s early days in the intelligent assistant market. Every player is still finding its niche and figuring out how to make the largest impact. We’re excited to continue investigating the space and to see how each assistant improves discovery, creates a consistent experience for customers, and taps into the wealth of data it already has about us to personalize the information it delivers.