Punchcut Labs research provides a snapshot of users’ experience with voice agents and points to the future of voice user interface.
Technology was supposed to set us free but the way most of us interact with our devices — keyboards, touchscreens, buttons — keeps us tethered and tied to a paradigm that seems overdue for disruption.
Enter voice. Our silent relationship with technology is changing with the advent of voice agents. Already, voice has moved from fringe technology to foundational technology.
A new generation of voice-activated products has quickly taken a prominent place in our lives, whether it’s asking Siri where the nearest In-N-Out Burger is located or asking Alexa for info on your upcoming flight. Voice interface is quickly evolving from a command-and-query mode into something more conversational, nuanced and personalized, which holds the promise of a companion-like relationship in the future (although we’re not there yet).
How is voice interface changing the way we interact with technology? How should companies go about integrating voice user experiences into products and services?
As a design consultancy, Punchcut views technological innovation through the lens of how it will impact the design of products, interfaces and UX. Voice is more than an add-on or a novelty—when approached holistically as part of the overall user experience it can enhance engagement and transform even everyday appliances and devices.
Recently, our team launched a research study to explore American consumers’ impressions of voice technology and how they are using it right now. We conducted an online national survey among 486 participants (ages 18–75), all of whom have used a voice assistant (Apple’s Siri, Google’s Assistant, Microsoft’s Cortana, Amazon’s Alexa, Samsung’s Bixby, or an in-car voice agent). Through quantitative methods, we also captured user attitudes, behaviors, and satisfaction levels with voice in its current state. These responses led us to several conclusions regarding the state of voice.
01
Voice is no longer a novelty — it’s becoming a mainstream technology
Our survey respondents were about evenly split between men and women, representing multiple age groups and areas all across the country. They were invited to complete a survey about their experience with voice UI, which makes them, as a group, slightly more likely than the greater population to have experience with voice UI. However, responses indicate an increasing awareness and usage of voice UI. In answer to our question “Have you tried using a voice assistant before?” 77% of respondents replied that they had. Siri is by far the voice UI with the highest rate of exposure. 75% of respondents answered that they have tried Siri at least once, followed by OK Google (34%), Alexa (26%), and Cortana (12%).
Added to this high percentage of awareness is a sense that voice is becoming a more regular and satisfying interaction. A majority of users (51%) state that they use a voice UI regularly, either daily or weekly. By contrast, 19% of respondents said they “never or almost never” use a voice assistant. A majority (60%) of respondents also said that they are “very or somewhat satisfied” with using voice “as of today,” with only 3% of respondents saying that they are “very dissatisfied” with the experience.
02
Voice is location-specific
When we probed further, we found that users strongly associate using voice interfaces with specific places. We asked users to name the location where they use voice features the most, and the leading response was “in the car” with 49% of respondents. In second place was “at home” with 35% of respondents. In third place, with a much smaller percentage, was “on the go” with 13% of respondents. “At work” received only slightly more than 1% of responses.
This ranking correlates with the nature of these spaces. The choices in this survey divide neatly between private spaces where the user is most likely to be alone (“in the car,” “at home”) and more public spaces such as “on the go” and “at work.” This is not surprising given the social norm against using voice with a device in public (think of how we feel when seated next to someone engaged in a conversation on their cell phone). The assumption that users feel more comfortable interacting with a voice agent when alone versus in a social setting appears to be confirmed by responses to the following questions:
Users strongly stated that they were most comfortable using voice interfaces when they were alone. However, when users were asked in which environment they were most likely to use voice, “in the car” and “at home” were still far and away the most popular choices, regardless of whether the user was alone or not.
These responses reinforce our contention that people feel less comfortable using voice assistants in public spaces, and more comfortable in private spaces. In private spaces users felt less of a distinction between being “alone” or “in a social setting.” We also feel that for some activities within the car and home contexts users will engage a voice assistant whether they are alone or not.
03
Users value voice when they need to be hands free
As we’ve pointed out, users feel that there are certain environments where they are more likely to value a voice assistant. Next, we wanted to probe the specific activities within those environments that seemed best suited for voice.
Our first point of inquiry was how users ranked voice compared to other types of inputs. According to the users we surveyed, physical input is still strongly preferred over voice. 68% of users selected some kind of physical input (touching a screen, pressing a key or clicking a button) versus 32% who indicated they preferred using a voice assistant.
A key value for voice interaction appears to be that it is hands-free — 68% said they liked the hands-free aspect of using a voice assistant.
This seems to indicate that users perceive voice as an additive choice: a convenient new way to interact with their devices but not necessarily a replacement for established inputs. Respondents choose voice as an input method when their hands are otherwise occupied, or when it would be awkward to navigate to a manual input. When we asked users “during which type of activity do you use voice?” driving was ranked by our respondents as the most popular activity. Respondents selected it at more than twice the rate of the second-ranking activity, messaging. Driving is an activity that requires both hands and eyes to be occupied, and it is often a private activity. Both of these are conditions in which our respondents said that they valued voice. This is backed up by the data from other questions on the survey, such as the 30% of our respondents who indicated that they had experience with in-car voice assistance.
04
Users associate voice with specific tasks
The use of voice is closely associated with specific tasks. “At home” was chosen by 35% of our respondents as the place where they use voice features the most. Our survey probed further to determine which features and devices users value for voice in the home. We asked “when using voice in your home, which of the following features would be of most interest?” Only four features of the seven that we surveyed received more than 10% of the total responses:
When using voice interfaces in your home, which of the following features would be of most interest?
• Playing music – 30%
• Making a grocery list – 25%
• Turning on a household appliance – 18%
• Cooking guidance – 13%
• All others – 8%
When we probed around which devices in the home our respondents valued, a few trends emerged. The first was that “none” ranked in the top three responses, whether respondents were ranking household appliances, entertainment devices or smart home devices.
“None” was far and away the most popular response when we polled respondents on whether they valued pairing voice with household appliances, despite the fact that respondents said that they found some value in the feature of turning on a household appliance with voice. When respondents were considering the value of voice as a feature of entertainment and smart home devices, “none” still ranked in third place.
The high rank of “none” when respondents were valuing voice paired with certain devices reflects users’ current experiences with device voice agents and the tasks that they can perform with these devices. Within all three categories, the devices that appeared to be valued most were:
• TV
• Mobile phone
• Lighting
• Thermostat
The popularity of these devices can be linked to Alexa and Siri’s established presence on the TV, Siri’s on the mobile phone, and Alexa and Google Now’s increasing role in controlling smart things in the home.
Knowing that respondents want voice functionality in the kitchen more than any other space, we might expect appliances typically found in the kitchen to be valued for voice, but that is not the case. We can make sense of these two apparently conflicting responses when we ask the following:
- Is this a space where the user is more likely to want to interact with his or her digital device hands free?
- Can the user perform the most valued tasks for voice – playing music, making a grocery list, turning on a household appliance, or cooking guidance – in this space?
This also helps explain why the living room and bedroom rank high: a stationary user within these spaces (on the couch or in bed, for instance) may want to use voice to interact with devices that are physically far away.
05
Users see voice as a tool, not a companion (yet)
We saw a strong divergence: respondents value voice UI for executing tasks correctly but don’t yet value voice agents that can engage them in conversation.
Users strongly agreed with the statement “A voice assistant that can obey simple commands is important to me,” but responses were more evenly divided between Strongly Agree/Agree, Neutral and Disagree/Strongly Disagree when presented with the statement “A voice assistant that can carry on a conversation is important to me.”
This preference for accuracy over conversation was backed up by data from other questions. Respondents ranked “Social Conversation” as the feature of least interest when using voice in the home (4%), and “carry on a conversation” was ranked second to last when respondents were asked what they wish voice did better. Improvements in accuracy are what respondents want to see in their voice user experiences: 63% of respondents chose “understand what I am saying” and 38% selected “give me the results I ask for.”
06
Users want voice UI to anticipate their needs, but they’re concerned about privacy
While respondents prioritize accuracy in voice agents, they remain open to the idea of voice agents anticipating their needs. In two questions, respondents leaned strongly toward agreeing with statements that depicted voice agents “automatically” anticipating needs or communicating with other devices or appliances.
However, this desire for automatic and predictive features is tempered by a wariness over privacy. Survey participants stated that they were uncomfortable with the idea of an “always on” device. Only 30% stated that they were comfortable with a voice assistant that is always on and 72% of respondents stated that they would prefer to manually turn on the voice assistant over having it always listening.
Our survey did not provide details of a specific feature or company when asking this question, and it may be that these responses adjust if users are presented with the right value proposition, but they highlight the fact that addressing users’ privacy concerns should be a part of a well-developed voice user interface (VUI) strategy.
Conclusion
Voice user interface (VUI) represents a true leap forward in how we interact with technology. The human feel of a well-designed agent, complete with natural sounding voices and the nuances of language, creates the presence of a living intelligence.
Our survey results clearly show that voice interaction has moved from novelty to the mainstream. The widespread adoption of Siri, Alexa and other voice agents has made users comfortable with voice, especially in contexts that are private and where the benefits of hands-free interaction outweigh a preference for manual inputs.
Taking voice to the next stage — where users engage in conversation and value the voice agent as a companion — will require both time and sophisticated holistic design to overcome built-in resistance. Concerns about privacy and reluctance to deploy voice in public contexts will require creating compelling use cases and a concerted effort to reassure users.
This survey represents a snapshot that captures where voice is today, but it also hints at the hurdles designers will need to overcome to evolve voice user experience from its current mode as a digital servant to a true companion. Additional voice user experience (VUX) insights, along with strategic voice user interface (VUI) design principles, will be critical to ensure the most intuitive and valuable experiences in the future. The survey is part of ongoing qualitative and quantitative research at Punchcut that guides our design practice and helps us predict how the user experience will be enhanced by technologies as they emerge.
A Punchcut Perspective
Contributors: Joy Wong Daniels, Jodi Burke, Reggie Wirjadi, Ken Olewiler