It’s just a bot!
On the Humanization of Voice Assistants
Voice user interfaces (VUIs) and conversation design have changed dramatically over the years. No longer do we long for the voice interaction with robots from science fiction. With the latest wave of AI speakers and handheld voice assistants, what once was a dream is now a reality. Engineers and designers have revolutionized the voice-enablement of … well, everything. While some AI enthusiasts would like to see more human-like conversations carried out with VUIs, others would like their interactions to be briefer because their primary use is utilitarian. Such was the case for participants in a 2019 study presented in the paper “What Makes a Good Conversation?” by Clark et al.
What Makes a Good Conversation? | Proceedings of the 2019 CHI Conference on Human Factors in…
Conversational agents promise conversational interaction but fail to deliver. Efforts often emulate functional rules…dl.acm.org
In this study, participants were interviewed on their general attitudes towards conversations with strangers, acquaintances, friends, and conversational agents. Users of VUIs tend to engage in “limited” and “delineated task-based conversations” when interacting with them. For human-human conversations, the researchers found that interactions could be categorized by their purpose and by a set of attributes. The most common purposes were Social and Transactional: folks were either aiming to socialize with others or they had a more goal-oriented, information-gathering approach. The attributes included Trustworthiness, Mutual Understanding, Common Ground, Active Listening, and Humor. As drivers of these interactions, participants emphasized understanding the intent and meaning behind what was being exchanged, trusting the person(s) involved in the exchange, a mutual willingness to participate in the conversation (active listening), a two-way dialogue (active participation), and humor.
Who cares. It’s just a bot, right?
How do these interaction purposes and attributes translate into VUI conversation design? The researchers found that the social expectations people bring to human-human conversations are absent in human-computer interactions. The detachment and hesitation a person shows before fully trusting a stranger or acquaintance matched how they would approach a conversation with a computerized voice system: they keep to a very basic, more detached dialogue exchange.
Participants in the study mostly described VUI conversations as transactional, lacking the attributes associated with human-human conversations. This divide affects the way developers design their interactive conversations. For some use cases like voice-enabled video games or learning platforms, a more human-like interaction could be beneficial, but for cases such as a simple search agent or a smart home app, the utilitarian approach to the design makes sense.
So which one do you go with? That’s the beauty of it: it’s up to you! It depends on your use case, your intended audience, and what you envision your voice app interaction sounding like. The more social, human-like features tend to show up in gaming and socializing conversational agents, while utilitarian applications tend to stick to a straightforward exchange. That being said, there are no hard and fast rules.
An issue often found with “straightforward” dialogue exchanges is a common one for all interface types: they aren’t flexible enough for the user. Perhaps a user has a different mental mapping of how a sentence needs to be phrased or how they should carry out a search task. If we as interaction designers fail to build in the flexibility to account for some of these differences in our dialogue model, we can easily introduce unfriendly experiences to our end users. Even a basic exchange can have a complicated dialogue model depending on the use case. What may be a simple phrase typed into a GUI’s input field might be a two-to-four-question exchange with a voice app.
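To make the point about one search phrase turning into several questions a little more concrete, here is a minimal sketch of slot filling in Python. Everything in it is an assumption made up for illustration: the restaurant-search scenario, the slot names, the keyword lists, and the helpers fill_slots and next_prompt do not come from the study or from any particular voice platform. The idea is only that the app accepts whatever details the user volunteers, in whatever phrasing, and asks a follow-up question for anything still missing.

```python
import re

# Hypothetical slots for an imagined restaurant-search voice app.
# Each slot has a pattern for the values it accepts and the follow-up
# question to ask when the user has not supplied it yet.
SLOTS = {
    "cuisine": {
        "pattern": r"\b(italian|thai|mexican)\b",
        "prompt": "What kind of food are you in the mood for?",
    },
    "area": {
        "pattern": r"\b(downtown|midtown|uptown)\b",
        "prompt": "Which part of town?",
    },
}


def fill_slots(utterance, filled=None):
    """Return a copy of `filled` updated with slot values found in `utterance`."""
    filled = dict(filled or {})
    text = utterance.lower()
    for name, spec in SLOTS.items():
        if name in filled:
            continue  # keep what the user already told us
        match = re.search(spec["pattern"], text)
        if match:
            filled[name] = match.group(1)
    return filled


def next_prompt(filled):
    """Return the question for the first missing slot, or None if nothing is missing."""
    for name, spec in SLOTS.items():
        if name not in filled:
            return spec["prompt"]
    return None


if __name__ == "__main__":
    # A user who gives only part of the request gets one follow-up question.
    state = fill_slots("Find me a Thai place")
    print(next_prompt(state))
    state = fill_slots("somewhere around downtown", state)
    print(state, next_prompt(state))
```

A user who says everything at once (“Italian food in midtown”) skips the follow-up questions entirely, while a user who gives one detail at a time is guided through the rest. A real voice app would lean on its platform’s language understanding rather than keyword patterns, but the shape of the dialogue model is similar.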
While we can’t think of every possible way an individual may ask a question or approach their voice interfaces, we can certainly help by providing some flexibility and guiding them toward the expected interaction flow. Whether your voice app calls for a more “human-like” interaction or not, staying true to your audience and keeping your dialogues flexible will get your app where it needs to be in order to keep your users coming back.