Purpose
This paper aims to investigate user voice-switching behavior in voice assistants (VAs), embodiments and perceived trust in information accuracy, usefulness and intelligence. The authors addressed four research questions:
RQ1: What is the nature of users’ voice-switching behavior in VAs?
RQ2: What are user preferences for embodied voice interfaces (EVIs), and do their preferred EVIs influence their decision to switch the voice on their VAs?
RQ3: What are the users’ perceptions of their VAs concerning: a. information accuracy, b. usefulness, c. intelligence and d. the most important characteristics they must possess?
RQ4: Do users prefer their voice interface to match their characteristics (age, gender, accent and race/ethnicity)?

Design/methodology/approach
The authors used a 52-question survey questionnaire to collect quantitative and qualitative data. The population was undergraduate students (freshmen and sophomores) at a research university in the USA. The students were enrolled in two required courses that offered a research participation assignment for credit. Students must register for research participation credits in the SONA Research Participation System (www.sona-systems.com/platform/research-management/). Registered students cannot be invited or sampled to participate in a research study. A total of 1,700 students were enrolled across both courses. After the survey’s URL was posted in SONA, the authors received 632 responses, of which 150 were complete and valid.

Findings
Forty-three percent of participants switched the voice interface in their VAs. They preferred American and British accents but trusted the latter. The British accent with a male voice was more trusted than the American accent with a female voice. Voice-switching decisions varied between the most and least preferred EVIs. Participants preferred EVIs that matched their characteristics. Most participants trusted their VAs’ information accuracy because the VAs used the internet to find information, a rationale that reflects inadequate mental models. Lack of trust was attributed to VAs misunderstanding requests and failing to respond accurately. A significant correlation was found between the participants’ perceived intelligence of their VAs and their trust in information accuracy.

Research limitations/implications
Due to the wide variability in the data (e.g. 84% White, 6% Asian and 6% Black), the authors did not perform a statistical test of the association between the selected EVIs and participants’ races or ethnicities. The self-reported survey questionnaire may be prone to inaccuracy. The participants’ interest in earning research credit for participating in this study through SONA is a potential source of bias. The EVIs the authors used as embodiments are limited in their representation of people from diverse backgrounds, races, ethnicities, ages and genders. However, they could serve as examples for building prototypes to test in VAs.

Practical implications
Educators and information professionals should lead the way in offering artificial intelligence (AI) literacy programs to enable young adults to form more adequate mental models of VAs and to support their learning and interactions. VA designers should address the failures and other issues the participants experienced in VAs to minimize frustration. They should also train machine learning models on large data sets of complex queries to improve success rates. Furthermore, they should consider augmenting VAs’ personification with EVIs to enrich voice interactions and enhance personalization. Researchers should use a mixed research method with data triangulation instead of a survey alone.

Social implications
There is a dire need to teach young adults AI literacy skills to enable them to build adequate mental models of VAs. Failures in VAs could affect users’ willingness to use them in the future. VAs can be effective teaching and learning tools, supporting students’ autonomous and personalized learning. Integrating EVIs with diverse characteristics could advance inclusivity in designing VAs and support personalization beyond language, accent and gender.

Originality/value
This study advances research on user voice-switching behavior in VAs, a topic that has hardly been investigated in VA research. It brings attention to users’ experiential learning and the need for exposure to AI literacy to enable them to form adequate mental models of VAs. This study contributes to research on personifying VAs through EVIs with diverse characteristics to visualize voice interactions. The reasons participants gave for not switching the voice interface, namely satisfaction with the current voice or a lack of knowledge of this feature, did not support the status quo theory. Incorporating satisfaction and lack of knowledge as new factors could advance this theory. Switching the voice interface to avoid visualizing the least preferred EVIs in VAs is a new theme emerging from this study. Users’ trust in VAs’ information accuracy is intertwined with perceived intelligence and usefulness, but perceived intelligence is the strongest factor influencing trust.