Abstract

Developments across the machine intelligence ecosystem, from sensing and computing to data sciences, are enabling new possibilities in advancing speech science and the creation of speech-centric societal technologies. A critical aspect of this endeavor requires addressing two intertwined challenges: illuminating the rich diversity in speech across people and contexts, and creating inclusive, trustworthy technologies that work for everyone. This talk will highlight three specific areas of advancement. The first is the capturing and modeling of the human vocal instrument during speaking, along with selected technological and clinical applications that leverage this capability (Hagedorn et al., 2019). The second relates to speech-based informatics tools for supporting screening, diagnosis, and treatment of behavioral and mental health conditions with broad access and scale. For example, remote multimodal sensing of speech cues can enable new ways of screening and tracking behaviors (e.g., stress), supporting progression to treatment (e.g., for depression) and offering just-in-time support (Bone et al., 2017). The final domain highlights the use of speech machine intelligence tools to analyze media, including film, television shows, news, and advertisements. These tools provide insights into the representation and portrayal of individuals along dimensions of inclusion such as gender, age, ability, and other attributes (Somandepalli et al., 2021).
