Privacy Preserving Personal Assistant with On-Device Diarization and Spoken Dialogue System for Home and beyond

Gérard Chollet, Jérôme Boudy, Fathy Yassa, Mossaab Hariz, Hugues Sansen, Yannis Tevissen, Christophe Lohr

doi:10.54941/ahfe1004577

Abstract

In the age of personal voice assistants, we have witnessed the proliferation of "personal" vocal companions across smartphones, smart speakers, and other smart devices. Yet the question arises: are these virtual assistants genuinely "personal"? The answer may surprise you. Most of these digital companions cannot remember past interactions or truly understand who you are. They rely heavily on an internet connection to process your spoken words on remote servers. Even though users give informed consent for these interactions, concerns linger about potential misuse of speech data, such as invasive targeted advertising. The advent of high-performance co-processors, such as GPUs and TPUs, in modern smartphones has made cloud-based speech processing largely unnecessary, paving the way for local, on-device solutions.

Personal assistants for the elderly serve a unique role and require functionalities distinct from those aimed at digital natives. Notably, they must excel at aiding memory recall during conversations, which makes them invaluable in scenarios like medical examinations. By documenting and contextualizing exchanges during medical visits through diarization, a personal assistant can empower individuals or caregivers to revisit and understand the details at their convenience. Such use requires operation without an internet connection, which ensures privacy during these sensitive interactions.

The e-ViTA project has successfully developed a versatile conversational application with a rich set of features:

• Local use on both Android and iOS smartphones, no internet connection required.
• The capability to remember previous interactions.
• Speaker recognition for personalized experiences.
• Local processing for automatic speech recognition, spoken language understanding, dialogue management, and speech synthesis.
• Secure web searches after anonymizing requests.
• The ability to handle telephone calls, read emails, SMS, and messages.
• Text preparation through voice dictation.
• Assistance with daily activities and acting as a companion or butler.
• Facilitating inter-lingual communication via integration with TalkMondo, among other functions.

Unlike facial recognition, voice recognition and speaker differentiation provide a less invasive and more cost-effective solution. Because they rely on the smartphone's microphone, they do not need the camera, which would require complex mechanisms to track the speaker's position.

This paper highlights the critical importance of speaker diarization, which allows the system to preserve users' conversations while ensuring the highest level of privacy. Additionally, when deployed on embedded devices, this technology can contribute to monitoring the well-being of the elderly, offering vital contextual information enriched by domotics sensors (motion, intrusion, door or window sensors), actimetry sensors from smartphones or smartwatches, or weather stations. Fusing these data streams enables more personalized and optimized assistance and services, delivered through dialogues adapted to the user, for example an elderly person, based on their context and activity.

In conclusion, the ability of a system to generate personalized dialogue is pivotal in the realm of personal voice assistants. With secure, local processing and advanced features such as speaker differentiation and diarization, enriched by sensor data fusion, we can ensure that virtual companions truly cater to the individual needs of users without compromising their privacy or data security. This marks a significant step towards a more "personal" experience with our digital assistants.
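To make the role of diarization in documenting a medical visit more concrete, here is a minimal sketch of one common way to combine diarization output with speech recognition output: each recognized word is attributed to the speaker turn it overlaps most in time. This is an illustration only, not the e-ViTA implementation, which the abstract does not describe. The names `Turn`, `Word`, and `attribute_words` are hypothetical, and the timestamps are made-up sample data standing in for what an on-device diarizer and recognizer might produce.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarized speaker turn (hypothetical structure)."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Word:
    """A recognized word with timestamps (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_words(words, turns):
    """Group words into (speaker, utterance) pairs by maximum time overlap.

    Words that overlap no turn are labeled "unknown".
    """
    transcript = []
    for word in words:
        best = max(
            turns,
            key=lambda t: overlap(word.start, word.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(word.start, word.end, best.start, best.end) == 0.0:
            speaker = "unknown"
        else:
            speaker = best.speaker
        # Extend the current utterance if the speaker has not changed.
        if transcript and transcript[-1][0] == speaker:
            transcript[-1] = (speaker, transcript[-1][1] + " " + word.text)
        else:
            transcript.append((speaker, word.text))
    return transcript


# Made-up sample data for illustration.
turns = [Turn("doctor", 0.0, 2.0), Turn("patient", 2.0, 4.0)]
words = [
    Word("Take", 0.1, 0.4),
    Word("one", 0.5, 0.8),
    Word("daily", 0.9, 1.5),
    Word("With", 2.1, 2.4),
    Word("food?", 2.5, 3.0),
]

for speaker, utterance in attribute_words(words, turns):
    print(f"{speaker}: {utterance}")
```

Because everything here runs on plain in-memory data, a transcript of this kind could be built and stored entirely on the device, which is in keeping with the local-processing goal stated above.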
