Voice Modalities Research Articles

As part of a multidisciplinary research and a PhD project to strengthen the connection between retired couples living at home, we imagined and designed Yōkobo. It is a robot at the crossroads of a sensitive approach and a robotic trend that bridges the gap between humans (Human-Robot-Human Interactions field). As a theoretical contribution, Yōkobo is at the intersection of various concepts: behavioral objects, robjects, weak robotics, and slow technology. Yōkobo is a trinket bowl placed in the entrance of homes. Its discreet presence expresses hospitality and celebrates small moments of everyday life, welcoming visitors and inhabitants of the house. The name comes from the contraction of “yōkoso” (welcome in Japanese) and “robot” (with French pronunciation). In addition to these functions, Yōkobo expresses the state of the home using data from connected IoT devices, combining various house parameters (such as temperature, air quality, etc.) to express the home's “mood” through its motions. Finally, Yōkobo used in tandem with house keys, can convey a trace, a message based on motion. And a trace is a memory of the partner's passage. Yōkobo is resolutely innovative and disruptive. It does not sit within the lineage of the general vision of what robots are and what they can do: it is an object intended to be unobtrusive, stemming from ambient computing, while having an ongoing subtle presence. It does not make sounds, unlike voice assistants and the trend for using voice modality interaction. It expresses its environment only through motion and light. to move away from home's companion robots and the biases they can generate through facial representation, Yōkobo has neither an anthropomorphic shape nor can talk. Yōkobo is intended to be made of natural materials such as ceramic, wood, or wool to break with the idea of plastic, disposable, and toy robots, and to improve its integration in everyday home life. as a slow technology product, understanding and integrating Yōkobo into one's life takes time and requires accepting not having a clear, repetitive, and instantaneous response to an action. Its contribution is not measured in terms of efficiency and utility; it is the sum of different experiences with the product over time that creates the object's meaning and value. Getting to know Yōkobo's expressive motions is continuous and progressive. Yōkobo is an object that is understood through perception and touches the poetic sensibility of its users. Yōkobo is a concept that puts people's relationships at the center. It does not impose itself to propose an exclusive Human-Object relationship. It reveals the presence of the other by expressing the last impermanent trace of the other's passage. It is an object of sensitive presence. This work is the result of interdisciplinary research between roboticists, designers, and ergonomists. The navigation (directions and overlay) of this pan.able demonstrates the design and engineering processes, as well as the interaction modalities.

Read full abstract

Cross-modal effects provide a model framework for investigating hierarchical inter-areal processing, particularly, under conditions where unimodal cortical areas receive contextual feedback from other modalities. Here, using complementary behavioral and brain imaging techniques, we investigated the functional networks participating in face and voice processing during gender perception, a high-level feature of voice and face perception. Within the framework of a signal detection decision model, Maximum likelihood conjoint measurement (MLCM) was used to estimate the contributions of the face and voice to gender comparisons between pairs of audio-visual stimuli in which the face and voice were independently modulated. Top–down contributions were varied by instructing participants to make judgments based on the gender of either the face, the voice or both modalities (N = 12 for each task). Estimated face and voice contributions to the judgments of the stimulus pairs were not independent; both contributed to all tasks, but their respective weights varied over a 40-fold range due to top–down influences. Models that best described the modal contributions required the inclusion of two different top–down interactions: (i) an interaction that depended on gender congruence across modalities (i.e., difference between face and voice modalities for each stimulus); (ii) an interaction that depended on the within modalities’ gender magnitude. The significance of these interactions was task dependent. Specifically, gender congruence interaction was significant for the face and voice tasks while the gender magnitude interaction was significant for the face and stimulus tasks. Subsequently, we used the same stimuli and related tasks in a functional magnetic resonance imaging (fMRI) paradigm (N = 12) to explore the neural correlates of these perceptual processes, analyzed with Dynamic Causal Modeling (DCM) and Bayesian Model Selection. Results revealed changes in effective connectivity between the unimodal Fusiform Face Area (FFA) and Temporal Voice Area (TVA) in a fashion that paralleled the face and voice behavioral interactions observed in the psychophysical data. These findings explore the role in perception of multiple unimodal parallel feedback pathways.

Read full abstract

Voice Modalities Research Articles

Related Topics

Articles published on Voice Modalities

Feature-level fusion of face and speech based multimodal biometric attendance system with liveness detection

Enhancing Biometric Security with Bimodal Deep Learning and Feature-level Fusion of Facial and Voice Data

Multimodal Biometric Authentication System Using of Autoencoders and Siamese Networks for Enhanced Security

Understanding is a Two-Way Street: User-Initiated Repair on Agent Responses and Hearing in Conversational Interfaces

Designing an age-friendly conversational AI agent for mobile banking: the effects of voice modality and lip movement

Design and Feasibility Analysis of a Smartphone-Based Digital Cognitive Assessment Study in the Framingham Heart Study.

Drivers' Comprehensive Emotion Recognition Based on HAM.

A Human-Centered and Adaptive Robotic System Using Deep Learning and Adaptive Predictive Controllers

P‐2.25: Research on Virtual Reality Field Based on Multimodal Emotion Recognition

Which components of famous people recognition are lateralized? A study of face, voice and name recognition disorders in patients with neoplastic or degenerative damage of the right or left anterior temporal lobes

Yōkobo

Is voice really persuasive? The influence of modality in virtual assistant interactions and two alternative explanations

Understanding Emerging Media: Voice, Agency, and Precarity in the Post-2011 Arab Mediasphere

Subverting the Electronic Language Textbook to Make it Relevant

The Role of Unimodal Feedback Pathways in Gender Perception During Activation of Voice and Face Areas.

Hypnosis as a Supplemental Treatment for and a Prevention of Muscle Tension Dysphonia

BIOMEX-DB: A Cognitive Audiovisual Dataset for Unimodal and Multimodal Biometric Systems

Neural Correlates of Esophageal Speech: An fMRI Pilot Study

Security in smart cities: A brief review of digital forensic schemes for biometric data

ДВУХКОМПОНЕНТНАЯ СЕТЕВАЯ МОДЕЛЬ В ТЕХНОЛОГИЯХ ГОЛОСОВОЙ ИДЕНТИФИКАЦИИ ЛИЧНОСТИ

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Voice Modalities Research Articles

Related Topics

Articles published on Voice Modalities

Feature-level fusion of face and speech based multimodal biometric attendance system with liveness detection

Enhancing Biometric Security with Bimodal Deep Learning and Feature-level Fusion of Facial and Voice Data

Multimodal Biometric Authentication System Using of Autoencoders and Siamese Networks for Enhanced Security

Understanding is a Two-Way Street: User-Initiated Repair on Agent Responses and Hearing in Conversational Interfaces

Designing an age-friendly conversational AI agent for mobile banking: the effects of voice modality and lip movement

Design and Feasibility Analysis of a Smartphone-Based Digital Cognitive Assessment Study in the Framingham Heart Study.

Drivers' Comprehensive Emotion Recognition Based on HAM.

A Human-Centered and Adaptive Robotic System Using Deep Learning and Adaptive Predictive Controllers

P‐2.25: Research on Virtual Reality Field Based on Multimodal Emotion Recognition

Which components of famous people recognition are lateralized? A study of face, voice and name recognition disorders in patients with neoplastic or degenerative damage of the right or left anterior temporal lobes

Yōkobo

Is voice really persuasive? The influence of modality in virtual assistant interactions and two alternative explanations

Understanding Emerging Media: Voice, Agency, and Precarity in the Post-2011 Arab Mediasphere

Subverting the Electronic Language Textbook to Make it Relevant

The Role of Unimodal Feedback Pathways in Gender Perception During Activation of Voice and Face Areas.

Hypnosis as a Supplemental Treatment for and a Prevention of Muscle Tension Dysphonia

BIOMEX-DB: A Cognitive Audiovisual Dataset for Unimodal and Multimodal Biometric Systems

Neural Correlates of Esophageal Speech: An fMRI Pilot Study

Security in smart cities: A brief review of digital forensic schemes for biometric data

ДВУХКОМПОНЕНТНАЯ СЕТЕВАЯ МОДЕЛЬ В ТЕХНОЛОГИЯХ ГОЛОСОВОЙ ИДЕНТИФИКАЦИИ ЛИЧНОСТИ