Speaker Change Research Articles

Introduction: Habituation and novelty detection are two fundamental and widely studied neurocognitive processes. Whilst neural responses to repetitive and novel sensory input have been well documented across a range of neuroimaging modalities, it is not yet fully understood how well these different modalities are able to describe consistent neural response patterns. This is particularly true for infants and young children, as different assessment modalities might show differential sensitivity to underlying neural processes across age. Thus far, many neurodevelopmental studies have been limited in sample size, longitudinal scope or breadth of measures employed, impeding investigations of how well common developmental trends can be captured via different methods.

Method: This study assessed habituation and novelty detection in N = 204 infants using EEG and fNIRS, measured in two separate paradigms but within the same study visit, at 1, 5 and 18 months of age in an infant cohort in rural Gambia. EEG was acquired during an auditory oddball paradigm in which infants were presented with Frequent, Infrequent and Trial Unique sounds. In the fNIRS paradigm, infants were familiarised to a sentence of infant-directed speech, and novelty detection was assessed via a change in speaker. Indices for habituation and novelty detection were extracted for both EEG and fNIRS.

Results: We found evidence for weak to medium positive correlations between responses on the fNIRS and the EEG paradigms for indices of both habituation and novelty detection at most age points. Habituation indices correlated across modalities at 1 month and 5 months but not at 18 months of age, and novelty responses were significantly correlated at 5 months and 18 months, but not at 1 month. Infants who showed robust habituation responses also showed robust novelty responses across both assessment modalities.

Discussion: This study is the first to examine concurrent correlations across two neuroimaging modalities across several longitudinal age points. Examining habituation and novelty detection, we show that despite the use of two different testing modalities, stimuli and timescales, it is possible to extract common neural metrics across a wide age range in infants. We suggest that these positive correlations might be strongest at times of greatest developmental change.

Speaker recognition systems perform very well on datasets without noise or mismatch. However, performance degrades with environmental noise, channel variation, and physical and behavioral changes in the speaker. The type of speaker-related features plays a crucial role in improving the performance of speaker recognition systems. Gammatone Frequency Cepstral Coefficient (GFCC) features have been widely used to develop robust speaker recognition systems with conventional machine learning, where they achieved better performance than Mel Frequency Cepstral Coefficient (MFCC) features in noisy conditions. Recently, deep learning models have shown better performance in speaker recognition than conventional machine learning. Most previous deep learning-based speaker recognition models have used Mel Spectrogram and similar inputs rather than handcrafted features such as MFCC and GFCC. However, the performance of Mel Spectrogram features degrades under high noise and under mismatch in the utterances. Similar to Mel Spectrogram, Cochleogram is another important feature for deep learning speaker recognition models. Like GFCC features, Cochleogram represents utterances on the Equivalent Rectangular Bandwidth (ERB) scale, which is important in noisy conditions. However, no previous study has analyzed the noise robustness of Cochleogram and Mel Spectrogram in speaker recognition. In addition, only a limited number of studies have used Cochleogram to develop speech-based models in noisy and mismatched conditions using deep learning. In this study, the noise robustness of Cochleogram and Mel Spectrogram features in speaker recognition with deep learning models is analyzed at Signal to Noise Ratio (SNR) levels from −5 dB to 20 dB. Experiments are conducted on the VoxCeleb1 and noise-added VoxCeleb1 datasets using basic 2DCNN, ResNet-50, VGG-16, ECAPA-TDNN and TitaNet model architectures. The speaker identification and verification performance of both Cochleogram and Mel Spectrogram is evaluated. The results show that Cochleogram performs better than Mel Spectrogram in both speaker identification and verification under noisy and mismatched conditions.
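The abstract above evaluates features on noise-added speech at SNR levels from −5 dB to 20 dB but does not say how the noisy data were produced. As an illustration only, the following minimal sketch shows one common way to mix a noise signal into speech at a target SNR, using the definition SNR_dB = 10·log10(P_speech / P_noise). The function names, the synthetic sine "speech", the Gaussian noise and the 5 dB step size are all assumptions made for this example and are not taken from the study.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration; returns (mixed, scaled_noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(P_speech / (g^2 * P_noise)), solved for the gain g.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, scaled_noise):
    """SNR in dB between the clean speech and the noise actually added."""
    return 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real audio: a 220 Hz tone at 16 kHz and white Gaussian noise.
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    # The 5 dB step is an assumption; the abstract only gives the range.
    for snr in (-5, 0, 5, 10, 15, 20):
        mixed, scaled = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, scaled), 6))
```

In practice the speech and noise would be loaded from audio files, and the mixed signal would then be converted to a Mel Spectrogram or Cochleogram before being passed to the model.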

Articles published on Speaker Change

Comparative Analysis of Audio Features for Unsupervised Speaker Change Detection

Optimized technique for speaker changes detection in multispeaker audio recording using pyknogram and efficient distance metric.

Grounding with Structure: Exploring Design Variations of Grounded Human-AI Collaboration in a Natural Language Interface

Neural Evidence for Syntactic Unification in Second Language Sentence Comprehension: A Time‐Frequency Analysis

Tracking age-related changes in voice and speech production with Landmark-based analysis of speech.

Spoken Language Change Detection Inspired by Speaker Change Detection

The Role of the Proxemic Factor in the Implementation of the Cooperative Strategy of Communication in the English Fairy Tale Discourse

Speech recognition based on mobile sensor networks application in English education intelligent assisted learning system

Speech comprehension in the context of speaker changes: The importance of voice-feature continuity at the cocktail party

Compensation for coarticulation despite a midway speaker change: Reassessing effects and implications.

Exploring Innovative Teaching Methods for the Integration of Interactive Digital Media and Traditional Drawing Techniques

Claiming insufficient knowledge in pairwork and groupwork classroom activities

Communication between Astronauts and Nasa Deep Space Network: A Conversation Analysis

The use of online interactive teaching mode in university music teaching under the background of informationization

Transition-Relevance Places Machine Learning-Based Detection in Dialogue Interactions

Longitudinal fNIRS and EEG metrics of habituation and novelty detection are correlated in 1–18-month-old infants

Top-down effect of dialogue coherence on perceived speaker identity

Modeling of an Automatic Vision Mixer With Human Characteristics for Multi-Camera Theater Recordings

Analyzing Noise Robustness of Cochleogram and Mel Spectrogram Features in Deep Learning Based Speaker Recognition

Polymorphism of second person singular forms of address in the Spanish of Medellin, Colombia
