Spoken Term Research Articles

Speech signals convey various yet mixed information ranging from linguistic to speaker-specific information. However, most of acoustic representations characterize all different kinds of information as whole, which could hinder either a speech or a speaker recognition (SR) system from producing a better performance. In this paper, we propose a novel deep neural architecture (DNA) especially for learning speaker-specific characteristics from mel-frequency cepstral coefficients, an acoustic representation commonly used in both speech recognition and SR, which results in a speaker-specific overcomplete representation. In order to learn intrinsic speaker-specific characteristics, we come up with an objective function consisting of contrastive losses in terms of speaker similarity/dissimilarity and data reconstruction losses used as regularization to normalize the interference of non-speaker-related information. Moreover, we employ a hybrid learning strategy for learning parameters of the deep neural networks: i.e., local yet greedy layerwise unsupervised pretraining for initialization and global supervised learning for the ultimate discriminative goal. With four Linguistic Data Consortium (LDC) benchmarks and two non-English corpora, we demonstrate that our overcomplete representation is robust in characterizing various speakers, no matter whether their utterances have been used in training our DNA, and highly insensitive to text and languages spoken. Extensive comparative studies suggest that our approach yields favorite results in speaker verification and segmentation. Finally, we discuss several issues concerning our proposed approach.

This article investigates the Southern Vowel Shift—a possibly interrelated series of rotations in vowel space currently affecting the dialects of southern speakers—in terms of examining its classification as a chain-shift process and, more generally, providing a descriptive account of the phonetic character of the changes in each individual vowel class. Based on the work of Labov (1991, 1994) and Feagin (1986), it has been suggested that the Southern Shift involves changes in both the front vowels and the back vowels, with the tense and lax front vowel nuclei essentially switching places and the back vowels moving forward. The relationship of these changes in the front vowels and those in the back vowels has not been firmly established, but they appear to be driven by different social and linguistic forces. What is happening to the low front, the mid, and the low back vowel classes in the Southern Shift has only been superficially explored. A detailed instrumental analysis of the vowel systems of 25 native Memphians of selected ages, socioeconomic classes, and genders is presented, revealing the movement of vowel classes which seem to be playing an important role in the instigation or perpetuation of the Southern Shift. This analysis points out discrepancies about how previously cited vocalic changes are embedded in mid-southern speech and provides a picture of how these changes are affecting other changes in the system. The results suggest that, while many of the changes cited in the literature are indeed present in the sample, the interrelatedness of these changes and their prognosis to move to completion are not so clear.

Spoken Term Research Articles

Related Topics

Articles published on Spoken Term

Learning Speaker-Specific Characteristics With a Deep Neural Architecture

A contrastive study of verbs of remembering and forgetting in English and Spanish

Web Based Hindi to Punjabi Machine Translation System

Lattice-based Indexing for Spontaneous Mandarin Speech

The Clinical Observation on the Curative Effect of Integrated Traditional Chinese and Western Medicine Treating 152 Cases of Lung Tuberculosis

‘Is English we speaking’: Trinbagonian in the twenty-first century

The myth of a homogeneous speech Community: a sociolinguistic study of the speech of Japanese women in diverse gender roles

The Southern Shift in Memphis, Tennessee

Meaning, belief, and language acquisition

Evaluative communications between affectively ill and well mothers and their children.

Methods for the analysis of two‐party question and answer dialogues

Observations on elicited language imitation with the severely retarded.

Multidimensional classification of abnormal voices

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Spoken Term Research Articles

Related Topics

Articles published on Spoken Term

Learning Speaker-Specific Characteristics With a Deep Neural Architecture

A contrastive study of verbs of remembering and forgetting in English and Spanish

Web Based Hindi to Punjabi Machine Translation System

Lattice-based Indexing for Spontaneous Mandarin Speech

The Clinical Observation on the Curative Effect of Integrated Traditional Chinese and Western Medicine Treating 152 Cases of Lung Tuberculosis

‘Is English we speaking’: Trinbagonian in the twenty-first century

The myth of a homogeneous speech Community: a sociolinguistic study of the speech of Japanese women in diverse gender roles

The Southern Shift in Memphis, Tennessee

Meaning, belief, and language acquisition

Evaluative communications between affectively ill and well mothers and their children.

Methods for the analysis of two‐party question and answer dialogues

Observations on elicited language imitation with the severely retarded.

Multidimensional classification of abnormal voices