Distinctive Phonetic Features Modeling and Extraction Using Deep Neural Networks

Yasser Seddiq, Sid-Ahmed Selouani, Yousef A Alotaibi, Ali Hamid Meftah

doi:10.1109/access.2019.2924014

Abstract

Feature extraction is a critical stage of digital speech processing systems. The quality of the features is of great importance because it provides the foundation on which the subsequent stages stand. Distinctive phonetic features (DPFs) are among the most representative features of speech signals. Their significance lies in their ability to provide an abstract description of the places and manners of articulation of a language's phonemes. A phoneme's DPF element reflects unique articulatory information about that phoneme. Therefore, there is a need to investigate each DPF element individually in order to achieve a deeper understanding and to derive a descriptive model for each one. Such fine-grained modeling respects the uniqueness of each DPF element. In this paper, the problem of DPF modeling and extraction for Modern Standard Arabic is tackled. Deep neural networks (DNNs) that are initialized using deep belief networks (DBNs) have been remarkably successful in DSP applications and are capable of extracting highly representative features from raw data, so we exploit their modeling power to investigate and model the DPF elements. DNN models are compared with classical multilayer perceptron (MLP) models. The representativeness of several acoustic cues for different DPF elements is also measured. This paper formalizes the DPF modeling problem as a binary classification problem. Because the DPF elements are highly imbalanced, evaluating model quality is difficult. This paper therefore addresses evaluation measures suited to the imbalanced nature of the DPF elements. After each element is modeled individually, two top-level DPF extractors are designed: an MLP-based extractor and a DNN-based extractor. The results show that the DNN models outperform the MLPs, with accuracies of 89.0% and 86.7%, respectively.
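To illustrate why plain accuracy is a weak measure for a highly imbalanced DPF element, the following minimal sketch compares it with a few imbalance-aware scores. The abstract does not name the measures the paper actually uses, so the choice of balanced accuracy, F1, and the Matthews correlation coefficient (MCC) here is an assumption, as are the function names and the toy data.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks a '+' element."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def imbalance_aware_scores(y_true, y_pred):
    """Compute accuracy next to scores that account for class imbalance."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": f1,
        "mcc": mcc,
    }


# Toy data (assumed, not from the paper): 100 frames, only 5 carry the element.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100  # a degenerate model that never predicts '+'

scores = imbalance_aware_scores(y_true, always_negative)
print(scores)
```

On this toy data the degenerate model scores 95% accuracy while its balanced accuracy is 0.5 and its F1 and MCC are 0, which is the kind of gap that motivates imbalance-aware evaluation.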

Highlights

Feature extraction is an essential preprocessing stage of digital speech processing systems, serving applications such as automatic speech recognition (ASR), speaker identification, and speech prosody analysis, among many others.
This paper reports work on modeling the distinctive phonetic features (DPFs) of Modern Standard Arabic (MSA).
The study reported in [10] addressed DPF element extraction of American English using a multilayer perceptron (MLP) that was modeled using deep neural networks (DNNs).

Summary

INTRODUCTION

Feature extraction is an essential preprocessing stage of digital speech processing systems, serving applications such as automatic speech recognition (ASR), speaker identification, and speech prosody analysis, among many others. A DPF vector is a set of binary elements that uniquely describes the articulatory and phonetic properties of phonemes [1]. For example, generating the phoneme /b/ involves vibration of the vocal folds, which is a brain activity that is described by setting the voicing element to "+". The ability of DPFs to describe the speech signal contextually and phonetically makes them highly advantageous for enhancing system performance and robustness [4]. These benefits can be maximized if language-specific studies are conducted.
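The idea of a DPF vector as a set of binary elements can be sketched as follows. The element inventory and the four phonemes are a deliberately tiny, hypothetical subset chosen for illustration; they are not the DPF set used in the paper. The last function shows how per-element targets could be derived, one binary label sequence per element, which matches the binary-classification view described in the abstract.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, reduced element inventory (an assumption for illustration only).
DPF_ELEMENTS: Tuple[str, ...] = ("voiced", "nasal", "bilabial", "plosive", "fricative")

# Elements marked '+' for each phoneme; every unlisted element is '-'.
PHONEME_DPF: Dict[str, Set[str]] = {
    "b": {"voiced", "bilabial", "plosive"},
    "m": {"voiced", "nasal", "bilabial"},
    "t": {"plosive"},
    "s": {"fricative"},
}


def dpf_vector(phoneme: str) -> Tuple[int, ...]:
    """Encode a phoneme as a binary vector: 1 for '+', 0 for '-'."""
    plus = PHONEME_DPF[phoneme]
    return tuple(1 if element in plus else 0 for element in DPF_ELEMENTS)


def element_targets(phonemes: List[str], element: str) -> List[int]:
    """Binary targets for one DPF element over a phoneme sequence."""
    index = DPF_ELEMENTS.index(element)
    return [dpf_vector(p)[index] for p in phonemes]


for phoneme in PHONEME_DPF:
    print(f"/{phoneme}/", dpf_vector(phoneme))

# One independent binary labeling per element, e.g. for voicing:
print(element_targets(["b", "s", "m", "t"], "voiced"))
```

Modeling each element separately, as in `element_targets`, keeps every classifier binary; the price is that most elements are '-' for most phonemes, which is where the class imbalance comes from.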

DPF EXTRACTION

BACKGROUND

PARAMETER FINE-TUNING

Short-time Energy

Binary Voicing

RESULTS AND DISCUSSION

CONCLUSIONS


Journal: IEEE Access
Publication Date: Jan 1, 2019
Citations: 7
License type: CC BY 4.0
