Abstract

This study aims to establish a small-footprint, text-independent speaker identification system for a relatively small group of speakers. The problem was motivated by a requirement of the International Space Station (ISS): detecting which astronaut is speaking at a given time. We employed a direct Deep Neural Network (DNN)-based approach, in which the posterior probabilities of the output layer are used to determine the speaker's identity. In line with the small-footprint design objective, a simple DNN model with only a few hidden layers and a modest number of hidden units per layer was designed, reducing the parameter cost through careful training to avoid the usual overfitting problem and to optimize algorithmic aspects such as context-based training, activation functions, validation, and learning rate. The reference model was tested on two commercially available databases, the TIMIT clean-speech corpus and the HTIMIT multi-handset corpus, as well as on a noise-added version of TIMIT that we constructed using four noise categories at three distinct signal-to-noise ratios. Briefly, we used a dynamic pruning method in which the units of all layers are pruned simultaneously and the pruning mechanism is reassigned. The usefulness of this approach was evaluated on all the above databases.

Highlights

  • This work focuses on Deep Neural Network (DNN)-based text-independent, closed-set speaker identification for a relatively limited number of users

  • The above-stated problem and associated goals were inspired by the requirements of the NASA Johnson Space Center (JSC) for their application to the International Space Station (ISS)

  • DNNs require a considerable amount of data for training


Introduction

This work focuses on Deep Neural Network (DNN)-based text-independent, closed-set speaker identification for a relatively limited number of users. The above-stated problem and associated goals were inspired by the requirements of the NASA Johnson Space Center (JSC) for their application to the International Space Station (ISS). The ISS application requires a low-complexity solution with low power consumption and a small footprint. DNNs can learn complicated functions by using a large number of hidden layers, which give the network its "depth". In certain cases, networks with fewer layers can learn comparably complex functions with the same number of parameters as "deep" models, so deeper networks are not necessary for all applications [2]. The advantage of having multiple layers is that they can learn features at various levels of abstraction.
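The closed-set identification scheme described above can be sketched in a few lines: a small feed-forward network maps each acoustic feature frame to softmax posteriors over the enrolled speakers, and the frame-level posteriors are averaged before taking the argmax. The sizes below (40-dimensional features, two 64-unit hidden layers, 8 speakers) and the randomly initialized weights are illustrative assumptions, not values from the paper; a real system would use trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper):
# 40-dim acoustic features, two small hidden layers, 8 enrolled speakers.
n_feat, n_hidden, n_speakers = 40, 64, 8

# Randomly initialized weights stand in for a trained small-footprint model.
W1 = rng.standard_normal((n_feat, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_hidden)) * 0.1
b2 = np.zeros(n_hidden)
W3 = rng.standard_normal((n_hidden, n_speakers)) * 0.1
b3 = np.zeros(n_speakers)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def posteriors(frames):
    """Forward pass: two ReLU hidden layers, softmax output layer."""
    h = np.maximum(0.0, frames @ W1 + b1)
    h = np.maximum(0.0, h @ W2 + b2)
    return softmax(h @ W3 + b3)

def identify(frames):
    """Average frame-level posteriors over an utterance, pick argmax speaker."""
    p = posteriors(frames).mean(axis=0)
    return int(np.argmax(p)), p

frames = rng.standard_normal((50, n_feat))  # a 50-frame dummy utterance
speaker, p = identify(frames)
print(speaker, p.shape)
```

Averaging posteriors over all frames of an utterance before deciding is one common way to turn frame-level DNN outputs into an utterance-level decision; the averaged vector still sums to one, so it can be read directly as a posterior over the closed speaker set.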

