Classification between Elderly Voices and Young Voices Using an Efficient Combination of Deep Learning Classifiers and Various Parameters

Ji-Yeoun Lee

doi:10.3390/app11219836

Abstract

The objective of this research was to develop an accurate and objective system for classifying elderly and young voice signals using deep learning classifiers and various parameters. This work focused on deep learning methods, such as the feedforward neural network (FNN) and the convolutional neural network (CNN), for the detection of elderly voice signals using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), skewness, and kurtosis parameters. Voice samples from 126 subjects (63 elderly and 63 young) were obtained from the Saarbruecken voice database. The highest performance of 93.75% was obtained when skewness was added to the MFCC and MFCC delta parameters, although the fusion of the skewness and kurtosis parameters had a positive effect on the overall accuracy of the classification. The results of this study also revealed that the performance of the FNN was higher than that of the CNN. With respect to gender, most parameters estimated from male data samples demonstrated good performance. Rather than using mixed female and male data, this work recommends developing separate systems for male and female samples, each built on the parameters optimized for the best performance on that group.
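The parameter fusion described in the abstract can be illustrated with a minimal sketch. It assumes the standard moment-based (population) definitions of skewness and kurtosis, which this page does not spell out, and it uses placeholder numbers in place of real MFCC values. The function names `skewness`, `kurtosis`, and `fuse_features` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def _central_moment(x: Sequence[float], k: int) -> float:
    """k-th central moment of x, normalised by the number of samples."""
    mu = sum(x) / len(x)
    return sum((v - mu) ** k for v in x) / len(x)


def skewness(x: Sequence[float]) -> float:
    """Third standardized moment: m3 / m2**1.5 (assumed definition)."""
    m2 = _central_moment(x, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined for a constant signal")
    return _central_moment(x, 3) / m2 ** 1.5


def kurtosis(x: Sequence[float]) -> float:
    """Fourth standardized moment: m4 / m2**2 (not the excess kurtosis)."""
    m2 = _central_moment(x, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return _central_moment(x, 4) / m2 ** 2


def fuse_features(cepstral: Sequence[float], signal: Sequence[float]) -> List[float]:
    """Append skewness and kurtosis of the signal to a cepstral feature vector."""
    return list(cepstral) + [skewness(signal), kurtosis(signal)]


# Placeholder inputs: 13 made-up cepstral values and a short made-up signal.
placeholder_mfcc = [0.1 * i for i in range(13)]
placeholder_signal = [0.0, 0.2, -0.1, 0.4, -0.3, 0.1, 0.0, 0.5]
fused = fuse_features(placeholder_mfcc, placeholder_signal)
print(len(fused))  # 13 cepstral values plus 2 higher-order statistics
```

The fused vector would then be passed to a classifier such as an FNN; how the paper frames the signal and where it computes the statistics is not stated here, so the per-signal computation above is only one possible arrangement.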

Highlights

The human voice represents a complex biological signal resulting from the dynamic interaction between adduction/vibration of the vocal folds and pulmonary air emission and flow through the resonant structures [1].
In order to create a system for recognizing the voice of the elderly, it is necessary to understand the characteristics of changes in vocal cord tissue due to anatomical or physiological aging [10], and various welfare systems using only the voice database of the elderly should be implemented.
This work focused on deep learning methods, such as the feedforward neural network (FNN) and the convolutional neural network (CNN), for the detection of elderly voice signals using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), skewness, and kurtosis parameters.

Summary

Introduction

The human voice represents a complex biological signal resulting from the dynamic interaction between adduction/vibration of the vocal folds and pulmonary air emission and flow through the resonant structures [1]. Physiologic aging leads to specific changes in the anatomy and physiology of all structures involved in the production and modulation of the human voice [2,3,4]. The aging of laryngeal tissue changes the movement of the vocal cords, their vibration, and their opening and closing processes [5]. Voice characteristics are measured by parameters such as the fundamental frequency (F0), that is, the number of vocal cord oscillations per second, as well as jitter, shimmer, and excitation source components. In order to create a system for recognizing the voice of the elderly, it is necessary to understand the characteristics of changes in vocal cord tissue due to anatomical or physiological aging [10], and various welfare systems using only the voice database of the elderly should be implemented.
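As a rough illustration of the voice measures named above, the sketch below computes mean F0, local jitter, and local shimmer from a sequence of glottal cycle periods and peak amplitudes. It assumes the common "local" definitions (mean absolute difference between consecutive cycles divided by the mean value); the paper's own measurement procedure is not described on this page, and the input numbers are made up.

```python
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def mean_f0(periods_s: Sequence[float]) -> float:
    """Mean fundamental frequency in Hz: reciprocal of the mean cycle period."""
    return len(periods_s) / sum(periods_s)


# Made-up cycle periods (seconds) and peak amplitudes (arbitrary units).
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [1.00, 0.95, 1.02, 0.97]

f0 = mean_f0(periods)                     # oscillations per second
jitter = local_perturbation(periods)      # cycle-to-cycle period variation
shimmer = local_perturbation(amplitudes)  # cycle-to-cycle amplitude variation
print(f0, jitter, shimmer)
```

Jitter and shimmer share one helper here because, under the assumed definitions, they differ only in whether the per-cycle quantity is the period or the amplitude.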

Journal: Applied Sciences
Publication Date: Oct 21, 2021
License type: CC BY 4.0
