Abstract

The integration of artificial intelligence (AI) and the Internet of Things (IoT) has tremendous prospects in smart healthcare. The advancement of AI in the form of deep learning has revolutionized automatic classification and detection systems. In addition, next-generation wireless communications such as 5G networking have brought fast, seamless data transmission. With the convergence of these elements, the smart healthcare sector is currently booming. In the post-COVID-19 period in particular, the necessity of smart healthcare has become more apparent than before. A significant number of people suffer from voice pathology, which can be easily cured if detected early. In this study, a voice pathology detection system within a smart healthcare framework is proposed. The inputs are obtained through IoT devices, namely microphones and electroglottography (EGG) devices, which capture voice and EGG signals, respectively. Spectrograms are obtained from these signals and fed into a pretrained convolutional neural network (CNN). The features extracted from the CNN are fused and processed using a bi-directional long short-term memory network. The proposed system is evaluated using a publicly available database, the Saarbruecken voice database. The experimental results show that bimodal input performs better than a single input. An accuracy of 95.65% is obtained for the proposed system.
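The abstract says that the CNN features from the two modalities are "fused" before the bi-directional LSTM, but it does not say how. The sketch below shows one common option, concatenating the voice and EGG feature vectors at each time step. The function name `fuse_bimodal`, the feature dimension of 64, and the choice of concatenation are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def fuse_bimodal(voice_feats, egg_feats):
    """Concatenate per-time-step features from two modalities.

    Both inputs are assumed to have shape (time_steps, feature_dim).
    Concatenation is an illustrative choice; the paper's fusion
    scheme is not specified in the abstract.
    """
    voice_feats = np.asarray(voice_feats, dtype=float)
    egg_feats = np.asarray(egg_feats, dtype=float)
    if voice_feats.ndim != 2 or egg_feats.ndim != 2:
        raise ValueError("expected 2-D arrays of shape (time_steps, feature_dim)")
    if voice_feats.shape[0] != egg_feats.shape[0]:
        raise ValueError("both modalities must have the same number of time steps")
    # Result has shape (time_steps, voice_dim + egg_dim).
    return np.concatenate([voice_feats, egg_feats], axis=1)

# Random stand-ins for CNN features (hypothetical sizes).
rng = np.random.default_rng(0)
voice = rng.standard_normal((10, 64))
egg = rng.standard_normal((10, 64))
fused = fuse_bimodal(voice, egg)
```

The fused sequence keeps the time axis intact, which is what a recurrent model such as a bi-directional LSTM would consume step by step.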

Highlights

  • Owing to the excessive use of their voice, numerous individuals today suffer from voice pathologies.

  • Many methods have been used in the area of automated voice pathology identification and classification, and we found that similar nonlinear methods were often employed.

  • This is critical for locating the vocal cord vibration information because the process of EGG signal production represents the change in the contact surface during vocal cord movement. To arrive at their findings, Wu et al. [47] treated voice pathology identification as an image classification problem and applied frequency-domain transformations to time-domain sound data. Their model was based on a short-time Fourier transform (STFT) and a convolutional neural network (CNN) consisting of ten convolutional layers with a filter size of 8 × 8.
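To make the STFT step mentioned above concrete, here is a minimal sketch of turning a time-domain signal into a log-magnitude spectrogram, the kind of image a CNN can take as input. The frame length, hop size, Hann window, sample rate, and the synthetic test tone are all assumptions for illustration; they are not the settings used by Wu et al. [47] or by the proposed system.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (n_freq_bins, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; rfft keeps only the non-negative frequencies.
    spectrum = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectrum).T  # (frame_len // 2 + 1, n_frames)
    # Convert to decibels; the small offset avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10)

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 256
```

With these assumed settings each frequency bin spans 31.25 Hz, so the tone's energy concentrates in a single row of the spectrogram. The same function could be applied to a voice signal and an EGG signal separately to obtain one image per modality.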

Summary

INTRODUCTION

Owing to the excessive use of their voice, numerous individuals today suffer from voice pathologies.

Alhussein: Convergence of AI and IoT in Smart Healthcare: Case Study of Voice Pathology Detection

... speech (including frequency, quality, roughness, breathiness, exhaustion, and stress), the average grade of dysphonia, degree of roughness, breathiness, asthenia, and strain is assessed using the CAPE-V [4]. Because these assessment methods are often used in clinical practice, there are caveats for the analytical test.

(i) A multi-modal VPD system that utilizes voice and EGG signals, is able to diagnose patients with dysarthria with increased accuracy, and provides a better foundation for pathological voice identification.

TYPES OF VOICE PATHOLOGIES
PROPOSED VPD SYSTEM
EXPERIMENTS
Findings
CONCLUSION
