Abstract
The integration of artificial intelligence (AI) and the Internet of Things (IoT) has tremendous prospects in smart healthcare. Advances in AI, particularly deep learning, have revolutionized automatic classification and detection systems. In addition, next-generation wireless communications such as 5G networking have brought fast, seamless data transmission. With the convergence of these elements, the smart healthcare sector is currently booming. In the wake of the COVID-19 pandemic in particular, the need for smart healthcare has become more apparent than before. A significant number of people suffer from voice pathology, which can be easily cured if detected early. In this study, a voice pathology detection system within a smart healthcare framework is proposed. The inputs are obtained from IoT devices, namely microphones and electroglottography (EGG) devices, which capture voice and EGG signals, respectively. Spectrograms are obtained from these signals and fed into a pretrained convolutional neural network (CNN). The features extracted from the CNN are fused and processed using a bi-directional long short-term memory network. The proposed system is evaluated using a publicly available database, the Saarbruecken voice database. The experimental results show that bimodal input performs better than a single input. An accuracy of 95.65% is obtained for the proposed system.
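The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the CNN features are fused, so plain concatenation along the feature axis is an assumption here, as are the function name `fuse_features`, the feature dimension of 128, and the random arrays standing in for CNN outputs.

```python
import numpy as np

def fuse_features(voice_feats, egg_feats):
    """Concatenate per-time-step feature vectors from two modalities.

    Both inputs have shape (time_steps, feature_dim). The sequences are
    truncated to the shorter one so that time steps line up.
    Concatenation is an assumed fusion rule, not one taken from the paper.
    """
    voice_feats = np.asarray(voice_feats)
    egg_feats = np.asarray(egg_feats)
    steps = min(len(voice_feats), len(egg_feats))
    return np.concatenate([voice_feats[:steps], egg_feats[:steps]], axis=1)

# Hypothetical stand-ins for CNN features of a voice and an EGG spectrogram.
rng = np.random.default_rng(0)
voice = rng.normal(size=(50, 128))
egg = rng.normal(size=(48, 128))

# Fused sequence: one row per time step, voice features then EGG features.
fused = fuse_features(voice, egg)
```

The fused array keeps one row per time step, which is the sequence layout a recurrent model such as a bi-directional LSTM would consume.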
Highlights
Owing to the excessive use of their voice, numerous individuals today suffer from voice pathologies
Many methods have been used in the area of automated voice pathology identification and classification, and we found that similar nonlinear methods were often employed
This is critical for locating the vocal cord vibration information because the process of EGG signal production represents the change in the contact surface during vocal cord movement. To arrive at their findings, Wu et al. [47] treated voice pathology identification as an image classification problem and applied frequency-domain transformations to time-domain sound data. Their model was based on a short-time Fourier transform (STFT) and a convolutional neural network (CNN) consisting of ten convolutional layers with a filter size of 8 × 8
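To make the STFT step concrete, the following is a minimal sketch of turning a time-domain signal into a log-magnitude spectrogram that could be treated as an image. The frame length, hop size, Hann window, sampling rate, and synthetic test tone are all assumptions chosen for illustration; they are not the settings used by Wu et al. [47] or in this study.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=512, hop=128):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per frame: (n_frames, frame_len // 2 + 1).
    spectrum = np.fft.rfft(frames, axis=1)
    # Convert magnitude to decibels; the small offset avoids log(0).
    return 20.0 * np.log10(np.abs(spectrum).T + 1e-10)

# Synthetic 200 Hz tone, 1 second at an assumed 16 kHz sampling rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 200 * t)

spec = stft_spectrogram(tone)
# Frequency of the strongest bin, averaged over all frames.
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sr / 512
```

With these settings each frequency bin spans 31.25 Hz, so the strongest bin should fall within one bin of the 200 Hz tone.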
Summary
Owing to the excessive use of their voice, numerous individuals today suffer from voice pathologies.
Alhussein: Convergence of AI and IoT in Smart Healthcare: Case Study of Voice Pathology Detection
... speech (including frequency, quality, roughness, breathiness, exhaustion, and stress), the average grade of dysphonia, degree of roughness, breathiness, asthenia, and strain is assessed using the CAPE-V [4]. Because these assessment methods are often used in clinical practice, there are caveats for the analytical test.
(i) A multi-modal voice pathology detection (VPD) system, which utilizes voice and EGG signals and is able to diagnose patients with dysarthria with increased accuracy and provide a better foundation for pathological voice identification.