Abstract

Voice changes may be the earliest signs in laryngeal cancer. We investigated whether automated voice signal analysis can be used to distinguish patients with laryngeal cancer from healthy subjects. We extracted features using the software package for speech analysis in phonetics (PRAAT) and calculated the Mel-frequency cepstral coefficients (MFCCs) from voice samples of a vowel sound of /a:/. The proposed method was tested with six algorithms: support vector machine (SVM), extreme gradient boosting (XGBoost), light gradient boosted machine (LGBM), artificial neural network (ANN), one-dimensional convolutional neural network (1D-CNN) and two-dimensional convolutional neural network (2D-CNN). Their performances were evaluated in terms of accuracy, sensitivity, and specificity. The result was compared with human performance. A total of four volunteers, two of whom were trained laryngologists, rated the same files. The 1D-CNN showed the highest accuracy of 85% and sensitivity and sensitivity and specificity levels of 78% and 93%. The two laryngologists achieved accuracy of 69.9% but sensitivity levels of 44%. Automated analysis of voice signals could differentiate subjects with laryngeal cancer from those of healthy subjects with higher diagnostic properties than those performed by the four volunteers.

Highlights

  • Laryngeal cancer is one of the most debilitating forms of malignancy, with an average incidence of 3.3 per 100,000 from 2012 to 2016 in the USA [1]

  • We examined the normal and cancer voice signal classification and tested the performance of the support vector machine (SVM), XGBoost, light gradient boosted machine (LightGBM), artificial neural network (ANN), one-dimensional convolutional neural network (1D-CNN), and 2D-CNN

  • We identified a total of 50 laryngeal cancer patients who had undergone voice analysis preoperatively

Read more

Summary

Introduction

Laryngeal cancer is one of the most debilitating forms of malignancy, with an average incidence of 3.3 per 100,000 from 2012 to 2016 in the USA [1]. In 2019, there were 12,370 new cases diagnosed in the USA alone. Early diagnosis remains challenging, resulting in delayed treatment [2,3]. With a delay of diagnosis, laryngeal cancer may lead to the most severe debilitating disabilities in phonation, swallowing [4] and overall quality of life. An automated voice analysis tool could advance the time of diagnosis regardless of patients’ location, in line with the idea of telemedicine. Though voice changes can indicate the first clinical signs of disease, subjective perception of early voice changes can be listener dependent and subject to intrajudge variations [5]

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call