Abstract

This paper investigates and compares methods for cognitive load (CL) estimation from speech. Most previous studies of CL estimation used speech collected in laboratory conditions and conventional speech classification methods. Traditionally, laboratory speech contains balanced classes that are labeled by a third party after the speech has been collected. In contrast, the speech used in this research was recorded during a human-machine interaction experiment in which spoken commands were used to control simulated aircraft. The speech was labeled using subjective assessments of CL gathered during the experiment, which manipulated workload. A state-of-the-art Convolutional Neural Network (CNN) classifier was used for CL estimation and compared with conventional Support Vector Machine (SVM) and k-Nearest Neighbor (k-NN) classifiers. Models with different degrees of speaker dependence were compared for 2-class and 3-class estimation. In addition, class boundaries were selected to reflect the sigmoidal curve of the subjective human workload response and compared with linear class boundaries. For 3-class CL estimation, CNN classifiers trained on speech spectrograms, using Partially Speaker Dependent (PSD) models and sigmoidal class boundaries, achieved up to 83.7% accuracy. CNN classifiers outperformed the baseline SVM and k-NN classifiers (which used acoustic features) on the same dataset by 13.2% and 10.5%, respectively. These outcomes indicate that spectrogram-trained CNN classifiers are worth considering for paralinguistic classification problems.
