Abstract
This paper focuses on the binary classification of the emotion of fear, based on the physiological data and subjective responses stored in the DEAP dataset. We mapped the discrete emotional information onto the dimensional one using the participants’ ratings, and extracted a set of 40 types of features from the physiological data. These features were the input to several machine learning algorithms (Decision Trees, k-Nearest Neighbors, Support Vector Machine and artificial neural networks), combined with dimensionality reduction, feature selection and tuning of the most relevant hyperparameters to boost classification accuracy. Our methodology addressed several issues: handling an imbalanced dataset through data augmentation, reducing overfitting, computing multiple metrics to obtain the most reliable classification scores, and applying the Local Interpretable Model-Agnostic Explanations (LIME) method to explain predictions in a human-understandable manner. The results show that fear can be predicted very well (accuracies ranging from 91.7% using Gradient Boosting Trees to 93.5% using dimensionality reduction and Support Vector Machine) by extracting the most relevant features from the physiological data and by searching for the parameters that maximize the machine learning algorithms’ classification scores.
Highlights
As there is a broad interest in the field of affective computing and affect recognition, this study aims to explore fear classification based on time-related, frequency-related and event-related features extracted from a well-known dataset containing physiological recordings (electrodermal activity, EDA, and heart rate variability, HRV) and self-reported ratings of valence, arousal and dominance.
The cross-validation score converges toward the training score at a level above the desired performance of 80%, which suggests that the model is not overfitting.
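The overfitting check described in the last highlight can be illustrated with a small learning-curve sketch. It is not the authors' code: the two-cluster toy data, the 3-nearest-neighbour classifier, the fold layout and the 0.10 gap tolerance are all assumptions made for illustration, and only the 80% target comes from the text. The sketch is dependency-free; in practice a library routine such as scikit-learn's `learning_curve` would normally be used.

```python
import math
import random
from statistics import mean


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (labels are 0/1)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_X)


def learning_curve_points(X, y, sizes, n_folds=5, k=3):
    """Return (size, mean training accuracy, mean cross-validation accuracy) per size."""
    folds = [list(range(f, len(X), n_folds)) for f in range(n_folds)]
    curve = []
    for size in sizes:
        train_scores, cv_scores = [], []
        for f, held_out in enumerate(folds):
            # Train on the first `size` samples taken from the other folds.
            pool = [i for g, fold in enumerate(folds) if g != f for i in fold]
            train_idx = pool[:size]
            tr_X = [X[i] for i in train_idx]
            tr_y = [y[i] for i in train_idx]
            te_X = [X[i] for i in held_out]
            te_y = [y[i] for i in held_out]
            train_scores.append(accuracy(tr_X, tr_y, tr_X, tr_y, k))
            cv_scores.append(accuracy(tr_X, tr_y, te_X, te_y, k))
        curve.append((size, mean(train_scores), mean(cv_scores)))
    return curve


# Hypothetical toy data: two Gaussian clusters standing in for "no fear" (0) and "fear" (1).
rng = random.Random(0)
samples = [([rng.gauss(2.0 * label, 0.6), rng.gauss(2.0 * label, 0.6)], label)
           for label in (0, 1) for _ in range(50)]
rng.shuffle(samples)
X = [features for features, _ in samples]
y = [label for _, label in samples]

TARGET = 0.80  # desired performance mentioned in the text
curve = learning_curve_points(X, y, sizes=[10, 20, 40, 80])
for size, train_acc, cv_acc in curve:
    print(f"n={size:3d}  train={train_acc:.2f}  cv={cv_acc:.2f}  gap={train_acc - cv_acc:+.2f}")

# Assumed criterion: CV score above the target and close to the training score.
final_size, final_train, final_cv = curve[-1]
no_overfitting = final_cv >= TARGET and abs(final_train - final_cv) < 0.10
print("No sign of overfitting:", no_overfitting)
```

A persistent gap between the two curves at the largest training size would point to overfitting, whereas two low, converged curves would point to underfitting.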
Summary
As there is a broad interest in the field of affective computing and affect recognition, this study aims to explore fear classification based on time-related, frequency-related and event-related features extracted from a well-known dataset containing physiological recordings (electrodermal activity, EDA, and heart rate variability, HRV) and self-reported ratings of valence, arousal and dominance. We applied several machine learning algorithms: Decision Trees (DT), which are intuitive, transparent to inspection and easy to validate, k-Nearest Neighbors (kNN), Support Vector Machine (SVM) and artificial neural networks. We also tackled various situations, such as dealing with an imbalanced dataset via data augmentation, preventing overfitting through cross-validation and by plotting learning curves, reducing dimensionality, selecting the most relevant features and tuning the appropriate hyperparameters in order to obtain the highest classification scores. For data augmentation we used the Synthetic Minority Oversampling Technique (SMOTE), which generated new samples of the minority class based on the closest examples in the feature space. Together with the undersampling of the majority class, this method helped to resolve the issue of working on an imbalanced dataset.
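The rebalancing step described above can be sketched as follows. This is a simplified, dependency-free illustration of SMOTE-style interpolation combined with random undersampling, not the authors' implementation. The function names, the toy two-dimensional data, the class sizes and the choice of `k=3` neighbours are all hypothetical. In practice a maintained library such as imbalanced-learn would typically be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority sample
    and one of its k nearest minority-class neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class (sampling without replacement)."""
    return random.Random(seed).sample(majority, n_keep)


# Hypothetical imbalanced toy data: 10 "fear" samples versus 50 "no fear" samples.
rng = random.Random(42)
fear = [[rng.gauss(1.0, 0.2), rng.gauss(1.0, 0.2)] for _ in range(10)]
no_fear = [[rng.gauss(0.0, 0.2), rng.gauss(0.0, 0.2)] for _ in range(50)]

new_fear = smote_oversample(fear, n_new=20)
fear_balanced = fear + new_fear
no_fear_balanced = undersample(no_fear, n_keep=30)

print(len(fear_balanced), len(no_fear_balanced))  # 30 30
```

Because every synthetic point lies on a segment between two existing minority samples, the new data stays inside the region already occupied by the minority class rather than duplicating existing rows. Resampling of this kind is usually applied to the training split only, so that evaluation data contains no synthetic samples.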