Infant vocalization, commonly known as baby crying, is one of the primary means by which infants communicate their needs and emotional states to adults. While crying can yield crucial insights into a baby's well-being and comfort, little research has specifically investigated how the audio range of a baby cry influences research outcomes. The core problem addressed here is the lack of research on the influence of audio range on machine-learning classification of baby cries. The purpose of this study is to ascertain the impact of the duration of an infant's cry on machine-learning classification outcomes and to examine the accuracy and F1 scores obtained with machine-learning methods. The contribution is to enrich the understanding of classification and feature selection in audio datasets, particularly in the context of baby cry audio. The dataset used, donate-a-cry-corpus, encompasses five distinct data classes with recordings seven seconds in duration. The methodology consists of the spectrogram technique, cross-validation for data partitioning, MFCC feature extraction with 10, 20, and 30 coefficients, and machine-learning models including Support Vector Machine, Random Forest, and Naïve Bayes. The findings reveal that the Random Forest model achieved an accuracy of 0.844 and an F1 score of 0.773 when 10 MFCC coefficients were used and the optimal audio range was set at six seconds. The Support Vector Machine model with an RBF kernel yielded an accuracy of 0.836 and an F1 score of 0.761, while the Naïve Bayes model achieved an accuracy of 0.538 and an F1 score of 0.539. Notably, no discernible differences were observed when evaluating the Support Vector Machine and Naïve Bayes methods across the 1-7 second time trial.
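The model comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the MFCC feature matrix is replaced here by synthetic data (the paper extracts 10, 20, or 30 MFCC coefficients from the donate-a-cry-corpus audio), and all model hyperparameters beyond the RBF kernel are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the MFCC feature matrix: 200 clips x 10
# coefficients, 5 classes as in donate-a-cry-corpus. Real features
# would come from MFCC extraction on the cry recordings.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 5, size=200)

# The three classifiers compared in the study (settings assumed).
models = {
    "random_forest": RandomForestClassifier(random_state=0),
    "svm_rbf": SVC(kernel="rbf"),
    "naive_bayes": GaussianNB(),
}

# Cross-validation for data partitioning, scored with macro-averaged F1.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1_macro")
    print(f"{name}: mean macro-F1 = {scores.mean():.3f}")
```

With real MFCC features, the same loop can be repeated for each candidate audio range (1-7 seconds) to locate the optimum, as the study does.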
The implication of this research is to establish a foundation for early-illness identification techniques grounded in the vocalizations of infants, thereby facilitating swifter diagnostic processes for pediatric practitioners.