Abstract

This research identifies and thoroughly analyzes musical data extracted by various methods. The extracted features can be fed to deep learning algorithms to identify emotion from the hidden characteristics of the dataset. Deep learning-based convolutional neural network (CNN) and long short-term memory-gated recurrent unit (LSTM-GRU) models were developed to predict emotion from the musical information, with features extracted using fast Fourier transform (FFT) models. Three deep learning models were developed in this work. The first model was built on extracted features such as the zero-crossing rate and spectral roll-off. The second model was built on Mel-frequency cepstral coefficient (MFCC) features, using a deep and wide CNN combined with a bidirectional LSTM-GRU model. The third model was built on Mel-spectrograms, feeding them as two-dimensional (2D) data to a 2D CNN alongside LSTM models. The performance of the proposed Mel-spectrogram model is compared on F1 score, precision, and the classification report, and shows better accuracy with improved F1 and recall values than existing approaches.
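To make the pipeline concrete, the following is a minimal sketch of the third approach (Mel-spectrogram into a 2D CNN followed by an LSTM), assuming the librosa library for feature extraction and Keras (TensorFlow) for the model. The layer sizes, the number of emotion classes, the fixed 128-frame spectrogram, and the audio path are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of the Mel-spectrogram -> 2D CNN -> LSTM pipeline.
# Assumptions (not from the paper): librosa for extraction, Keras for the
# model, 4 emotion classes, 128 Mel bands x 128 frames, and all layer sizes.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers

def extract_features(path, sr=22050, n_mels=128):
    """Extract the features named in the abstract from one audio file."""
    y, sr = librosa.load(path, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)              # model 1 input
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)   # model 1 input
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)       # model 2 input
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)            # model 3 input (2D)
    return zcr, rolloff, mfcc, mel_db

def build_cnn_lstm(input_shape=(128, 128, 1), num_classes=4):
    """2D CNN front-end over the Mel-spectrogram, LSTM over the time axis."""
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        # Collapse (freq, time, channels) into a (time, features) sequence
        # so the recurrent layer can read the spectrogram frame by frame.
        layers.Permute((2, 1, 3)),        # -> (time=32, freq=32, channels=64)
        layers.Reshape((32, 32 * 64)),    # -> (time=32, features=2048)
        layers.Bidirectional(layers.LSTM(64)),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In such a setup, each clip's spectrogram would be cropped or padded to the fixed frame count before training, and the F1, precision, and recall values mentioned above could be computed with a standard classification report (e.g., sklearn.metrics.classification_report) on the held-out predictions.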
