Abstract

Speech emotion recognition, like most audio machine learning tasks, relies on so-called framing: dividing the original audio signal into frames of a certain size, each of which is processed separately. This article compares the effect of frame size on emotion recognition results, using a CNN as an example. For the experiments, the CREMA-D dataset was used, augmented with noise addition, time stretching, and pitch shifting. We achieved a recognition accuracy of 98.8% using a dynamic frame size.
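The framing step described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the function name, frame size, and hop size are illustrative assumptions:

```python
import numpy as np

def frame_signal(signal, frame_size, hop_size):
    """Split a 1-D audio signal into overlapping frames.

    frame_size and hop_size are in samples; the tail that does not
    fill a whole frame is dropped (a common convention).
    """
    num_frames = 1 + (len(signal) - frame_size) // hop_size
    # Build a (num_frames, frame_size) index matrix: each row is the
    # sample indices of one frame, shifted by hop_size per row.
    indices = (np.arange(frame_size)[None, :]
               + hop_size * np.arange(num_frames)[:, None])
    return signal[indices]

# Example: a 1-second signal at 16 kHz, 25 ms frames with a 10 ms hop.
sr = 16000
signal = np.random.randn(sr)
frames = frame_signal(signal, frame_size=int(0.025 * sr), hop_size=int(0.010 * sr))
print(frames.shape)  # (98, 400)
```

Each resulting frame can then be converted to a spectral feature (e.g. a mel spectrogram) and fed to the CNN; varying `frame_size` is what the article's comparison is about.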
