Abstract

Processing generalized sound events to predict the emotions they might evoke is a relatively young research field. Tools, datasets, and methodologies for addressing such a challenging task are still under development and far from any standardized format. This work aims to fill this gap by revealing and exploiting similarities potentially shared by the perception of emotions evoked by sound events and by music. To this end, we propose (a) the use of temporal modulation features and (b) a transfer learning module based on an Echo State Network to assist the prediction of valence and arousal measurements associated with generalized sound events. The effectiveness of the proposed transfer learning solution is demonstrated through a thoroughly designed experimental phase employing both sound and music data. The results demonstrate the importance of transfer learning in this field and encourage further research on approaches that address the problem in a cooperative way.
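The abstract does not detail the Echo State Network architecture, so the following is only a minimal sketch of how an ESN regressor for valence/arousal values is typically built in standard reservoir-computing practice: a fixed random reservoir summarizes a sequence of per-frame features, and a ridge-regression readout maps the final reservoir state to the two emotion dimensions. All dimensions, hyperparameters, and names below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: T frames of D-dimensional temporal-modulation
# features per clip; reservoir size N and the hyperparameters below are
# assumptions, not values reported in the paper.
D, N = 40, 200          # input feature dimension, reservoir size
spectral_radius = 0.9   # scales recurrent weights toward the echo-state property
leak_rate = 0.3         # leaky-integrator update coefficient

# Fixed random input and recurrent weights; W is rescaled so its largest
# eigenvalue magnitude equals the target spectral radius.
W_in = rng.uniform(-0.5, 0.5, size=(N, D))
W = rng.uniform(-0.5, 0.5, size=(N, N))
W *= spectral_radius / max(abs(np.linalg.eigvals(W)))

def reservoir_state(features):
    """Run a (T, D) feature sequence through the reservoir; return final state."""
    x = np.zeros(N)
    for u in features:
        pre = np.tanh(W_in @ u + W @ x)
        x = (1 - leak_rate) * x + leak_rate * pre
    return x

def train_readout(X_states, Y, ridge=1e-2):
    """Ridge-regression readout: (n, N) reservoir states -> (n, 2) targets."""
    A = X_states.T @ X_states + ridge * np.eye(N)
    return np.linalg.solve(A, X_states.T @ Y)  # (N, 2) readout weights

# Toy usage with synthetic data standing in for music (source-domain) clips.
n_clips, T = 50, 100
feats = [rng.normal(size=(T, D)) for _ in range(n_clips)]
states = np.vstack([reservoir_state(f) for f in feats])
targets = rng.uniform(-1, 1, size=(n_clips, 2))  # synthetic valence/arousal
W_out = train_readout(states, targets)

pred = reservoir_state(feats[0]) @ W_out
print(pred)  # predicted (valence, arousal) for one clip
```

In a transfer-learning setting such as the one the abstract describes, a readout trained on music data could then be refit or adapted on a small set of generalized sound events while the reservoir stays fixed; how the paper actually couples the two domains is not specified here.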
