Abstract

Predicting the emotions evoked by generalized sound events is a relatively recent research domain that still requires attention. This work presents a framework aiming to reveal potential similarities in how emotions evoked by sound events and by songs are perceived. To this end, the following are proposed: (a) the use of temporal modulation features, (b) a transfer learning module based on an echo state network, and (c) a k-medoids clustering algorithm predicting valence and arousal measurements associated with generalized sound events. The effectiveness of the proposed solution is demonstrated through a thoroughly designed experimental phase employing both sound and music data. The results demonstrate the importance of transfer learning in this field and encourage further research on approaches that address the problem in a synergistic way.
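The abstract names three concrete components: temporal modulation features, an echo state network (ESN) used for transfer, and k-medoids clustering over valence/arousal. The Python sketch below shows one way these pieces could fit together; it is not the authors' implementation, and the data shapes, hyperparameters, and the rule for transferring labels from music clips to sound events are all illustrative assumptions.

```python
# Minimal sketch, assuming: feature sequences are already extracted as
# (T, d) arrays, an ESN reservoir's final state serves as a fixed-length
# embedding, and sound events inherit valence/arousal from music clips
# that fall in the same k-medoids cluster. All of this is hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(d, n_reservoir=100, spectral_radius=0.9):
    """Create one fixed, random ESN reservoir shared by all inputs."""
    W_in = rng.uniform(-0.5, 0.5, (n_reservoir, d))
    W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def esn_state(X, W_in, W, leak=0.3):
    """Run the reservoir over a (T, d) sequence; return the final
    leaky-integrated state as a fixed-length embedding."""
    x = np.zeros(W.shape[0])
    for row in X:
        x = (1 - leak) * x + leak * np.tanh(W_in @ row + W @ x)
    return x

def k_medoids(D, k, n_iter=50):
    """Plain k-medoids over a precomputed distance matrix D."""
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, np.argmin(D[:, medoids], axis=1)

# Toy stand-ins: 20 music clips with known (valence, arousal) labels and
# 5 unlabeled sound events, each a sequence of 8-dim modulation features.
music = [rng.normal(size=(60, 8)) for _ in range(20)]
music_va = rng.uniform(-1, 1, size=(20, 2))   # hypothetical annotations
sounds = [rng.normal(size=(60, 8)) for _ in range(5)]

W_in, W = make_reservoir(d=8)
emb = np.stack([esn_state(s, W_in, W) for s in music + sounds])
D = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)
medoids, labels = k_medoids(D, k=4)

# Each sound event inherits the mean valence/arousal of the music clips
# in its cluster -- one plausible reading of the transfer step.
for i, lab in enumerate(labels[len(music):]):
    peers = np.where(labels[:len(music)] == lab)[0]
    if peers.size:
        v, a = music_va[peers].mean(axis=0)
        print(f"sound {i}: valence={v:+.2f}, arousal={a:+.2f}")
    else:
        print(f"sound {i}: no labeled music clips in its cluster")
```

The fixed random reservoir reflects the standard ESN design choice: only a readout (here replaced by clustering for simplicity) adapts to the data, which is what makes the reservoir reusable across the music and sound domains.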
