Abstract

The emotionally rich parts of a drama video are often what attract viewers most. Emotion-based highlight extraction is useful for applications such as drama video retrieval and automatic trailer generation. In this paper, we propose a system that uses music emotion and human faces as features for automatically extracting the emotion highlights of a drama video. These high-level audiovisual features are used because music evokes an emotional response from the viewer and characters express emotion on their faces. To avoid interference from speech signals and environmental noise, we develop a novel two-stage music emotion recognition scheme. We first detect the presence of incidental music in a drama video using an audio fingerprinting technique, and then perform emotion recognition on the noise-free music available from the album of the incidental music. This simple but effective approach greatly improves the accuracy of music emotion recognition. In addition to the conventional subjective evaluation, we propose a new metric for quantitative performance evaluation of highlight extraction. Evaluation results are provided to illustrate the performance of the system.
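The two-stage idea can be illustrated with a minimal sketch. It is not the system's actual implementation: it assumes fingerprints are already available as sets of integer hashes, uses a simple containment score for matching, and takes the emotion of each album track from a precomputed table. The names `identify_track`, `segment_emotion`, the threshold value, and the emotion labels are all hypothetical.

```python
from typing import Dict, Optional, Set

Fingerprint = Set[int]  # assumption: a fingerprint is a set of integer hashes


def containment(segment_fp: Fingerprint, track_fp: Fingerprint) -> float:
    """Fraction of the segment's hashes that also occur in the album track."""
    if not segment_fp:
        return 0.0
    return len(segment_fp & track_fp) / len(segment_fp)


def identify_track(
    segment_fp: Fingerprint,
    album_fps: Dict[str, Fingerprint],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> Optional[str]:
    """Stage 1: decide whether the drama segment contains incidental music
    and, if so, which album track it is. Returns None when nothing matches."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, track_fp in album_fps.items():
        score = containment(segment_fp, track_fp)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def segment_emotion(
    segment_fp: Fingerprint,
    album_fps: Dict[str, Fingerprint],
    album_emotions: Dict[str, str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Stage 2: report the emotion recognized on the clean album track,
    never on the noisy drama audio. The table stands in for a recognizer."""
    track = identify_track(segment_fp, album_fps, threshold)
    if track is None:
        return None
    return album_emotions[track]


# Toy data: two album tracks and their (assumed, precomputed) emotion labels.
album_fps = {"theme_a": {1, 2, 3, 4, 5, 6}, "theme_b": {10, 11, 12, 13}}
album_emotions = {"theme_a": "sad", "theme_b": "tense"}

# A segment where speech adds one spurious hash (99) on top of the music.
noisy_segment = {2, 3, 4, 99}
# A segment with no incidental music at all.
speech_only_segment = {50, 51}

music_result = segment_emotion(noisy_segment, album_fps, album_emotions)
speech_result = segment_emotion(speech_only_segment, album_fps, album_emotions)
```

In this sketch the speech and noise in the drama audio affect only the matching step; the emotion label always comes from the clean track, which is the point of the two-stage design.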

