Abstract

This paper presents a novel approach to automatically identifying characters in films using audio-visual cues and text analysis. The approach consists of three stages: (i) frontal face track detection and clustering, (ii) face track classification, and (iii) name assignment. A Finite State Machine (FSM) method is used to filter the faces detected in each frame and to build face tracks. The face tracks are clustered using constrained K-Centers, and the tracks located in the central area of each cluster are set as exemplars. The marginal points of each cluster and the newly detected non-frontal face tracks are then classified against these exemplars using complementary audio and visual cues. Character names are ranked by their number of occurrences in the film script, and face track clusters are ranked by their track counts; names are assigned to clusters in matching rank order. Experiments conducted on two feature-length films gave promising results.
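
The rank-based name assignment in stage (iii) can be illustrated with a minimal sketch. It assumes the script has already been reduced to a list with one character name per occurrence, and that every face track carries a cluster label. The function name `assign_names_by_rank` and both input formats are hypothetical, and the abstract does not say how occurrences are counted or how ties are broken, so this is an illustration rather than the authors' implementation.

```python
from collections import Counter


def assign_names_by_rank(script_names, track_cluster_ids):
    """Pair the k-th most frequent name with the k-th largest cluster.

    script_names: one entry per name occurrence in the script (assumed format).
    track_cluster_ids: one cluster label per face track (assumed format).
    Returns a dict mapping cluster label -> character name.
    """
    # Rank names by how often they occur in the script.
    ranked_names = [name for name, _ in Counter(script_names).most_common()]
    # Rank clusters by how many face tracks they contain.
    ranked_clusters = [cid for cid, _ in Counter(track_cluster_ids).most_common()]
    # zip stops at the shorter list, so surplus names or clusters stay unassigned.
    return dict(zip(ranked_clusters, ranked_names))


if __name__ == "__main__":
    names = ["RICK", "ILSA", "RICK", "SAM", "RICK", "ILSA"]
    clusters = [2, 2, 2, 0, 0, 1]
    print(assign_names_by_rank(names, clusters))
```

Because the pairing depends only on rank order, a single mis-ranked cluster shifts the assignment for that position, which is one reason the quality of the clustering and classification stages matters before names are assigned.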
