Abstract

Movie genre classification is an active research area in machine learning; however, the content of movies can vary widely within a single genre label. We expand these 'coarse' genre labels by identifying 'fine-grained' contextual relationships within the multi-modal content of videos. By leveraging pre-trained 'expert' networks, we learn the influence of different combinations of modalities for multi-label genre classification. We then continue to fine-tune this 'coarse' genre classification network in a self-supervised manner to sub-divide the genres based on the multi-modal content of the videos. Our approach is demonstrated on a new multi-modal, 37,866,450-frame, 8,800-movie-trailer dataset, MMX-Trailer-20, which includes pre-computed audio, location, motion, and image embeddings.
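As a rough illustration of the fusion idea described above, the sketch below concatenates pre-computed per-modality embeddings and scores independent per-genre probabilities with a sigmoid output. This is a minimal assumption-laden sketch, not the paper's implementation: the embedding dimensions, concatenation-based fusion, 20-genre output, and random weights are all placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-computed "expert" embeddings for one trailer
# (the 128-d size per modality is an assumption for illustration).
audio = rng.standard_normal(128)
location = rng.standard_normal(128)
motion = rng.standard_normal(128)
image = rng.standard_normal(128)

# Late fusion by concatenation into a single feature vector.
fused = np.concatenate([audio, location, motion, image])  # shape (512,)

# A single linear layer stands in for the multi-label genre head;
# in practice these weights would be learned, not random.
n_genres = 20
W = rng.standard_normal((n_genres, fused.size)) * 0.01
b = np.zeros(n_genres)

logits = W @ fused + b
probs = 1.0 / (1.0 + np.exp(-logits))  # independent per-genre scores

# Multi-label prediction: every genre above a threshold is assigned,
# so a trailer can receive several genre labels at once.
predicted = probs > 0.5
```

Using independent sigmoids (rather than a softmax) is what makes the head multi-label: each genre is scored on its own, so one trailer can be, say, both "Action" and "Comedy".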
