Movie genre prediction based on the bidirectional encoder representations from transformer

Lu Li, Tongyu Li, Jin Lin

doi:10.54254/2755-2721/47/20241383

Abstract

The rapid expansion of digital media has made genre prediction increasingly important, both for targeting audiences effectively and for helping filmmakers understand viewer preferences. In this study, we introduce a novel method for predicting movie genres based on the Bidirectional Encoder Representations from Transformers (BERT) deep learning model. We use a dataset collected from the Douban website comprising 5,000 movies, each with a cover image, title, and genre information. The work addresses several key challenges: extracting features from textual data, categorizing movie genres, and accounting for cultural nuances. Textual features are extracted with BERT's bidirectional representation, specifically 'bert-base-chinese', the variant tailored for Chinese-language tasks, while visual features from the movie covers are extracted with the Wide ResNet-50-2 architecture. The combined features are then classified, and the resulting model achieves an accuracy of 34.67%, a recall of 74.5%, and an F1 score of 0.4755. The study demonstrates the potential of the BERT model for predicting movie genres and offers insights for future research in multimedia content classification.
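The feature-combination step can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the abstract does not say how the features are combined, so concatenation is assumed here, as are the multi-label sigmoid output, the 0.5 threshold, the hidden size, and the genre count. The class name `FusionClassifier` is hypothetical, and random tensors stand in for the real BERT and Wide ResNet-50-2 features so that the example runs without downloading any pretrained weights.

```python
import torch
from torch import nn

TEXT_DIM = 768    # hidden size of bert-base-chinese
IMAGE_DIM = 2048  # pooled feature size of Wide ResNet-50-2 before its final layer
NUM_GENRES = 20   # hypothetical: the abstract does not state the number of genres


class FusionClassifier(nn.Module):
    """Hypothetical head: concatenate text and image features, then classify."""

    def __init__(self, text_dim=TEXT_DIM, image_dim=IMAGE_DIM,
                 hidden_dim=512, num_genres=NUM_GENRES):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_genres),
        )

    def forward(self, text_feats, image_feats):
        # Assumed fusion strategy: simple concatenation along the feature axis.
        fused = torch.cat([text_feats, image_feats], dim=1)
        return self.head(fused)  # raw logits, one per genre


torch.manual_seed(0)
model = FusionClassifier().eval()

# Random stand-ins for a batch of 4 movies (title features, cover features).
text_feats = torch.randn(4, TEXT_DIM)
image_feats = torch.randn(4, IMAGE_DIM)

with torch.no_grad():
    logits = model(text_feats, image_feats)
    probs = torch.sigmoid(logits)  # assumed multi-label: independent per-genre scores

predicted = probs > 0.5  # assumed decision threshold
```

In a full pipeline, `text_feats` would come from the pooled output of 'bert-base-chinese' applied to the title, and `image_feats` from Wide ResNet-50-2 with its final classification layer removed. A sigmoid output allows a movie to carry several genres at once; if each movie had exactly one genre, a softmax over the logits would be the usual alternative.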

Full Text