Abstract

A Film Synopsis Genre Classifier Based on Majority Vote

Highlights

  • Automatic classification of text is known to be a difficult task for a computer

  • For this experiment, all textual information was taken from the English version of Wikipedia; the only language issue we may encounter is the translation of synopses into English

  • The movie signature is compared to the true genres, and the average accuracy is computed in the following way, where MS represents the movie signature, GT the ground truth, N the number of test synopses, lenSum the number of true positive predictions for a given synopsis, and SumttT the total number of genres expected across the test set
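
The accuracy formula itself is not reproduced in this summary. As a rough illustration only, the sketch below assumes one plausible reading of the quantities named above: lenSum is the size of the overlap between MS and GT for one synopsis, and the accuracy is the sum of lenSum over the N test synopses divided by SumttT. The function name `average_accuracy` and the example genres are hypothetical, and the paper's actual formula may differ.

```python
from typing import Sequence, Set


def average_accuracy(signatures: Sequence[Set[str]],
                     ground_truths: Sequence[Set[str]]) -> float:
    """Assumed reading: total true positives / total expected genres.

    signatures[i]    -- MS, the predicted genres for test synopsis i
    ground_truths[i] -- GT, the true genres for test synopsis i
    """
    if len(signatures) != len(ground_truths):
        raise ValueError("need one signature per ground-truth entry")

    # SumttT: total number of genres expected across the test set.
    total_expected = sum(len(gt) for gt in ground_truths)
    if total_expected == 0:
        raise ValueError("ground truth contains no genres")

    # lenSum for one synopsis: predicted genres that are also true genres.
    true_positives = sum(len(ms & gt)
                         for ms, gt in zip(signatures, ground_truths))
    return true_positives / total_expected


# Made-up example with N = 3 test synopses.
predicted = [{"Drama", "Comedy"}, {"Horror"}, {"Action"}]
expected = [{"Drama"}, {"Horror", "Thriller"}, {"Comedy"}]
print(average_accuracy(predicted, expected))
```

Under this reading, extra predicted genres are not penalised; only expected genres that were missed lower the score.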


Summary

ABSTRACT

We propose an automatic movie genre classification system based on different features extracted from textual synopses. The system is first trained on thousands of movie synopses from open online databases, learning relationships between textual signatures and movie genres. It is then tested on other movie synopses, and its results are compared to the true genres obtained from Wikipedia and the Open Movie Database (OMDB). The results show that our algorithm achieves a classification accuracy exceeding 75%.

INTRODUCTION
RELATED WORK
Term frequencies comparison
Term frequency
Term frequency inverse document frequency
Sets comparison
Comparison of the angle between vectors
Cosine similarity
String comparison
Dice similarity
DATASETS
SIMILARITY MEASURES
RESULTS
CONCLUSION
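
Two of the measures named in the outline, cosine similarity and Dice similarity, have standard textbook definitions. The sketch below shows those definitions only, assuming raw term-frequency vectors over whitespace-split tokens for the cosine measure and character bigrams for the Dice measure. The tokenization, the bigram choice, and all function names are assumptions for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Raw term counts over lowercased, whitespace-split tokens (assumed)."""
    return Counter(text.lower().split())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the two term-frequency vectors."""
    tf_a, tf_b = term_frequencies(text_a), term_frequencies(text_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def dice_similarity(a: str, b: str) -> float:
    """Dice coefficient 2|X ∩ Y| / (|X| + |Y|) over character-bigram sets."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        return 0.0  # strings too short to form any bigram
    shared = len(bigrams_a & bigrams_b)
    return 2 * shared / (len(bigrams_a) + len(bigrams_b))


print(cosine_similarity("a heist goes wrong", "a heist in space"))
print(dice_similarity("night", "nacht"))
```

Both functions return values between 0 and 1, so scores from the two measures can be compared or combined on a common scale.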
