Abstract

Machine learning is now applied in almost every field, from quality testing of materials to building powerful computer vision tools. One such application is the recommendation system, which suggests products to users based on their preferences. In this paper, we focus on one kind of recommendation system: movie recommendation. We use user reviews of a movie to establish a general outlook on it, and then use that outlook to recommend the movie to other users. However, the sheer number of available reviews overwhelms even sophisticated review systems. There is therefore a need for a method that extracts meaningful information from the available reviews and uses it to classify each review and predict its sentiment. In a typical scenario, a review can be positive, negative, or indifferent about a movie. However, the available research in the field mainly treats this as a two-class classification problem: positive and negative. The most popular work in this field was performed on the Stanford and Rotten Tomatoes datasets, which are somewhat outdated. Our work is based on self-scraped reviews from the IMDB website, and we have annotated each review with one of three classes: positive, negative, or neutral. Our dataset is called JUMRv1 (Jadavpur University Movie Recommendation dataset version 1). To evaluate JUMRv1, we took an exhaustive approach, testing various combinations of word embeddings, feature selection methods, and classifiers. We also analysed any performance trends we observed and attempted to explain them. Our work sets a benchmark for movie recommendation systems based on the newly developed dataset, using three-class sentiment classification.

Highlights

  • Because of the psychological nature of reviews, sentiment analysis (SA) of movie reviews is a challenging task for researchers

  • … researchers, was trained on the entire Wikipedia corpus. It was used stand-alone with all 200 of its available features, and also together with different feature selection methods that ranked the importance of the features, keeping the top 150, 100, and 50 of them in the experiments

  • We studied the problem of movie recommendation systems, where we considered online movie reviews in order to suggest movies to people


Summary

Introduction

Because of the psychological nature of reviews, sentiment analysis (SA) of movie reviews is a challenging task for researchers. SA is the process of analysing text and extracting its subjective content. Here, it determines the review author’s attitude towards a movie: positive, negative, or indifferent. SA is currently used across the internet for purposes such as political profiling, recommendation engines, fact checking, and spam filtering. It has rapidly attracted attention among researchers working in machine learning and natural language processing. With the advent of social media, the amount of data on the internet has boomed. Be it reviews, tweets, comments, poetry, stories, articles, or blogs, these resources can be tapped into and utilised by users. We used various methods to process and classify these movie reviews into one of three classes: positive, negative, or neutral.
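The highlights mention ranking 200 embedding features and keeping the top 150, 100, and 50 before classifying reviews into three classes. The following is a minimal sketch of that general idea, not the authors' pipeline. Everything in it is an assumption made for illustration: the data is synthetic (not JUMRv1), the Fisher-style score stands in for whichever feature selection methods the paper used, the nearest-centroid classifier stands in for its classifiers, and all function names are hypothetical.

```python
import random

LABELS = ("negative", "neutral", "positive")  # the three review classes


def make_synthetic_vectors(n_per_class=60, n_features=200, n_informative=20, seed=0):
    """Hypothetical stand-in for embedded reviews (not JUMRv1 data).

    Only the first n_informative features carry any class signal.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label_idx in range(len(LABELS)):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += (label_idx - 1) * 1.5  # shift the class mean
            X.append(row)
            y.append(label_idx)
    return X, y


def fisher_scores(X, y):
    """Score each feature by between-class over within-class variation."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = sum(column) / len(column)
        between = within = 0.0
        for c in set(y):
            vals = [v for v, lab in zip(column, y) if lab == c]
            mu = sum(vals) / len(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, keep):
    """Fit class centroids on the kept features and score the test split."""
    centroids = {}
    for c in set(y_train):
        rows = [row for row, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in keep]
    correct = 0
    for row, lab in zip(X_test, y_test):
        point = [row[j] for j in keep]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += int(pred == lab)
    return correct / len(y_test)


X, y = make_synthetic_vectors()
order = list(range(len(X)))
random.Random(1).shuffle(order)
split = int(0.8 * len(order))  # 80/20 train/test split
X_train = [X[i] for i in order[:split]]
y_train = [y[i] for i in order[:split]]
X_test = [X[i] for i in order[split:]]
y_test = [y[i] for i in order[split:]]

scores = fisher_scores(X_train, y_train)  # rank on training data only
results = {}
for k in (200, 150, 100, 50):
    keep = top_k_features(scores, k)
    results[k] = nearest_centroid_accuracy(X_train, y_train, X_test, y_test, keep)
    print(f"top {k:3d} features: accuracy = {results[k]:.3f}")
```

Ranking the features once on the training split and then slicing the same ranking at each size keeps the comparison between 200, 150, 100, and 50 features consistent, and avoids leaking test labels into the selection step.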

