Abstract
The film industry has grown every year alongside the rapid development of technology. Technological developments have also made it easier for the public to access movies through various websites. With so many movies to choose from, people rely on other viewers' reviews to judge a movie's quality. However, the large number of audience reviews for each movie makes it difficult to tell good movies from bad ones. Sentiment analysis of movie reviews addresses this problem. In this research, the classification method is Modified Balanced Random Forest, chosen because it can handle imbalanced data, increase accuracy, and reduce time complexity. Word2Vec is used for feature extraction because previous research reports that the vectors it produces reflect the contextual similarity between two words. The best model in this research was built without stemming in the preprocessing stage, with 300-dimensional Word2Vec vectors, and with the Modified Balanced Random Forest classifier; it achieves an F1-score of 84.15%.
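As a rough illustration of two ideas named in the abstract, the sketch below shows class-balanced bootstrap sampling (the general idea behind a plain balanced random forest) and the computation of an F1-score. The abstract does not describe the modifications made in Modified Balanced Random Forest, so this is not the authors' method; the function names `balanced_bootstrap` and `f1_score` and the toy labels are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def balanced_bootstrap(labels, rng):
    """Return indices of a class-balanced bootstrap sample.

    Each class contributes as many draws (with replacement) as the
    minority class has members. This is the sampling idea of a plain
    balanced random forest, NOT the paper's modified variant.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_minority = min(len(members) for members in by_class.values())
    sample = []
    for members in by_class.values():
        sample.extend(rng.choices(members, k=n_minority))
    return sample


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced labels: 8 positive reviews, 2 negative reviews.
toy_labels = [1] * 8 + [0] * 2
sample_indices = balanced_bootstrap(toy_labels, random.Random(0))
```

In a full pipeline, each tree of the forest would be trained on one such balanced sample of review vectors, and the F1-score would be computed on held-out reviews.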
More From: Building of Informatics, Technology and Science (BITS)