Abstract
As the volume of information on the web grows, online movie reviews are becoming a significant information resource for Internet users. However, users post thousands of movie reviews every day, and it is hard for readers to summarize them manually. Movie review mining and summarization is one of the challenging tasks in natural language processing. An automatic approach to summarizing lengthy movie reviews is therefore desirable, as it allows users to quickly recognize the positive and negative aspects of a movie. This study employs a feature extraction technique called bag of words (BoW) to extract features from movie reviews and represent the reviews as a vector space model or feature vector. The next phase uses the Naïve Bayes machine learning algorithm to classify the movie reviews (represented as feature vectors) into positive and negative. Next, an undirected weighted graph is constructed from the pairwise semantic similarities between classified review sentences, such that the graph nodes represent review sentences and the edge weights indicate semantic similarity. The weighted graph-based ranking algorithm (WGRA) is applied to compute a rank score for each review sentence in the graph. Finally, the sentences (graph nodes) with the highest rank scores are chosen to produce the extractive summary. Experimental results reveal that the proposed approach is superior to other state-of-the-art approaches.
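The graph construction and ranking steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: Jaccard word overlap stands in for the semantic similarity measure, a TextRank-style weighted PageRank update stands in for WGRA (whose exact formula is not given here), and the function names, damping factor, and iteration count are all hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def similarity(a, b):
    """Jaccard word overlap (assumption: a stand-in for semantic similarity)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_graph(sentences):
    """Return a symmetric weight matrix: nodes are sentences, edges are similarities."""
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = similarity(sentences[i], sentences[j])
    return weights


def rank_sentences(weights, damping=0.85, iterations=50):
    """Weighted PageRank-style scores (assumption: not the paper's exact WGRA)."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    strength = [sum(row) for row in weights]  # total edge weight per node
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / strength[j] * scores[j]
                for j in range(n)
                if strength[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences, returned in their original order."""
    scores = rank_sentences(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(review_sentences, k=2)` on a list of sentences from one polarity class returns the two sentences most strongly connected to the rest. A sentence that shares no words with any other receives only the baseline score and is ranked last.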
Highlights
With the development of Web 2.0, which emphasizes user participation, more and more websites, such as the Internet Movie Database (IMDb, a movie review website) and Amazon, encourage users to post reviews of the products they are interested in.
Previous approaches to movie review summarization are limited to generating feature-based summaries rather than generic summaries. Therefore, this study proposes a review mining and summarization (RMS) approach that integrates a supervised machine learning (ML) approach with a graph-based ranking algorithm to automatically generate a generic summary of movie reviews. The proposed approach operates in the following manner: first, we employ a simple feature extraction technique called bag of words (BoW) to extract features from movie reviews and represent them as a vector space model or feature vector. The next phase uses a Naïve Bayes classifier to classify the movie reviews into positive and negative.
Evaluation Data. The proposed approach comprises two components: the first is a Naïve Bayes (NB) classifier, which classifies the review documents into positive and negative. The second is a semantic graph-based ranking algorithm, which performs the task of movie review summarization.
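The classification component described in the highlights above can be sketched as follows. This is a minimal illustration, assuming a multinomial Naïve Bayes model with add-one (Laplace) smoothing over raw word counts; the paper's actual preprocessing, feature weighting, and training data are not specified here, and the class and function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Map a review to word counts (its bag-of-words feature vector)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative assumption)."""

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)  # number of documents per class
        self.word_counts = {c: Counter() for c in self.class_docs}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        features = bag_of_words(text)
        best_label, best_score = None, -math.inf
        for c in self.class_docs:
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.class_docs[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for word, count in features.items():
                if word in self.vocab:  # ignore words unseen in training
                    score += count * math.log((self.word_counts[c][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label
```

A toy usage example with made-up training reviews: after `NaiveBayes().fit(["a wonderful moving film", "a boring dull film"], ["positive", "negative"])`, calling `predict` on a new review returns the class whose word distribution best matches it. The reviews assigned to each class would then be split into sentences and passed to the ranking component.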
Summary
With the development of Web 2.0, which emphasizes user participation, more and more websites, such as the Internet Movie Database (IMDb, a movie review website) and Amazon, encourage users to post reviews of the products they are interested in. The number of reviews a product receives grows rapidly as millions of customers post reviews, which results in information overload [1]. Because of this overload, it is difficult for a customer to scan every review of a product in order to decide whether to purchase it. A summary of movie reviews can also help a movie service provider such as Netflix to quickly understand the viewing patterns and interests of its customers.