Abstract

Today, more and more product reviews become available on the Internet, e.g., product review forums, discussion groups, and Blogs. However, it is almost impossible for a customer to read all of the different and possibly even contradictory opinions and make an informed decision. Therefore, mining online reviews (opinion mining) has emerged as an interesting new research direction. Extracting aspects and the corresponding ratings is an important challenge in opinion mining. An aspect is an attribute or component of a product, e.g. 'screen' for a digital camera. It is common that reviewers use different words to describe an aspect (e.g. 'LCD', 'display', 'screen'). A rating is an intended interpretation of the user satisfaction in terms of numerical values. Reviewers usually express the rating of an aspect by a set of sentiments, e.g. 'blurry screen'. In this paper we present three probabilistic graphical models which aim to extract aspects and corresponding ratings of products from online reviews. The first two models extend standard PLSI and LDA to generate a rated aspect summary of product reviews. As our main contribution, we introduce Interdependent Latent Dirichlet Allocation (ILDA) model. This model is more natural for our task since the underlying probabilistic assumptions (interdependency between aspects and ratings) are appropriate for our problem domain. We conduct experiments on a real life dataset, Epinions.com, demonstrating the improved effectiveness of the ILDA model in terms of the likelihood of a held-out test set, and the accuracy of aspects and aspect ratings.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.