The tasks of feature-level opinion mining usually include extracting product entities from consumer reviews, identifying the opinion words associated with those entities, and determining the opinions’ polarities (e.g., positive, negative, or neutral). In recent years, two major approaches have been proposed to determine opinions at the feature level: model-based methods, such as those based on Lexicalized Hidden Markov Models (L-HMMs), and statistical methods, such as the association-rule-mining-based technique. However, little work has compared these algorithms with respect to their practical ability to identify various types of review elements, such as features, opinions, intensifiers, entity phrases, and infrequent entities. Moreover, little attention has been paid to applying more discriminative learning models to these opinion mining tasks. In this paper, we not only experimentally compare these methods on a real-world review dataset, but also adopt the Conditional Random Fields (CRFs) model and evaluate its performance against the related algorithms. Furthermore, for the CRFs-based mining algorithm, we test the role of a self-tagging process under two automatic training conditions and identify the combination of learning functions that optimizes its learning performance. The comparative experiment ultimately reveals that the CRFs-based method achieves higher accuracy in mining multiple review elements than the other methods.
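To make the sequence-labeling formulation concrete, the following is a minimal sketch of how a review sentence can be cast as a token-tagging problem for a linear-chain CRF, in the token-to-feature-dict format used by common CRF toolkits. The label set (FEATURE, OPINION, INTENSIFIER, O) and the feature template here are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: casting review-element mining as sequence labeling.
# Each token is mapped to a feature dictionary; a CRF toolkit would
# learn to assign a label (FEATURE, OPINION, INTENSIFIER, O) per token.
# The feature template below is an illustrative assumption.

def token_features(tokens, i):
    """Features for token i, in the dict format common CRF toolkits accept."""
    word = tokens[i]
    feats = {
        "word.lower": word.lower(),
        "word.isupper": word.isupper(),
        "prefix2": word[:2],
        "suffix2": word[-2:],
    }
    if i > 0:
        feats["prev.lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning-of-sentence marker
    if i < len(tokens) - 1:
        feats["next.lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end-of-sentence marker
    return feats

def sentence_features(tokens):
    """Feature sequence for one sentence (one dict per token)."""
    return [token_features(tokens, i) for i in range(len(tokens))]

# Example review sentence with hypothetical gold labels for its review
# elements: "battery life" is a feature phrase, "very" an intensifier,
# "good" an opinion word.
tokens = ["The", "battery", "life", "is", "very", "good"]
labels = ["O", "FEATURE", "FEATURE", "O", "INTENSIFIER", "OPINION"]
X = sentence_features(tokens)
```

A CRF trainer would then fit transition and emission weights over many such (X, labels) pairs, which is what lets it capture dependencies between adjacent labels (e.g., that FEATURE tokens tend to follow FEATURE tokens in entity phrases), unlike purely frequency-based association rule mining.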