Abstract

Users' opinions, often called the live voice of the customer, are valuable input for business text mining, but it is difficult to extract impressions of popularity and unpopularity from texts written in natural language. The popularity and unpopularity impressions discussed here depend on users' claims, interests, and demands. This paper presents a method for determining these impressions in commodity review sentences. Multi-attribute rules are introduced to extract the impressions from sentences, and four-stage rules are defined to evaluate popularity and unpopularity impressions step by step. A deterministic multi-attribute pattern matching algorithm is used to determine the impressions efficiently. Simulation results for 2,240 review comments show that the multi-attribute pattern matching algorithm is 44.5 times faster than the Aho and Corasick method. The precision and recall of the extracted impressions for each commodity are 94% and 93%, respectively. Moreover, the precision and recall of the resulting impressions for each rule are both 95%.
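
To make the idea of a multi-attribute rule concrete, the sketch below matches rules against tokens that each carry several attributes. It is only an illustration under stated assumptions: the attribute names (`word`, `pos`, `sem`), the rule labels, and the helper names `token_matches` and `find_matches` are hypothetical, and the naive sliding-window scan stands in for the deterministic multi-attribute pattern matching algorithm, whose details the abstract does not give.

```python
from typing import Dict, List, Tuple

# Assumption: a token is a bag of named attributes, and a rule is a label
# plus a sequence of per-token attribute constraints. None of these names
# come from the paper.
Token = Dict[str, str]
Constraint = Dict[str, str]
Rule = Tuple[str, List[Constraint]]


def token_matches(token: Token, constraint: Constraint) -> bool:
    """A token satisfies a constraint if every required attribute agrees."""
    return all(token.get(key) == value for key, value in constraint.items())


def find_matches(tokens: List[Token], rules: List[Rule]) -> List[Tuple[int, str]]:
    """Naive scan: try every rule at every start position.

    Returns (start_index, rule_label) pairs sorted by position.
    """
    hits = []
    for label, pattern in rules:
        # The range is empty when the pattern is longer than the sentence.
        for start in range(len(tokens) - len(pattern) + 1):
            window = tokens[start:start + len(pattern)]
            if all(token_matches(t, c) for t, c in zip(window, pattern)):
                hits.append((start, label))
    return sorted(hits)


# Hypothetical review fragment: "battery dies quickly"
tokens = [
    {"word": "battery", "pos": "noun", "sem": "part"},
    {"word": "dies", "pos": "verb", "sem": "failure"},
    {"word": "quickly", "pos": "adverb", "sem": "degree"},
]

# Hypothetical rules: each constraint may test any subset of attributes.
rules = [
    ("unpopular", [{"sem": "part"}, {"sem": "failure"}]),
    ("popular", [{"sem": "part"}, {"sem": "praise"}]),
]

matches = find_matches(tokens, rules)
print(matches)
```

The naive scan re-examines each token once per rule and start position. A deterministic matcher would instead aim to read each token once while tracking all rules simultaneously, which is the kind of saving the reported speed-up suggests.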
