Abstract

This paper proposes a new module for mining bloggers' interests, based on Chinese text classification. The interest mining problem is recast as a Chinese text categorization problem. Each text is first pre-processed so that it can be represented in the vector space model, and is then classified with a support vector machine. A filter algorithm is also proposed to filter out text that is unrelated to any interest; after filtering, each remaining text is assigned its interest category. The new module was used to carry out an interest mining experiment, and a second experiment without the filter algorithm was run for comparison. The results of the two experiments show that the support vector machine is an effective algorithm, and the comparison between them shows that the new module makes interest mining more effective.
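
The pipeline described above (vector space representation, classification, then filtering of unrelated text) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the filter algorithm, so a simple similarity threshold is assumed here, and a nearest-centroid cosine score stands in for the support vector machine to keep the example dependency-free. The function names, the `threshold` value, the category labels, and the toy pre-segmented texts are all hypothetical.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Represent a pre-segmented text as a sparse term-frequency vector."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_centroids(labelled_docs):
    """Sum the term-frequency vectors of each interest category."""
    centroids = {}
    for tokens, label in labelled_docs:
        centroids.setdefault(label, Counter()).update(tokens)
    return centroids


def mine_interest(tokens, centroids, threshold=0.3):
    """Return the best-matching category, or None if the text is filtered out.

    Assumption: the filter is a plain score threshold; the paper's actual
    filter algorithm is not described in the abstract.
    """
    vec = tf_vector(tokens)
    label, score = max(
        ((cat, cosine(vec, cen)) for cat, cen in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if score >= threshold else None


# Toy training data: word segmentation is assumed to have been done already.
train = [
    (["足球", "比赛", "球队"], "sports"),
    (["篮球", "比赛", "球员"], "sports"),
    (["电影", "导演", "演员"], "film"),
    (["电影", "票房", "上映"], "film"),
]
centroids = build_centroids(train)

related = mine_interest(["足球", "球队", "比赛"], centroids)
unrelated = mine_interest(["天气", "下雨"], centroids)
```

In this sketch a text that shares no vocabulary with any category scores zero everywhere and is dropped by the filter, while a text that matches a category well keeps that category's label. Running the classifier without the threshold (for example `threshold=0.0`) would correspond loosely to the comparison experiment without the filter.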
