A New Intelligent Topic Extraction Model on Web

Ming Xie, Yunlu Zhang, Chanle Wu

doi:10.4304/jcp.6.3.466-473

Abstract

We tackle the problem of topic extraction on the Web. In this paper, we propose an approach to implementing ontology-based data access in WordNet whose distinguishing feature is an optimized density-based clustering OPTICS algorithm (DBCO) for extracting topics. Our solution has two desirable properties: i) it uses WordNet for word sense disambiguation of the words in learning resource documents, and ii) it maps the data space of the original method to a sentence vector space, improving on the original OPTICS algorithm. We outline the interface between our scheme and the current data Web, and show that, in contrast to existing approaches, DBCO produces no exponential blowup. Based on experiments with real-world data sets from 310 users at three study sites, we demonstrate that topic extraction with the proposed approach is efficient, especially for large-scale web learning resources. According to user rating data from four learning sites over 150 days, user ratings increased by an average of 25.18% after the system was put into use.
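To make the idea of a sentence vector space concrete, the following is a minimal sketch and not the DBCO algorithm itself. It assumes plain bag-of-words term counts, cosine distance, and a hypothetical radius parameter `eps`, none of which are specified in the abstract, and it omits the WordNet disambiguation step entirely. It shows only the neighborhood query that density-based methods such as OPTICS are built on; the function names are illustrative.

```python
import math
from collections import Counter


def sentence_vector(sentence):
    """Map a sentence to a sparse term-count vector (assumed representation)."""
    return Counter(sentence.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty sentence is treated as maximally distant from everything.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def neighbors(vectors, i, eps):
    """Indices of all other sentences within distance eps of sentence i."""
    return [
        j
        for j, v in enumerate(vectors)
        if j != i and cosine_distance(vectors[i], v) <= eps
    ]
```

Sentences that share most of their terms fall inside each other's neighborhood, while sentences with no terms in common are at distance 1 and never do. A density-based clusterer would then order or group sentences by how densely populated these neighborhoods are.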
