Abstract

Semantic web mining can be defined as the problem of automatically extracting information from a database and producing structured output in response to user queries. Most work on knowledge base systems has focused on enrichment methods that apply axioms, but these were not efficient in detecting the document topic and hence resulted in expanded queries. Although straightforward approaches exist for automatic extraction from a larger set of information, recognizing second or higher order associations between documents on the web remains a challenging issue. We present a three-fold architecture, called Semantic Resource Description with Compound Protocol (SRD-CP), to process a class of well-designed user queries based on the Reuters-21578 Text Categorization Collection data set. The first fold constructs a Semantic Pattern Tree (SPT) based on the Resource Description Framework (RDF) query language for detecting the document topic name. The RDF query language uses a compound structure to produce the result for the expanded user queries. The second fold is the design of the compound structure in the SRD-CP framework, which consists of the HTTP protocol and the Constrained Application Protocol (CoAP) to handle complex query processing. Finally, the third fold defines word co-occurrences using the association frequency form with the help of the constructed SPT. The Run Associate Apriori (RAA) algorithm is used to identify second or higher order associations between web documents. Within the SRD-CP framework, the RAA algorithm recognizes the association between web-related documents and user-fetched queries. The SRD-CP framework is empirically evaluated on the data set and is shown to be significantly more efficient in terms of document association level computation rate, processing time and user result retrieval rate.
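
The notion of a second order association can be illustrated with a small sketch. The abstract does not specify the internals of the RAA algorithm, so the code below is not that algorithm; it is a minimal illustration, under our own assumptions, of how word co-occurrence counts can expose terms that never appear in the same document but are linked through a shared co-occurring term. The function names, the `min_support` parameter and the toy documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def first_order(docs):
    """Count, for each unordered term pair, the documents containing both terms."""
    counts = defaultdict(int)
    for terms in docs.values():
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def second_order(counts, min_support=1):
    """Find term pairs that never co-occur directly but share a neighbour.

    min_support is an assumed threshold on the first order co-occurrence
    count, in the spirit of Apriori-style support pruning.
    """
    neighbours = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_support:
            neighbours[a].add(b)
            neighbours[b].add(a)

    links = defaultdict(set)
    for bridge, terms in list(neighbours.items()):
        for a, b in combinations(sorted(terms), 2):
            if b not in neighbours[a]:
                # a and b are only connected through the bridging term
                links[(a, b)].add(bridge)
    return dict(links)


# Toy documents (hypothetical, not taken from Reuters-21578)
docs = {
    "d1": ["oil", "opec"],
    "d2": ["opec", "price"],
    "d3": ["price", "inflation"],
}
counts = first_order(docs)
links = second_order(counts)
```

In this toy input, "oil" and "price" share no document, yet both co-occur with "opec", so the pair is reported as a second order association with "opec" as the bridging term.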
