Extracting Knowledge from XML Document Using Tree-Based Association Rules

S Thangarasu, D Sasikala

doi:10.1109/icica.2014.37

Abstract

The use of eXtensible Markup Language (XML) to represent and exchange web data over the Internet has increased. Because of the popularity of XML and the growing number of XML documents, extracting knowledge from this type of data has attracted more attention. Data mining is widely applied in database research to extract frequent correlations of values from both structured and semi-structured datasets. Although various methods have been proposed for mining XML documents, the research field is still immature compared to traditional data mining. Association rule mining is an appropriate technique for extracting knowledge from XML datasets. The classical model of association rule mining is based on the support and confidence measures. The goal is to experimentally evaluate association rule mining approaches in the context of XML databases. In this work, an approach is proposed to mine XML documents using Tree-based Association Rules. The result of this approach is a set of rules that capture both the structure and the content of XML documents; such rules can be stored in XML format to be queried later. The mined knowledge is fairly accurate intensional knowledge that can be used to provide quick answers to queries.
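
To make the support and confidence measures mentioned above concrete, the following is a minimal sketch for the classical itemset setting only; it is not the tree-based method proposed in this work. The function names and the toy transactions (sets of XML element names) are hypothetical and chosen purely for illustration.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    body = frozenset(antecedent)
    both = body | frozenset(consequent)
    body_support = support(body, transactions)
    if body_support == 0.0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(both, transactions) / body_support


# Hypothetical toy data: each transaction lists the element names
# that occur together in one XML record.
transactions = [
    frozenset({"author", "title", "year"}),
    frozenset({"author", "title"}),
    frozenset({"title", "year"}),
    frozenset({"author", "year"}),
]

rule_support = support({"author", "title"}, transactions)
rule_confidence = confidence({"author"}, {"title"}, transactions)
print(rule_support, rule_confidence)
```

In this toy data, the rule author => title holds in two of the four transactions, while author appears in three, so the rule is kept only if both values clear the chosen minimum thresholds. Tree-based rules apply the same two measures to subtrees rather than to flat itemsets.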

Full Text