Abstract
Many real-world problems are naturally represented with some type of relational model. As a consequence, the information used by a wide range of systems is stored in multi-relational tables. From a data mining point of view this poses a problem, since most traditional data mining algorithms were not originally designed to handle this type of data without discarding relationship information. To ameliorate this problem, we propose a hierarchical approach for handling relational data. In this approach, the relational data is converted into a hierarchical structure, with the main table as the root and the relations as the other nodes. This hierarchical representation of relational data can be used for either classification or clustering; in this paper, we use it in clustering algorithms. To do so, we propose a hierarchical distance metric to compute the similarity between the tables. In the empirical analysis, we apply the proposed approach to two well-known clustering algorithms (k-means and agglomerative hierarchical clustering). Finally, we compare the effectiveness of our approach with that of one existing relational approach.