Abstract

Clustering real-life data using attribute information alone is often insufficient, because the data come from various sources and may carry relations between data objects. This makes it necessary to include relation information when clustering, and many researchers have used it alongside attribute information. The Integrated K-means Laplacian (IKL) algorithm is one well-known method of this type: it integrates attribute information and pair-wise relations to cluster the data. However, it has issues in the creation of the normalized Laplacian matrix. The current study proposes three different ways of creating the normalized Laplacian matrix to rectify those issues, which yields three new variants of the IKL algorithm. The pair-wise similarity matrix (W) is another crucial element of the IKL algorithm. Earlier, the Gaussian function was used to create W in IKL, whereas this study proposes 12 different kernel functions to form W instead and studies their influence on the performance of the existing and proposed algorithms, using nine benchmark datasets. Further, the performance of the proposed algorithms is compared with that of existing algorithms from the recent literature, using seven clustering evaluation metrics and the running time of the algorithms. The comparisons indicate that the proposed modifications to the IKL algorithm are significant, and statistical tests support this. In addition, an analysis is carried out in which the XX’ matrix is replaced with kernel functions, and the resulting improvements in performance are studied.
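
For readers unfamiliar with the two ingredients named above, the following is a minimal sketch of a Gaussian pair-wise similarity matrix W and the standard symmetric normalized Laplacian, I - D^(-1/2) W D^(-1/2). It is a textbook construction and not the paper's method: the abstract does not specify the three proposed Laplacian variants or the 12 kernel functions, so none of them are reproduced here. The function names, the bandwidth `sigma`, the zero diagonal of W, and the toy points are all assumptions made for illustration.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Pair-wise Gaussian similarity W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).

    Assumption: the diagonal is left at zero (no self-similarity).
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return W


def normalized_laplacian(W):
    """Standard symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2).

    D is the diagonal degree matrix with D[i][i] = sum_j W[i][j].
    Objects with zero degree get a scaling factor of 0 to avoid dividing by zero.
    """
    n = len(W)
    degrees = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degrees]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
W = gaussian_similarity(points, sigma=1.0)
L = normalized_laplacian(W)
```

In this sketch, swapping the Gaussian for another kernel would only change the line that fills `W[i][j]`; the Laplacian construction stays the same.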
