Abstract

Multi-view clustering, which optimally integrates complementary information from different views to improve clustering performance, has drawn considerable attention in recent years. Despite recent advances, difficulties remain when dealing with high-dimensional, heterogeneous data, especially categorical sequences. These challenges motivated us to develop a novel Multi-view Kernel Clustering framework for Categorical sequences (MKCC), in which views are expressed as kernel matrices and a weighted combination of the instances is learned in parallel with the partitioning. Concretely, MKCC constructs the kernel matrix adaptively, without requiring an explicitly defined kernel function. However, storing the full kernel matrix requires O(N^2) memory. To address this issue, we integrate a simple and efficient method for approximating the kernel matrix. Based on the framework, we also propose a new multi-view clustering algorithm and a cluster validity index for categorical sequences. An empirical analysis on synthetic data sets and several commonly used real-world data sets demonstrates the suitability of the proposal, with the results showing the method's outstanding performance.
