Abstract
Gaussian mixture model (GMM) clustering has been extensively studied due to its effectiveness and efficiency. Though demonstrating promising performance in various applications, it cannot effectively address the absent features among data, which is not uncommon in practical applications. In this article, different from existing approaches that first impute the absence and then perform GMM clustering tasks on the imputed data, we propose to integrate the imputation and GMM clustering into a unified learning procedure. Specifically, the missing data is filled by the result of GMM clustering, and the imputed data is then taken for GMM clustering. These two steps alternatively negotiate with each other to achieve optimum. By this way, the imputed data can best serve for GMM clustering. A two-step alternative algorithm with proved convergence is carefully designed to solve the resultant optimization problem. Extensive experiments have been conducted on eight UCI benchmark datasets, and the results have validated the effectiveness of the proposed algorithm.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have