Objective-based clustering comprises an important class of cluster-analysis techniques; however, because their objective functions are non-convex, these methods are easily trapped in local minima, which degrades the final clustering performance. Recently, convex clustering (CC) has attracted considerable attention because it enjoys global optimality and independence from initialization. However, one of its downsides is its lack of robustness to data contaminated with outliers, which can cause the clustering results to deviate. To improve its robustness, this paper proposes an outlier-aware robust convex clustering algorithm, termed RCC. Specifically, RCC extends CC by modeling the contaminated data as the sum of the clean data and sparse outliers, and then adding a Lasso-type regularization term to the CC objective to reflect the sparsity of the outliers. In this way, RCC can resist outliers to a great extent while maintaining the advantages of CC, including the convexity of its objective. Furthermore, we develop a block coordinate descent approach with a convergence guarantee and find that RCC usually converges within only a few iterations. Finally, the effectiveness and robustness of RCC are empirically corroborated by numerical experiments on both synthetic and real datasets.
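The decomposition described above (contaminated data as clean data plus sparse outliers, with a Lasso-type term) can be sketched in NumPy roughly as follows. This is a minimal illustration, not the authors' implementation: the exact objective weights, the smoothed fusion penalty, the unweighted pairwise term, and the gradient-descent inner solver are all assumptions made for the sketch.

```python
import numpy as np

def soft_threshold(A, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def rcc_sketch(X, gamma=0.2, lam=0.5, eps=1e-2,
               n_outer=20, n_inner=50, lr=0.1):
    """Hypothetical block coordinate descent for an RCC-style objective:

        0.5 * ||X - U - O||_F^2
        + gamma * sum_{i<j} ||U_i - U_j||_2   (fusion penalty, eps-smoothed)
        + lam * ||O||_1                       (sparse-outlier penalty)

    U-block: gradient descent on the eps-smoothed fusion penalty.
    O-block: exact minimizer via soft-thresholding of the residual X - U.
    """
    U = X.astype(float).copy()   # cluster representatives, one row per sample
    O = np.zeros_like(U)         # sparse outlier estimate
    for _ in range(n_outer):
        # U-update: descend 0.5*||(X - O) - U||^2 + smoothed fusion penalty
        for _ in range(n_inner):
            diffs = U[:, None, :] - U[None, :, :]        # (n, n, d) pair diffs
            norms = np.sqrt((diffs ** 2).sum(-1) + eps)  # smoothed pair norms
            grad = (U - (X - O)) + gamma * (diffs / norms[..., None]).sum(1)
            U -= lr * grad
        # O-update: closed form, prox of lam * ||.||_1 applied to the residual
        O = soft_threshold(X - U, lam)
    return U, O
```

On a small example with one gross outlier, the residual at the outlying sample exceeds the threshold `lam`, so the outlier is absorbed into `O` while the rows of `O` for clean samples stay (near-)zero; convexity of each block is what makes this alternating scheme well behaved.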