Abstract

Fuzzy c-means (FCM) is a classic clustering method that has been widely used in fields such as image segmentation and data mining. Euclidean distance is a frequently used distance metric in FCM, but it is only suitable for data with spherical structure. As an extension of Euclidean distance, Mahalanobis distance has been used in Gustafson-Kessel (GK) FCM and its variants to handle ellipsoidal data. For ease of optimization, most existing Mahalanobis distance based FCM algorithms use only the squared Mahalanobis distance. However, the squared Mahalanobis distance may not be the best distance metric for FCM, because squaring tends to amplify the influence of outliers. In this paper, we propose a novel $\ell_{2,p}$-norm and Mahalanobis distance based FCM model, abbreviated as LM-FCM, which improves the ability of FCM to handle ellipsoidal clusters and outliers. Then, to reduce computational complexity, we propose a simplified model, abbreviated as SLM-FCM. Furthermore, we develop an iteratively re-weighted optimization algorithm for the proposed models and provide a rigorous proof of monotonic convergence. Finally, we conduct extensive experiments on both synthetic and real-world data sets, comparing against existing state-of-the-art FCM algorithms, to demonstrate the superior clustering performance and robustness of the proposed algorithms.
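
The claim that squaring the Mahalanobis distance amplifies outliers can be checked with a small numerical sketch. The code below is not the LM-FCM or SLM-FCM objective, which the abstract does not state; it only compares how much of a sum of distances raised to a power p comes from a single far-away point. The function names, the toy covariance, the sample points, and the choice p = 1 are all assumptions made for illustration.

```python
import math


def mahalanobis_2d(x, center, cov):
    """Mahalanobis distance in 2-D for a symmetric positive-definite 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - center[0], x[1] - center[1]
    # Quadratic form (x - v)^T S^{-1} (x - v).
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)


def outlier_share(distances, p):
    """Fraction of sum(d ** p) contributed by the single largest distance."""
    powered = [dist ** p for dist in distances]
    return max(powered) / sum(powered)


# Hypothetical ellipsoidal cluster: variance 4 along x, variance 1 along y.
center = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))

# Three points on the unit Mahalanobis ellipse plus one outlier far along x.
points = [(2.0, 0.0), (0.0, 1.0), (-2.0, 0.0), (20.0, 0.0)]
distances = [mahalanobis_2d(pt, center, cov) for pt in points]

share_squared = outlier_share(distances, 2.0)  # squared Mahalanobis distance
share_p1 = outlier_share(distances, 1.0)       # illustrative exponent p = 1

print(f"outlier share with squared distance: {share_squared:.3f}")
print(f"outlier share with p = 1:            {share_p1:.3f}")
```

In this toy setup the three inlier points all sit at Mahalanobis distance 1 even though their Euclidean distances differ, which is the sense in which Mahalanobis distance suits ellipsoidal clusters. The outlier accounts for a larger fraction of the total under the squared distance than under the smaller exponent.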
