Development of Fuzzy C-Means with Fuzzy Chebyshev for genomic clustering solutions addressing cancer issues

Nurnadiah Zamri, Nor Azmi Abu Bakar, Chong Siew Koon, Elissa Nadia Madi, Sukono Mm.M Si, Azim Zaliha Abd Aziz, Ras Azira Ramli

doi:10.1016/j.procs.2024.05.182

Abstract

Clustering is a key technique for identifying structure in data sets. It partitions data points into clusters based on a suitable similarity measure. Fuzzy C-Means is a clustering technique built on fuzzy set theory. It is among the most widely used fuzzy clustering algorithms because it is simple and easy to implement. However, it has drawbacks, notably high sensitivity to outliers, noisy data, and initialization conditions. This paper therefore proposes a modification of the Fuzzy C-Means algorithm using Fuzzy Chebyshev. The raw datasets are preprocessed with the MinMax Scaler, and dimensionality reduction techniques are applied to address the issues of high-dimensional data. The Elbow method is used to determine the optimal number of clusters in the data set. The final clustering is obtained by replacing the distance measure in the Fuzzy C-Means algorithm with Fuzzy Chebyshev. The proposed method is compared with previous methods, and its effectiveness is illustrated through a numerical example involving genomic data of prostate cancer. The results of the proposed method are consistent with those obtained from previous methods, which supports its feasibility.
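To make the pipeline in the abstract concrete, the following is a minimal sketch of min-max scaling followed by Fuzzy C-Means with the distance step swapped for a Chebyshev distance. It rests on several assumptions that the abstract does not confirm: the plain (crisp) Chebyshev, or L-infinity, distance stands in for the paper's "Fuzzy Chebyshev" measure, the centroid update is the standard Fuzzy C-Means weighted mean used heuristically, and all function names, parameters, and the synthetic data are hypothetical. Dimensionality reduction and the Elbow method are omitted for brevity.

```python
import numpy as np


def min_max_scale(X):
    """Rescale each feature to [0, 1]; constant features map to 0."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (X - lo) / span


def chebyshev_distances(X, centers):
    """Pairwise L-infinity distances, shape (n_samples, n_clusters)."""
    return np.abs(X[:, None, :] - centers[None, :, :]).max(axis=2)


def fcm_chebyshev(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Fuzzy C-Means with a Chebyshev distance (illustrative assumption).

    Returns the cluster centers and the membership matrix U, whose rows
    sum to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Standard FCM centroid update: membership-weighted mean.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance step: Chebyshev replaces the usual Euclidean distance.
        D = np.maximum(chebyshev_distances(X, centers), 1e-12)
        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = D ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Hypothetical usage on synthetic data (not the paper's genomic data).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 3)),
                  rng.normal(5.0, 0.1, (20, 3))])
scaled = min_max_scale(data)
centers, U = fcm_chebyshev(scaled, n_clusters=2)
labels = U.argmax(axis=1)  # hard labels from the fuzzy memberships
```

In this sketch the only change from textbook Fuzzy C-Means is the body of `chebyshev_distances`, so another distance measure could be substituted by changing that function alone.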

Full Text