Abstract

Semi-supervised (or partial) fuzzy clustering plays an important and distinct role in discovering hidden structure in data when only a small fraction of the patterns is labeled. The objective of this study is to investigate and quantify the effect of various distance functions on the performance of such clustering mechanisms. Using different distances endows the clustering algorithms with a higher level of flexibility. The resulting enhancement is evaluated through a comprehensive assessment of the quality of the clusters, their ensuing discrimination abilities, and the accuracy of the clusters themselves. In addition to the standard Euclidean distance commonly used in fuzzy clustering, three more versatile and adaptive distance measures are considered: its weighted version, a fully adaptive distance, and a kernel-based distance. Starting from Fuzzy C-Means (FCM) in its generic format, we present its semi-supervised enhancements, derive detailed formulas, and analyze their effectiveness. The improvements offered by semi-supervised clustering are empirically evaluated and numerically quantified using several Machine Learning data sets.
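To make the four families of distances named above concrete, the following is a minimal Python sketch written under stated assumptions; it is not taken from the paper. The function names, the weight vector `w`, the positive-definite matrix `A`, and the Gaussian kernel with width `sigma` are illustrative choices. The membership update is the standard unsupervised FCM formula, not the semi-supervised variant the study derives.

```python
import numpy as np

def euclidean_sq(x, v):
    """Squared Euclidean distance between pattern x and prototype v."""
    d = x - v
    return float(d @ d)

def weighted_euclidean_sq(x, v, w):
    """Squared Euclidean distance with per-feature weights w (assumed non-negative)."""
    d = x - v
    return float(np.sum(w * d * d))

def adaptive_sq(x, v, A):
    """Squared distance (x - v)^T A (x - v); A is assumed symmetric positive definite."""
    d = x - v
    return float(d @ A @ d)

def gaussian_kernel_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance for a Gaussian kernel (an assumed choice).

    Since K(x, x) = K(v, v) = 1, the feature-space distance
    K(x, x) - 2 K(x, v) + K(v, v) reduces to 2 * (1 - K(x, v)).
    """
    d = x - v
    k = np.exp(-(d @ d) / (2.0 * sigma ** 2))
    return float(2.0 * (1.0 - k))

def fcm_memberships(d2, m=2.0, eps=1e-12):
    """Standard FCM membership update from squared distances.

    d2 has shape (clusters, patterns); m > 1 is the fuzzification coefficient.
    Each column of the result sums to 1.
    """
    d2 = np.maximum(np.asarray(d2, dtype=float), eps)  # guard against zero distances
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)
```

Any of the four squared distances can be used to fill the `d2` matrix passed to `fcm_memberships`; with `w` set to all ones or `A` set to the identity, the weighted and adaptive forms reduce to the Euclidean case.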
