Soft clustering

Maria Brigida Ferraro, Paolo Giordani

doi:10.1002/wics.1480

Abstract

Clustering is one of the most widely used tools in data analysis. In recent decades, owing to the increasing complexity of data, soft clustering has received a great deal of attention. Several approaches can be considered soft. The best known is the fuzzy approach, which assigns objects to clusters with membership degrees that range in the unit interval and depend on the dissimilarities between each object and all the prototypes. Closely related is the possibilistic approach, which, unlike the fuzzy one, relaxes some constraints on the membership degrees. In particular, objects are assigned to clusters with degrees of typicality that depend only on the dissimilarities between each object and the closest prototype. A further soft approach is the rough one. In this case there are no degrees ranging between 0 and 1; instead, objects with intermediate features belong to the boundary region and are assigned to more than one cluster. Although it is not universally recognized in the scientific community as a soft clustering approach, in our view the model‐based approach can also be considered as such. Model‐based clustering methods also produce a soft partition of the objects, and the posterior probability of component membership may play a role similar to that of the membership degree. The four approaches are critically described from a theoretical point of view, and an empirical comparative analysis is carried out.

This article is categorized under:
Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification
Statistical and Graphical Methods of Data Analysis > Multivariate Analysis
Statistical Learning and Exploratory Methods of the Data Sciences > Exploratory Data Analysis
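
The contrast the abstract draws between fuzzy membership degrees and possibilistic typicalities can be illustrated with a small sketch. The abstract gives no formulas, so the code below assumes the standard fuzzy c-means membership update and the standard possibilistic c-means typicality function. The function names, the fuzzifier `m`, and the scale parameters `etas` are illustrative assumptions, and the dissimilarities are taken to be strictly positive squared distances.

```python
def fuzzy_memberships(sq_dists, m=2.0):
    """Membership degrees of one object (fuzzy c-means style, assumed form).

    sq_dists: squared dissimilarities to every prototype, all > 0.
    m: fuzzifier, m > 1 (assumed parameter).
    Each degree depends on the dissimilarities to ALL prototypes,
    and the degrees sum to 1 across clusters.
    """
    power = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** power for d in sq_dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def possibilistic_typicalities(sq_dists, etas, m=2.0):
    """Typicality degrees of one object (possibilistic c-means style, assumed form).

    etas: one positive scale parameter per cluster (assumed parameter).
    Each degree uses only the dissimilarity to a single prototype,
    so the degrees are not forced to sum to 1.
    """
    power = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d / eta) ** power) for d, eta in zip(sq_dists, etas)]


# One object at squared distance 1 from the first prototype and 4 from the second.
sq_dists = [1.0, 4.0]
u = fuzzy_memberships(sq_dists)
t = possibilistic_typicalities(sq_dists, etas=[1.0, 1.0])
print("fuzzy memberships:", u, "sum =", sum(u))
print("possibilistic typicalities:", t, "sum =", sum(t))
```

In this toy case the fuzzy degrees are 0.8 and 0.2 and sum to 1, while the typicalities are 0.5 and 0.2 and do not. That difference is the relaxed constraint the abstract refers to.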
