Abstract

Clustering is one of the most widely used unsupervised learning techniques. However, it is well known that outliers can have a significantly adverse impact on commonly applied clustering methods. On the other hand, clustered outliers can be particularly detrimental to statistical procedures, even robust ones. Therefore, it makes sense to combine concepts from Robust Statistics and Cluster Analysis to deal with both clusters and outliers simultaneously through robust clustering approaches. Among the existing robust clustering techniques, we focus on those that rely on (impartial) trimming. Trimming offers the user an easy interpretation, as standard, well-known clustering methods are applied after a fraction of the potentially most outlying observations is removed. This trimming approach, when combined with appropriate constraints on the clusters' dispersion parameters, has shown good performance and can be implemented efficiently through available algorithms.

This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification; Statistical and Graphical Methods of Data Analysis > Robust Methods
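
The trimming idea described in the abstract can be illustrated with a minimal sketch of trimmed k-means in Python. This is not the authors' implementation: the function name `trimmed_kmeans`, its parameters (`alpha` for the trimming fraction, `n_iter`, `seed`), and the use of plain squared Euclidean distance are assumptions made for illustration, and the sketch omits the constraints on the clusters' dispersion parameters that the abstract mentions. Practical implementations also typically use several random starts.

```python
import numpy as np


def trimmed_kmeans(X, k, alpha, n_iter=50, seed=0):
    """Illustrative trimmed k-means (hypothetical helper, not a library API).

    In each step, the floor(alpha * n) observations farthest from their
    closest center are discarded before the centers are updated.
    Returns (centers, labels), where trimmed observations get label -1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_keep = n - int(np.floor(alpha * n))  # observations retained per step

    def assign(centers):
        # Squared Euclidean distance from every point to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the n_keep points closest to their nearest center
        keep = np.argsort(d2.min(axis=1))[:n_keep]
        return labels, keep

    # Start from k distinct observations chosen at random
    centers = X[rng.choice(n, size=k, replace=False)]
    for _ in range(n_iter):
        labels, keep = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            members = keep[labels[keep] == j]
            if members.size > 0:  # leave an empty cluster's center unchanged
                new_centers[j] = X[members].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Final assignment with the last centers; mark trimmed points with -1
    labels, keep = assign(centers)
    out = np.full(n, -1)
    out[keep] = labels[keep]
    return centers, out
```

Calling `trimmed_kmeans(X, k=2, alpha=0.05)` on an array `X` with 105 rows would discard 5 observations (label `-1`) and assign the rest to one of two clusters. The trimmed set is re-estimated at every iteration rather than fixed in advance, so which observations count as outlying depends on the current cluster centers.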