FAUM: Fast Autonomous Unsupervised Multidimensional classification

Hugo Javier Curti, Rubén Sergio Wainschenker

doi:10.1016/j.ins.2018.06.008

Abstract

This article presents Faum (Fast Autonomous Unsupervised Multidimensional), an automatic clustering algorithm that discovers natural groupings in unlabeled data. Faum is designed to make efficient use of the resources of a modern computer when processing large datasets. The algorithm finds disjoint, spherically symmetric clusters deterministically, without requiring the number of clusters, iterations, or initializations to be specified. Faum is remarkably fast compared to K-Means when processing large multidimensional datasets. Because Faum has average O(N) space and time complexity, it can process datasets of several hundred megabytes in less than a minute on a standard laptop computer. Furthermore, Faum is not sensitive to outliers and may be used on its own or to provide the complete initialization for deterministic K-Means processing.

Full Text