Abstract
Accurately clustering large, high-dimensional datasets is a challenging problem in unsupervised learning. K-means is a fast, widely used, centroid-based data partitioning algorithm that is accurate on spherical datasets. However, its non-determinism, its heavy dependence on the selection of initial cluster centers, and its vulnerability to noise make it a poor candidate for clustering large datasets with high dimensionality. To overcome these limitations, we develop a novel, nature-inspired, centroid-based clustering algorithm based on the principles of particle physics. Our method avoids both convergence to local optima and non-deterministic outputs. It also addresses the problems of outliers and of data that are not well separated. We evaluate the method on large datasets of human face images, using a deep learning model to extract facial features into 128-dimensional vectors. We validate the quality and accuracy of our method using statistical measures such as f-measure, accuracy, error rate, average in-group proportion, and normalized cluster size Rand index. These evaluations show that our method achieves better accuracy and quality in clustering large face image datasets than existing methods. The strength of our algorithm becomes more visible as the size of the dataset grows.
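The dependence of k-means on its initial cluster centers, noted above, can be illustrated with a minimal sketch. This is standard Lloyd-style k-means in plain Python, not the particle-physics-inspired method proposed here. The four-point toy dataset, the two sets of starting centers, and all function names are assumptions chosen only for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def lloyd_kmeans(points, initial_centers, max_iter=100):
    """Standard k-means (Lloyd's algorithm) from explicitly given centers."""
    centers = [tuple(float(x) for x in c) for c in initial_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep an empty cluster's center where it is.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


# Hypothetical toy data: the corners of a 4-by-1 rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Starting centers near the left and right sides find the natural split.
labels_good, centers_good, sse_good = lloyd_kmeans(points, [(0, 0), (4, 0)])

# Starting centers on the bottom and top edges get stuck in a worse partition.
labels_poor, centers_poor, sse_poor = lloyd_kmeans(points, [(2, 0), (2, 1)])
```

Both runs converge, but to different partitions: the first groups the points into left and right pairs, while the second groups them into bottom and top pairs with a much larger within-cluster sum of squared errors. With random initialization, which of these outcomes occurs can vary from run to run.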