Abstract

According to Science Daily, 90 percent of all the data in the world has been generated in the last two years, yet less than 1 percent of that data has been analyzed so far. With advances in high-performance computing, deep learning methods are now readily applied to large-scale, high-dimensional datasets. These methods train and run inference efficiently and produce more accurate predictions. Clustering is an unsupervised machine learning method that groups similar data points into the same cluster, and it plays a fundamental role in data mining and machine learning as a way of organizing data into structures. To process such large amounts of high-dimensional data, deep learning has become a key technique for learning feature representations of data in a latent space for many real-world applications. In this paper, we propose deep clustering with robust autoencoder (DCRA), which combines a robust autoencoder with deep clustering to learn the feature representation and the cluster assignments simultaneously. We conducted multiple experiments on open public datasets to evaluate our model’s performance. Our results show that DCRA generates high-quality clusters, with clustering accuracy above 90% on high-dimensional datasets. Training and test loss both decrease as the number of epochs increases, which further supports these results.
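The abstract does not define the clustering accuracy it reports. As an illustration only, the sketch below assumes the commonly used best-match definition: cluster IDs are arbitrary, so accuracy is computed under the one-to-one mapping from clusters to ground-truth classes that maximizes agreement. The function name `clustering_accuracy` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Best-match clustering accuracy (assumed definition, not the paper's).

    Tries every one-to-one mapping from cluster IDs to class labels and
    returns the highest fraction of points whose mapped cluster equals
    their true label. Brute force, so only suitable for a few clusters.
    """
    if len(true_labels) != len(cluster_ids) or len(true_labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # counts[(cluster, label)] = number of points with that pair
    counts = Counter(zip(cluster_ids, true_labels))
    clusters = sorted(set(cluster_ids))
    classes = sorted(set(true_labels))
    # If there are more clusters than classes, pad with None so that the
    # surplus clusters can be left unmatched (they contribute 0 matches).
    classes += [None] * max(0, len(clusters) - len(classes))

    best = 0
    for assignment in permutations(classes, len(clusters)):
        matched = sum(counts[(c, lab)] for c, lab in zip(clusters, assignment))
        best = max(best, matched)
    return best / len(true_labels)


# Toy example: cluster IDs are a relabeling of the classes with one mistake.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [2, 2, 2, 0, 0, 1, 1, 1, 1]
acc = clustering_accuracy(y_true, y_pred)
```

The search over permutations grows factorially with the number of clusters. For larger problems the same best-match mapping is usually found with the Hungarian algorithm, for example via `scipy.optimize.linear_sum_assignment` on the negated contingency matrix.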
