Abstract

A Bayesian deep restricted Boltzmann-Kohonen architecture for data clustering, termed deep restricted Boltzmann machine (DRBM)-ClustNet, is proposed. The core clustering engine consists of a DRBM that processes unlabeled data by creating new features that are mutually uncorrelated and have large variance. Next, the number of clusters is predicted using the Bayesian information criterion (BIC), followed by a Kohonen network (KN)-based clustering layer. The unlabeled data are processed in three stages for efficient clustering of nonlinearly separable datasets. In the first stage, the DRBM performs nonlinear feature extraction, capturing the highly complex data representation by projecting the d-dimensional feature vectors into n dimensions. Most clustering algorithms require the number of clusters to be decided a priori; the second stage therefore uses BIC to determine the number of clusters automatically. In the third stage, the number of clusters derived from BIC forms the input for the KN, which clusters the feature-extracted data obtained from the DRBM. This method overcomes the general disadvantages of clustering algorithms, such as the prior specification of the number of clusters, convergence to local optima, and poor clustering accuracy on nonlinear datasets. In this research, we use two synthetic datasets, 15 benchmark datasets from the UCI Machine Learning Repository, and four image datasets to analyze DRBM-ClustNet. The proposed framework is evaluated on clustering accuracy and ranked against other state-of-the-art clustering methods. The obtained results demonstrate that DRBM-ClustNet outperforms state-of-the-art clustering algorithms.
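
The second and third stages described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the DRBM stage is omitted (the input is treated as already-extracted features), BIC is evaluated on a hard-assignment spherical Gaussian model with one shared variance, the Kohonen layer is reduced to winner-take-all competitive learning with no neighborhood function, and prototypes are initialised by a farthest-point rule. All function names (`train_kohonen`, `bic`, `cluster`) and the synthetic three-blob data are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(protos, x):
    """Index of the winning (closest) prototype for sample x."""
    return min(range(len(protos)), key=lambda j: sq_dist(protos[j], x))

def train_kohonen(data, k, epochs=20, lr0=0.5, seed=0):
    """Winner-take-all Kohonen layer with k units; returns (prototypes, labels)."""
    rng = random.Random(seed)
    # Farthest-point initialisation: an assumption of this sketch, used to
    # spread the initial prototypes across the data.
    protos = [list(data[0])]
    while len(protos) < k:
        far = max(data, key=lambda x: min(sq_dist(p, x) for p in protos))
        protos.append(list(far))
    order = list(range(len(data)))
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            w = protos[nearest(protos, data[i])]
            for d in range(len(w)):
                w[d] += lr * (data[i][d] - w[d])  # move only the winner
    labels = [nearest(protos, x) for x in data]
    return protos, labels

def bic(data, labels, k):
    """BIC of a hard-assignment spherical Gaussian model with shared variance."""
    n, dim = len(data), len(data[0])
    members = [[x for x, lab in zip(data, labels) if lab == j] for j in range(k)]
    rss, log_mix = 0.0, 0.0
    for pts in members:
        if not pts:
            continue  # an empty unit contributes nothing to the likelihood
        centre = [sum(col) / len(pts) for col in zip(*pts)]
        rss += sum(sq_dist(x, centre) for x in pts)
        log_mix += len(pts) * math.log(len(pts) / n)
    var = max(rss / (n * dim), 1e-12)  # maximum-likelihood shared variance
    log_lik = log_mix - 0.5 * n * dim * (math.log(2 * math.pi * var) + 1.0)
    n_params = k * dim + (k - 1) + 1  # centres, mixing weights, one variance
    return -2.0 * log_lik + n_params * math.log(n)

def cluster(data, candidates=range(2, 7)):
    """Stage 2: choose k by minimum BIC. Stage 3: cluster with that k."""
    scores = {}
    for k in candidates:
        _, labels = train_kohonen(data, k)
        scores[k] = bic(data, labels, k)
    best_k = min(scores, key=scores.get)
    _, labels = train_kohonen(data, best_k)
    return best_k, labels, scores

# Synthetic stand-in for DRBM output: three well-separated 2-D blobs.
rng = random.Random(0)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
        for cx, cy in centres for _ in range(100)]

best_k, labels, scores = cluster(data)
print("selected k:", best_k)
print({k: round(v, 1) for k, v in scores.items()})
```

Lower BIC is better under this convention, so `cluster` keeps the candidate with the smallest score. In a fuller pipeline, the hypothetical `data` list would be replaced by the n-dimensional features produced by the DRBM.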
