Learning the Number of Clusters in Self Organizing Map

Guenael Cabanes, Younes Bennani

doi:10.5772/9164

Abstract

The Self-Organizing Map (SOM; Kohonen, 1984, 2001) is a neuro-computational algorithm that maps high-dimensional data onto a two-dimensional space through a competitive, unsupervised learning process. Self-Organizing Maps differ from other artificial neural networks in that they use a neighborhood function to preserve the topological properties of the input space. This unsupervised learning algorithm is a popular nonlinear technique for dimensionality reduction and data visualization. The SOM is often used as a first phase for unsupervised classification (i.e. clustering). Clustering methods automatically detect relevant sub-groups, or clusters, in unlabeled data sets when one has no prior knowledge about the hidden structure of the data. Patterns in the same cluster should be similar to each other, while patterns in different clusters should not (internal homogeneity and external separation). Clustering plays an indispensable role in understanding the various phenomena described by data sets. A clustering problem can be defined as the task of partitioning a set of objects into a collection of mutually disjoint subsets. It is a segmentation problem and is considered one of the most challenging problems in unsupervised learning. Various approaches have been proposed to solve it (Jain & Dubes, 1988). An efficient approach to grouping problems is based on learning a Self-Organizing Map. In the first phase of the process, the standard SOM approach is used to compute a set of reference vectors (prototypes) representing local means of the data. In the second phase, the obtained prototypes are grouped to form the final partitioning using a traditional clustering method (e.g. K-means or hierarchical methods). Such an approach is called a two-level clustering method. In this work, we focus on two-level clustering algorithms.
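The two-level scheme described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the map is a one-dimensional chain rather than a 2-D grid, the second phase uses a basic K-means with K supplied by the caller, and every name and parameter (`train_som`, `kmeans`, `two_level_cluster`, the decay schedules, the seeds) is hypothetical.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vectors, x):
    """Index of the vector closest to x."""
    return min(range(len(vectors)), key=lambda i: sq_dist(vectors[i], x))

def train_som(data, n_units=6, epochs=40, seed=0):
    """Phase 1: train a 1-D SOM (a chain of units) and return its prototypes."""
    rng = random.Random(seed)
    protos = [list(rng.choice(data)) for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = 0.5 * (0.02 ** frac)  # learning rate decays from 0.5 to 0.01
            # Neighborhood radius decays from n_units / 2 to 0.5 (assumed schedule).
            sigma = (n_units / 2) * ((1.0 / n_units) ** frac)
            bmu = nearest(protos, x)  # best matching unit
            for i in range(n_units):
                # Gaussian neighborhood function on the chain index.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                protos[i] = [p + lr * h * (xi - p) for p, xi in zip(protos[i], x)]
            step += 1
    return protos

def kmeans(points, k, iters=20, seed=0):
    """Phase 2: basic K-means (Lloyd's algorithm); returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(centers, p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def two_level_cluster(data, k, n_units=6):
    """Cluster the prototypes, then give each sample the label of its BMU."""
    protos = train_som(data, n_units)
    proto_labels = kmeans(protos, k)
    return [proto_labels[nearest(protos, x)] for x in data]

# Small demo on two synthetic, well-separated blobs.
demo_rng = random.Random(42)
blob_a = [[demo_rng.gauss(0, 0.3), demo_rng.gauss(0, 0.3)] for _ in range(20)]
blob_b = [[demo_rng.gauss(10, 0.3), demo_rng.gauss(10, 0.3)] for _ in range(20)]
demo_labels = two_level_cluster(blob_a + blob_b, k=2)
print(demo_labels)
```

Note that K must be given to the second phase here, which is exactly the limitation the rest of the abstract addresses.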
One of the most crucial questions in many real-world clustering applications is how to determine a suitable number of clusters K, also known as the model selection problem. Without a priori knowledge there is no simple way of knowing that number. The purpose of our work is to provide a simultaneous two-level clustering approach using SOM that learns the structure of the data and its segmentation at the same time, using both distance and density information. This new clustering algorithm assumes that a cluster is a dense region of objects surrounded by a region of low density (Yue et al., 2004; Ultsch, 2005; Ocsa et al., 2007; Pamudurthy et al., 2007). This approach is very effective when the clusters are 2
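The idea that a cluster is a dense region surrounded by a low-density region can be illustrated with a small sketch. It is a simplified stand-in, not the algorithm proposed in the paper: it assumes the prototypes are already trained, estimates the data density at each prototype with a Gaussian kernel, and lets every prototype climb to its densest neighbor until it reaches a local density maximum. The number of distinct maxima then serves as an estimate of K. The function names, the `bandwidth` parameter and the chain-shaped neighborhood are all assumptions made for the example.

```python
import math

def local_density(protos, data, bandwidth=1.0):
    """Gaussian-kernel estimate of the data density at each prototype."""
    densities = []
    for p in protos:
        total = 0.0
        for x in data:
            d2 = sum((a - b) ** 2 for a, b in zip(p, x))
            total += math.exp(-d2 / (2 * bandwidth ** 2))
        densities.append(total / len(data))
    return densities

def density_clusters(densities, neighbors):
    """Label each prototype by the density maximum reached by steepest ascent."""
    def climb(i):
        while True:
            best = max(neighbors[i] + [i], key=lambda j: densities[j])
            if densities[best] <= densities[i]:
                return i  # no strictly denser neighbor: i is a local maximum
            i = best
    modes = [climb(i) for i in range(len(densities))]
    ids = {m: c for c, m in enumerate(sorted(set(modes)))}
    return [ids[m] for m in modes]

# Hypothetical 1-D prototypes on a chain; two dense regions and a sparse gap.
protos = [[0.0], [1.0], [2.0], [5.0], [8.0], [9.0], [10.0]]
data = [[0.8], [0.9], [1.0], [1.1], [1.2], [8.9], [9.0], [9.1]]
n = len(protos)
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

densities = local_density(protos, data)
proto_labels = density_clusters(densities, neighbors)
n_clusters = len(set(proto_labels))
print(n_clusters, proto_labels)
```

In this sketch K is read off the density landscape instead of being supplied in advance, which is the behaviour the abstract attributes to a combined distance and density criterion.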

Publication Date: Apr 1, 2010
Citations: 26	License type: cc-by-nc-sa

Similar Papers

Automatically Determining the Number of Clusters in Unlabeled Data Sets
Liang Wang ... Christopher Leckie
IEEE Transactions on Knowledge and Data Engineering | VOL. 21
01 Mar 2009

Deep Near Unsupervised Learning for Data Analysis in Metabolomics, Drug-Drug Interaction Discovery and Human Gait Recognition
Saman K Halgamuge
01 Jan 2015

Enhanced Dark Block Extraction Method Performed Automatically to Determine the Number of Clusters in Unlabeled Data Sets
Puniethaa Prabhu ... K Duraiswamy
International Journal of Computers Communications & Control | VOL. 8
18 Feb 2013

Simulation Algorithm Benefits by Connecting Geostatistics With Unsupervised Learning
Adam Wilson
Journal of Petroleum Technology | VOL. 70
01 Oct 2018
