Abstract

In some applications and in order to address real-world situations better, data may be more complex than simple numerical vectors. In some examples, data can be known only through their pairwise dissimilarities or through multiple dissimilarities, each of them describing a particular feature of the data set. Several variants of the Self-Organizing Map (SOM) algorithm were introduced to generalize the original algorithm to the framework of dissimilarity data. Whereas median SOM is based on a rough representation of the prototypes, relational SOM allows representing these prototypes by a virtual linear combination of all elements in the data set, referring to a pseudo-Euclidean framework. In the present article, an on-line version of relational SOM is introduced and studied. Similar to the situation in the Euclidean framework, this on-line algorithm provides a better organization and is much less sensible to prototype initialization than standard (batch) relational SOM. In a more general case, this stochastic version allows us to integrate an additional stochastic gradient descent step in the algorithm which can tune the respective weights of several dissimilarities in an optimal way: the resulting multiple relational SOM thus has the ability to integrate several sources of data of different types, or to make a consensus between several dissimilarities describing the same data. The algorithms introduced in this paper are tested on several data sets, including categorical data and graphs. On-line relational SOM is currently available in the R package SOMbrero that can be downloaded at http://sombrero.r-forge.r-project.org/ or directly tested on its Web User Interface at http://shiny.nathalievilla.org/sombrero.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call