Abstract
In this work, we applied a stochastic simulation methodology to quantify the power to detect outlying mixture components of a stochastic model when applying a reduced-dimension clustering technique such as Self-Organizing Maps (SOMs). The essential feature of SOMs, besides dimensional reduction into a discrete map, is the conservation of topology. In SOMs, two forms of learning are applied: competitive, by sequential allocation of sample observations to a winning node in the map, and cooperative, by the update of the weights of the winning node and its neighbors. Cooperative learning is what achieves the conservation of topology from the original data space to the reduced (typically 2D) map. Here, we compared the performance of one- and two-layer SOMs in the outlier representation task. To estimate the outlying mixture component detection power, the same stratified sampling was applied to both the one-layer and two-layer SOMs, although stratification is only relevant for the two-layer setting. Two distance measures between points in the map were defined to quantify the conservation of topology. The results of the experiment showed that the two-layer setting was more efficient in outlier detection while maintaining the basic properties of the SOM, which included adequately representing distances from the outlier component to the remaining ones.
Highlights
The purpose of this paper was to apply stochastic simulation for a better understanding of the possibilities of outlier component detection in a Gaussian mixture using one- and two-layer Self-Organizing Maps (SOMs).
In many real problems, it is of interest to keep a representation of the outlier in the map while respecting the essence of the standard SOM. We studied how well the SOM does so in the situations simulated in our experiment, and checked whether our expectation was correct: that two-layer SOMs would represent and detect such outliers better than one-layer ones, while adequately representing the distance between the outlier component and the remaining components.
The SOM is a neural network that allows us to project a high-dimensional vector space onto a low-dimensional topology made up of a set of nodes, or neurons, arranged as a grid.
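The two forms of learning named in the abstract can be illustrated with a minimal sketch of a one-layer SOM. This is not the authors' implementation: the function names (`train_som`, `winning_node`), the Gaussian neighborhood, the linearly decaying learning rate and radius, and the initialization from random observations are all assumptions chosen for illustration.

```python
import math
import random


def winning_node(weights, x):
    """Competitive step: return the (row, col) of the node closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns weights keyed by (row, col).

    Assumed choices (not from the paper): Gaussian neighborhood,
    linear decay of learning rate and radius, random-sample initialization.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Initialise each node's weight vector from a randomly chosen observation.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            # Competitive learning: allocate the observation to its winner.
            bmu = winning_node(weights, x)
            # Cooperative learning: pull the winner and its map neighbors
            # toward x, with strength decaying in map (grid) distance.
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

Because the neighborhood weight depends on distance in the map rather than in the data space, nodes that are adjacent in the grid are pulled toward similar observations, which is the mechanism behind topology conservation.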
Summary
The purpose of this paper was to apply stochastic simulation for a better understanding of the possibilities of outlier component detection in a Gaussian mixture using one- and two-layer Self-Organizing Maps (SOMs). Since the stratum including the outlier will have several nodes associated with it in its first-layer map, one of these nodes may adequately represent the outlying component. If so, it may receive a node of its own in the second (final) map. Note that when comparing two SOMs on the preservation of topology, the distance in the original space is the same for both, so one just has to compare the distances in the SOM. Should this distance depend only on the integer-valued map coordinates, or on the corresponding winning node weights (centroids)?
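The two candidate notions of distance in the map raised by that question can be contrasted with a short sketch. The Euclidean form of both measures, the function names, and the toy weights are assumptions made for illustration; the paper's exact definitions are not given in this summary.

```python
import math


def grid_distance(node_a, node_b):
    """Distance between two nodes using only their integer (row, col) coordinates."""
    return math.dist(node_a, node_b)


def centroid_distance(weights, node_a, node_b):
    """Distance between the weight vectors (centroids) of two nodes."""
    return math.dist(weights[node_a], weights[node_b])


# Hypothetical 1 x 2 map: the nodes are neighbors in the grid,
# but their centroids are far apart in the data space.
toy_weights = {(0, 0): [0.0, 0.0], (0, 1): [6.0, 8.0]}

print(grid_distance((0, 0), (0, 1)))                   # 1.0
print(centroid_distance(toy_weights, (0, 0), (0, 1)))  # 10.0
```

In this toy case the coordinate-based measure reports the two nodes as adjacent, while the centroid-based measure keeps the separation in the data space visible, which is what matters when one of the nodes represents an outlying component.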