Recently, we described a fast self-organizing algorithm for embedding a set of objects into a low-dimensional Euclidean space in a way that preserves the intrinsic dimensionality and metric structure of the data [Proc. Natl. Acad. Sci. U.S.A. 99 (2002) 15869–15872]. The method, called stochastic proximity embedding (SPE), attempts to preserve the geodesic distances between the embedded objects and scales linearly with the size of the data set. SPE starts with an initial configuration and iteratively refines it by repeatedly selecting pairs of objects at random and adjusting their coordinates so that their distances on the map more closely match their respective proximities. Here, we describe an alternative update rule that drastically reduces the number of calls to the random number generator and thus improves the efficiency of the algorithm.
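
The pairwise refinement summarized above can be illustrated with a minimal Python sketch. It covers only the original scheme of picking random pairs and nudging them toward their target proximity; it does not reproduce the alternative update rule, whose details are not given in this abstract. The function name `spe_embed`, its parameters, the linearly decreasing learning rate, and the neighborhood `cutoff` test are all assumptions made for illustration, and input proximities are taken here to be Euclidean distances between the input points.

```python
import math
import random


def spe_embed(points, dim=2, cycles=100, steps_per_cycle=2000,
              cutoff=None, lam_start=1.0, lam_end=0.01, eps=1e-10, seed=0):
    """Illustrative pairwise SPE sketch (hypothetical names and defaults)."""
    rng = random.Random(seed)
    n = len(points)
    # Initial configuration: random coordinates in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n)]
    lam = lam_start
    # Assumed schedule: the learning rate shrinks linearly over the cycles.
    decrement = (lam_start - lam_end) / cycles
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            # Select a pair of distinct objects at random.
            i, j = rng.sample(range(n), 2)
            r = math.dist(points[i], points[j])  # input proximity
            d = math.dist(coords[i], coords[j])  # current distance on the map
            # Assumed neighborhood rule: pairs beyond the cutoff are left
            # alone unless they sit closer on the map than their proximity.
            if cutoff is not None and r > cutoff and d >= r:
                continue
            # Move both objects along their connecting line so that the
            # map distance shifts toward the proximity r.
            scale = lam * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (coords[i][k] - coords[j][k])
                coords[i][k] += delta
                coords[j][k] -= delta
        lam -= decrement
    return coords
```

For example, `spe_embed([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)], dim=2)` returns four two-dimensional coordinate lists. Each refinement step in this sketch makes one call to the random number generator's sampling routine, which is the per-step cost the alternative rule is said to reduce.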