Abstract

Data visualization is achieved by minimizing the distortion that arises when relationships between data points are observed. Typically, this is done by estimating latent data points designed to accurately reflect the pairwise relationships between the observed data points. The distortion masks the true pairwise relationships, which the latent data represent. Distortion can be modeled as masking the pairwise distances between data points (pairwise distance distortion) or, alternatively, as masking dissimilarity measures between data points (pairwise dissimilarity distortion). Multidimensional scaling methods are usually used to model pairwise distance distortion. This class of methods includes principal components analysis, which minimizes the global distortion between observed and latent data. We model pairwise dissimilarity distortion using mixtures of pairwise-difference factor-analysis models. To identify appropriate initial values and to determine the dimensionality of the latent data space, we employ an algorithm that we call stepwise forward selection. Our approach differs from probabilistic principal components (e.g., Bishop and Tipping, 1983), in which noise masks the relationship between each individual data point and its latent counterpart. In our approach, by contrast, noise masks the pairwise dissimilarities between data points and the analogous latent quantities; this difference gives extra flexibility in interpreting and modeling high-dimensional data. Our approach is similar in spirit to that of relational Markov models (e.g., Koller, 1999). We show that the pairwise factor-analysis models frequently fit the data better because they model pairwise dissimilarities between data points directly.
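
The notion of pairwise distance distortion can be illustrated with a small sketch. The code below is not the mixture model proposed in this work; it assumes classical (Torgerson) multidimensional scaling with NumPy, noise-free Euclidean distances, and hypothetical names (`classical_mds`, `pairwise_distances`, `X`, `Z`). It estimates latent points from a distance matrix and measures how far the latent pairwise distances deviate from the observed ones.

```python
import numpy as np

def pairwise_distances(X):
    """Return the (n, n) matrix of Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, k):
    """Estimate n latent points in k dimensions from an (n, n) distance matrix D.

    Classical MDS is used here purely for illustration (an assumption,
    not the method described in the abstract).
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:k]         # indices of the k largest
    lam = np.clip(evals[idx], 0.0, None)      # guard against tiny negatives
    return evecs[:, idx] * np.sqrt(lam)

# Hypothetical data: 20 points that truly live in 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
D = pairwise_distances(X)                     # observed pairwise distances

Z = classical_mds(D, k=2)                     # estimated latent points
distortion = np.abs(pairwise_distances(Z) - D).max()
```

Because the distances here are exact and the latent dimensionality matches the true one, the latent points reproduce the observed distances up to rotation and numerical error. With noisy or non-Euclidean dissimilarities the distortion would be nonzero, which is the setting the abstract addresses.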
