Data visualization has been studied extensively in the literature through a variety of parametric probability distributions for exploring continuous, binary, and count data. Existing methods for non-symmetric data matrices are reviewed here in a unified framework based on the Bernoulli distribution and binary variables. The framework extends to continuous or count variables by substituting another univariate distribution, such as the Gaussian or the Poisson. Several approaches are possible, depending on whether the model places a distribution on the rows, the columns, the row clusters, the column clusters, the cells, the blocks, or a transformed matrix of distances between pairs of rows or columns. The objective functions are given in full in separate sections, one per method: Kohonen's map and related self-organizing map methods, generative topographic mapping as a probabilistic self-organizing map, linear principal component analysis and related matrix methods (non-negative factorization, factorization), probabilistic parametric embedding, probabilistic latent semantic visualization, the latent cluster position model, and t-distributed stochastic neighbor embedding. The conclusion discusses the contribution and outlines perspectives.