Abstract

Currently, high-dimensional data is ubiquitous in data science, which necessitates the development of techniques to decompose and interpret such multidimensional (aka tensor) datasets. Finding a low dimensional representation of the data, that is, its inherent structure, is one of the approaches that can serve to understand the dynamics of low dimensional latent features hidden in the data. Moreover, decomposition methods with non-negative constraints are shown to extract more insightful factors. Nonnegative RESCAL is one such technique, particularly well suited to analyze self-relational data, such as dynamic networks found in international trade flows. Particularly, non-negative RESCAL computes a low dimensional tensor representation by finding the latent space containing multiple modalities. Furthermore, estimating the dimensionality of this latent space is crucial for extracting meaningful latent features. Here, to determine the dimensionality of the latent space with non-negative RESCAL, we propose a latent dimension determination method which is based on clustering of the solutions of multiple realizations of non-negative RESCAL decompositions. We demonstrate the performance of our model selection method on synthetic data. We then apply our method to decompose a network of international trade flows data from International Monetary Fund and shows that with a correct latent dimension determination, the resulting features are able to capture relevant empirical facts from economic literature.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call