Reliability of maximum spanning tree identification in correlation-based market networks

V.A. Kalyagin, A.P. Koldanov, P.A. Koldanov

doi:10.1016/j.physa.2022.127482

Abstract

The maximum spanning tree is a popular tool in market network analysis. A large number of publications are devoted to calculating the maximum spanning tree and interpreting it for particular stock markets. Usually, one uses market data to calculate Pearson correlations between stock returns and constructs a complete weighted graph in which the edge weights are the calculated correlations. The maximum spanning tree of the resulting network is then calculated and its market interpretation is discussed. However, Pearson correlation is not the only similarity measure that can be used for market network analysis. Different similarity measures generate different market networks and, as a consequence, different maximum spanning trees. The main goal of the present paper is to analyze the key points of this difference. We show that it is related to the uncertainty (reliability) of maximum spanning tree identification in different networks. We study uncertainty within the framework of the random variable network (RVN) concept. We consider different correlation-based networks in the large class of elliptical distributions. We show that the true maximum spanning tree is the same in three correlation networks: the Pearson correlation network, the Fechner correlation network, and the Kendall correlation network. This means that, from a theoretical point of view, there is no difference between the maximum spanning trees in these networks. The observed difference between maximum spanning trees in different networks can therefore be explained by the uncertainty of identifying the maximum spanning tree from observations. We argue that, among different measures of uncertainty, the FDR (False Discovery Rate) is the most appropriate for measuring the uncertainty (reliability) of maximum spanning tree identification. We investigate the FDR of the Kruskal algorithm for maximum spanning tree identification and show that the reliability of maximum spanning tree identification differs among these three networks. In particular, for the Pearson correlation network the FDR depends essentially on the distribution of stock returns. We prove that for the market network with Fechner correlation the FDR is not sensitive to the assumption on the distribution of stock returns. Some interesting phenomena are discovered for the Kendall correlation network. Our experiments show that the FDR of the Kruskal algorithm for maximum spanning tree identification in the Kendall correlation network depends only weakly on the distribution and, at the same time, the value of the FDR is almost the best in comparison with maximum spanning tree identification in the other networks.
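The pipeline described in the abstract (sample correlations, complete weighted graph, Kruskal maximum spanning tree, comparison with a reference tree) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: all function names are hypothetical, the Fechner correlation is computed here from signs of deviations from the sample means (an assumption about the centering), and the simulated returns come from a simple one-factor Gaussian model chosen purely for the demonstration. The last helper returns the proportion of falsely included edges for one sample; the FDR is the expectation of that proportion over repeated samples.

```python
import random
from itertools import combinations


def sign(v):
    return (v > 0) - (v < 0)


def pearson(x, y):
    # Sample Pearson correlation.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fechner(x, y):
    # Sign-coincidence correlation of deviations from the centers.
    # Assumption: sample means are used as the centers.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum(sign(a - mx) * sign(b - my) for a, b in zip(x, y)) / n


def kendall(x, y):
    # Sample Kendall correlation: average sign agreement over all pairs of observations.
    pairs = list(combinations(range(len(x)), 2))
    return sum(sign(x[s] - x[t]) * sign(y[s] - y[t]) for s, t in pairs) / len(pairs)


def correlation_network(returns, measure):
    # Complete weighted graph: one weight per unordered pair of stocks (i < j).
    n = len(returns)
    return {(i, j): measure(returns[i], returns[j])
            for i, j in combinations(range(n), 2)}


def kruskal_max_spanning_tree(n, weight):
    # Kruskal: scan edges from heaviest to lightest and keep an edge
    # whenever it joins two different components (union-find).
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = set()
    for i, j in sorted(weight, key=weight.get, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.add((i, j))
    return tree


def false_discovery_proportion(found_tree, true_tree):
    # Share of edges in the identified tree that are absent from the reference tree.
    return len(found_tree - true_tree) / len(found_tree)


if __name__ == "__main__":
    # Hypothetical demo data: 5 stocks, 200 observations, one common factor.
    rng = random.Random(0)
    betas = [0.9, 0.8, 0.6, 0.4, 0.2]
    factor = [rng.gauss(0, 1) for _ in range(200)]
    returns = [[b * f + rng.gauss(0, 1) for f in factor] for b in betas]
    for name, measure in [("Pearson", pearson), ("Fechner", fechner), ("Kendall", kendall)]:
        tree = kruskal_max_spanning_tree(len(returns), correlation_network(returns, measure))
        print(name, sorted(tree))
```

Averaging `false_discovery_proportion` over many simulated samples, with the reference tree computed from the true correlations of the generating model, would give a Monte Carlo estimate of the FDR for each of the three networks.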

Full Text