Abstract

We determine the number of statistically significant factors in a high-dimensional predictive model of cryptocurrencies using a random matrix test. The applied predictive model is of the reduced rank regression (RRR) type; in particular, we choose a flavor that can be regarded as canonical correlation analysis (CCA). Variable selection among hourly cryptocurrency series is performed using the Symbolic estimation of Transfer Entropy (STE) measure from information theory. In simulation studies, STE performs better than the Granger causality approach for a nonlinear system and for a linear system with many drivers. In the application to cryptocurrencies, the directed graph associated with the variable selection shows a robust pattern of predictor and response clusters, where the community detection was contrasted with the modularity approach. The centralities of the network also discriminate between the two main types of cryptocurrencies, i.e., coins and tokens. Regarding the factor determination of the predictive model, the result supports retaining more factors than the usual visual inspection suggests, with the additional advantage that the subjective element is avoided. In particular, the dynamic behavior of the number of factors is moderately anticorrelated with the dynamics of the constructed composite index of predictor and response cryptocurrencies. This finding opens up new insights for anticipating possible declines in cryptocurrency prices on exchanges. Furthermore, our study suggests the existence of specific-predictor and specific-response factors, where only a small number of currencies are predominant.

Highlights

  • It is of fundamental interest to determine the proper number of components in a multivariate model because this allows for the attribution of explanatory meaning to each factor based on economic theory

  • To confirm that this variable selection makes sense, we have found the structure of the associated clusters through the modularity approach

  • Random matrices seem to be a promising tool for performing factor determination in financial and economic problems

Summary

Introduction

It is of fundamental interest to determine the proper number of components in a multivariate model because this allows for the attribution of explanatory meaning to each factor based on economic theory. The relevant assumption here is that the data must follow a normal distribution, which is usually untrue for financial time series in the high-frequency domain [14]. Considering this problem, Burda et al. [15, 16] have derived heavy-tail limit distributions of eigenvalues based on the framework of random matrices. In [39], the multivariate version of symbolic transfer entropy has been tested, and it has been shown to be applicable to time series that are nonstationary in mean and variance and to be unaffected even by outliers and vector autoregressive filtering. Another advantage of the symbolic approach is that, under some circumstances, there exists a null hypothesis regarding the distribution that can be used to measure the absence of a direct flow of information. This model characterizes the heavy-tail behavior of financial time series.
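To make the symbolic approach mentioned above more concrete, the following is a minimal sketch of a bivariate symbolic transfer entropy estimate. It is not the multivariate estimator tested in [39] nor the authors' implementation: the function names `ordinal_symbols` and `symbolic_transfer_entropy`, the embedding dimension `m=3`, the one-step history, the plug-in (frequency count) estimator, and the toy coupled system are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def ordinal_symbols(series, m=3):
    """Map each length-m window to the permutation of indices that sorts it."""
    return [
        tuple(sorted(range(m), key=lambda k: series[i + k]))
        for i in range(len(series) - m + 1)
    ]


def symbolic_transfer_entropy(source, target, m=3):
    """Plug-in estimate (bits) of transfer entropy source -> target on ordinal symbols.

    Assumption: one-step symbol history for both series.
    """
    s = ordinal_symbols(source, m)
    t = ordinal_symbols(target, m)
    n = min(len(s), len(t)) - 1

    # Empirical counts needed for p(t_next | t_now, s_now) and p(t_next | t_now).
    triples = Counter((t[i + 1], t[i], s[i]) for i in range(n))
    pairs_ts = Counter((t[i], s[i]) for i in range(n))
    pairs_tt = Counter((t[i + 1], t[i]) for i in range(n))
    singles = Counter(t[i] for i in range(n))

    te = 0.0
    for (t_next, t_now, s_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_ts[(t_now, s_now)]
        p_given_self = pairs_tt[(t_next, t_now)] / singles[t_now]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


# Hypothetical toy system: y is white noise and drives x with a one-step lag.
rng = random.Random(0)
n_obs = 5000
y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
x = [0.0] + [0.8 * y[i - 1] + 0.2 * rng.gauss(0.0, 1.0) for i in range(1, n_obs)]

te_y_to_x = symbolic_transfer_entropy(y, x)
te_x_to_y = symbolic_transfer_entropy(x, y)
print(f"STE y -> x: {te_y_to_x:.3f} bits, STE x -> y: {te_x_to_y:.3f} bits")
```

In this toy setting the estimate in the driving direction (y to x) should clearly exceed the one in the reverse direction. Because the estimate depends only on the rank order within each window, it is insensitive to monotone rescaling and to isolated outliers, which is consistent with the robustness properties described above. A significance test against a null distribution is not included in this sketch.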

A ten-variable system with overlapping linear drivers separated in two blocks
