Abstract

Inferring linear dependence between time series is central to our understanding of natural and artificial systems. Unfortunately, the hypothesis tests that are used to determine statistically significant directed or multivariate relationships from time-series data often yield spurious associations (Type I errors) or omit causal relationships (Type II errors). This is due to the autocorrelation present in the analysed time series -- a property that is ubiquitous across diverse applications, from brain dynamics to climate change. Here we show that, for limited data, this issue cannot be remedied by fitting a time-series model alone (e.g., in Granger causality or prewhitening approaches), and instead that the degrees of freedom in statistical tests should be altered to account for the effective sample size induced by cross-correlations in the observations. This insight enabled us to derive modified hypothesis tests for any multivariate correlation-based measures of linear dependence between covariance-stationary time series, including Granger causality and mutual information with Gaussian marginals. We use both numerical simulations (generated by autoregressive models and digital filtering) and recorded fMRI-neuroimaging data to show that our tests are unbiased for a variety of stationary time series. Our experiments demonstrate that the commonly used $F$- and $\chi^2$-tests can induce significant false-positive rates of up to $100\%$ for both measures, with and without prewhitening of the signals. These findings suggest that many dependencies reported in the scientific literature may have been, and may continue to be, spuriously reported or missed if modified hypothesis tests are not used when analysing time series.

Highlights

  • Linear dependence measures such as Pearson correlation, canonical correlation analysis, and Granger causality are used in a broad range of scientific domains to investigate the complex relationships in both natural and artificial processes

  • We have shown that the autocorrelation exhibited in covariance-stationary time-series data induces bias in the hypothesis tests of a broad class of linear dependence measures

  • By framing different dependence measures in unified theoretical terms, we provide the first demonstration of how Bartlett’s formula can be applied to derive unbiased hypothesis tests, termed modified $\Lambda$-tests, for mutual information for both univariate and multivariate time-series data

Summary

INTRODUCTION

Linear dependence measures such as Pearson correlation, canonical correlation analysis, and Granger causality are used in a broad range of scientific domains to investigate the complex relationships in both natural and artificial processes. In this work, we address the bias that autocorrelation induces in the hypothesis tests for these measures by leveraging the concept of the effective sample size to derive hypothesis tests for any correlation-based measure of linear dependence between covariance-stationary time series. Our experiments involve generating samples from two independent first-order AR models and iteratively filtering the output signals so that the autocorrelation is increased for both time series; this simulates empirical analysis in practice and allows the process parameters to be modified while ensuring that the null hypothesis (of no interprocess dependence) is not violated. We perform these experiments for mutual information and Granger causality in their unconditional, conditional, and multivariate forms. Implementing our approach will enable correct inference of linear relationships within complex systems across myriad scientific applications.
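To make the effect of autocorrelation on a null test concrete, the following is a minimal sketch rather than the authors' implementation. It covers only the simplest bivariate case (Pearson correlation between two independent AR(1) series) and omits the iterative filtering step. Several choices here are assumptions made for illustration: the effective sample size is estimated with one common Bartlett-style form, N / (1 + 2 Σ_k r_xx(k) r_yy(k)), truncated at a hypothetical lag of N/4; the p-value uses a Fisher z-transform with a normal approximation instead of the modified t-, F-, or Λ-tests named in the outline; and all function names are invented for this example.

```python
import math
import random


def simulate_ar1(n, phi, rng, burn_in=200):
    """Return n samples of x_t = phi * x_{t-1} + e_t with standard normal e_t."""
    x = 0.0
    out = []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the transient before stationarity
            out.append(x)
    return out


def autocorrelation(x, max_lag):
    """Sample autocorrelation r(0), ..., r(max_lag) of a sequence."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def effective_sample_size(x, y, max_lag):
    """Bartlett-style estimate N / (1 + 2 * sum_k r_xx(k) * r_yy(k))."""
    rx = autocorrelation(x, max_lag)
    ry = autocorrelation(y, max_lag)
    denom = 1.0 + 2.0 * sum(rx[k] * ry[k] for k in range(1, max_lag + 1))
    n = len(x)
    if denom <= 0.0:  # noisy estimate: fall back to the nominal sample size
        return float(n)
    return max(n / denom, 4.0)  # keep n_eff - 3 positive for the z-test


def ar1_effective_sample_size(n, phi_x, phi_y):
    """Closed form of the same quantity for two independent AR(1) processes."""
    return n * (1.0 - phi_x * phi_y) / (1.0 + phi_x * phi_y)


def p_value(r, n_eff):
    """Two-sided p-value via Fisher z (normal approximation, an assumption)."""
    z = math.atanh(r) * math.sqrt(n_eff - 3.0)
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rates(n=200, phi=0.9, trials=200, alpha=0.05, seed=0):
    """Fraction of null trials rejected by the naive and adjusted tests."""
    rng = random.Random(seed)
    naive = modified = 0
    for _ in range(trials):
        x = simulate_ar1(n, phi, rng)
        y = simulate_ar1(n, phi, rng)  # independent of x, so the null holds
        r = pearson(x, y)
        naive += p_value(r, n) < alpha
        modified += p_value(r, effective_sample_size(x, y, n // 4)) < alpha
    return naive / trials, modified / trials


if __name__ == "__main__":
    naive, modified = false_positive_rates()
    print(f"naive FPR: {naive:.2f}, adjusted FPR: {modified:.2f}")
```

Because the two series are generated independently, every rejection is a false positive. The naive test treats all N samples as independent, whereas the adjusted test shrinks the sample size according to the product of the two autocorrelation functions, which is the mechanism the summary describes.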

MEASURES OF LINEAR DEPENDENCE
Cross-correlation and autocorrelation
Partial correlation
Wilks’ criterion and canonical correlations
Mutual information
Granger causality
MODIFIED TESTS FOR PARTIAL CORRELATION
Modified Student’s t-test for partial correlation
Modified F-test for partial correlation
MODIFIED $\Lambda$-TESTS
MODIFIED TESTS FOR MUTUAL INFORMATION
Two time series
Multiple time series
MODIFIED TESTS FOR GRANGER CAUSALITY
NUMERICAL VALIDATION
Mutual information tests for bivariate time series
Conditional mutual information tests for bivariate time series
Mutual information tests for multivariate time series
Granger causality tests for bivariate time series
Granger causality tests for multivariate time series
EFFECT OF PREWHITENING
CASE STUDY
DISCUSSION
Asymptotic likelihood-ratio test
Finite-sample F-test
Surrogate-distribution tests
Drawing inferences
Numerical simulations
Findings
Human Connectome Project