Abstract

Using a recently discovered method for producing random symbol sequences with prescribed transition counts, we present an exact null hypothesis significance test (NHST) for mutual information between two random variables, the null hypothesis being that the mutual information is zero (i.e., independence). The exact tests reported in the literature assume that data samples for each variable are sequentially independent and identically distributed (iid). In general, time series data have dependencies (Markov structure) that violate this condition. The algorithm given in this paper is the first exact significance test of mutual information that takes into account the Markov structure. When the Markov order is not known or indefinite, an exact test is used to determine an effective Markov order.
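To make the quantity under test concrete, the sketch below shows the standard plug-in (maximum-likelihood) estimate of mutual information from two paired symbol sequences. This is an illustrative implementation of the textbook estimator, not the paper's code; the function name and the NumPy-based approach are our own choices.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits from two equal-length symbol sequences."""
    x, y = np.asarray(x), np.asarray(y)
    # Map arbitrary symbols to integer indices for counting.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)          # joint counts
    joint /= joint.sum()                   # joint empirical distribution
    px = joint.sum(axis=1, keepdims=True)  # marginal of X
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y
    nz = joint > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))
```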

Introduction

Mutual information is an information-theoretic measure of dependency between two random variables [1]. To introduce the need for a significance test, suppose the random variables X and Y are the values obtained from the rolls of a pair of six-sided dice, each die independent of the other and equally likely to land on any of its six sides. The true mutual information of independent dice is zero, but estimates computed from a finite record of rolls are biased upward; here the most probable value of the estimated mutual information is 0.3 bits/roll, which, if we did not know better, might seem significant considering that the total uncertainty in one die roll is log2 6 ≈ 2.585 bits. The true significance of I can only be determined by knowing the distribution of I(X; Y) for independent dice (solid line, Figure 1). Knowing this distribution, we would not regard a measurement of I = 0.3 as significant, since values of I near 0.3 are the most probable to occur when X and Y are independent. The significance threshold for rejection is entirely up to the investigator to decide.
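The following sketch illustrates this NHST logic for the dice example under the iid assumption: shuffling one die's record preserves its marginal distribution while destroying any dependence, so mutual information values computed from shuffled surrogates sample the null distribution of I. The record length n = 60 is a hypothetical choice that makes the typical plug-in estimate for independent fair dice land near the 0.3 bits/roll quoted above. Simple shuffling is valid only for iid data; the paper's surrogates, which preserve Markov transition counts, are not reproduced here.

```python
import numpy as np

def mi_bits(x, y, k=6):
    """Plug-in MI in bits for integer symbols 0..k-1 (specializes the sketch above)."""
    joint = np.zeros((k, k))
    np.add.at(joint, (x, y), 1)
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
n = 60                                  # short record: finite-sample bias is visible
x = rng.integers(0, 6, n)               # independent fair dice, faces coded 0..5
y = rng.integers(0, 6, n)

i_obs = mi_bits(x, y)

# Null distribution under independence: permuting y keeps both marginals but
# breaks any X-Y coupling. Valid here because iid rolls have no Markov structure.
null = np.array([mi_bits(x, rng.permutation(y)) for _ in range(10_000)])
p_value = float(np.mean(null >= i_obs))

print(f"I = {i_obs:.3f} bits/roll, p = {p_value:.3f}")
```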

Remaining sections (full text not shown):

  • Generating the Mutual Information Distribution from Surrogates
  • Accounting for Markov Structure
  • Finding the Markov Order
  • Conclusions