Abstract
Using a recently discovered method for producing random symbol sequences with prescribed transition counts, we present an exact null hypothesis significance test (NHST) for mutual information between two random variables, the null hypothesis being that the mutual information is zero (i.e., independence). The exact tests reported in the literature assume that data samples for each variable are sequentially independent and identically distributed (iid). In general, time series data have dependencies (Markov structure) that violate this condition. The algorithm given in this paper is the first exact significance test of mutual information that takes into account the Markov structure. When the Markov order is not known or indefinite, an exact test is used to determine an effective Markov order.
Highlights
Mutual information is an information theoretic measure of dependency between two random variables [1]
To introduce the need for a significance test, suppose the random variables X and Y are the values obtained from the rolls of a pair of six-sided dice, each die independent from the other and equally likely to land on any of its six sides
The logic we are describing is that of a null hypothesis significance test (NHST) for mutual information, the null hypothesis being that the mutual information is zero
Summary
Mutual information is an information theoretic measure of dependency between two random variables [1]. To introduce the need for a significance test, suppose the random variables X and Y are the values obtained from the rolls of a pair of six-sided dice, each die independent from the other and equally likely to land on any of its six sides. Although the true mutual information of independent dice is zero, an estimate computed from a finite number of rolls is generally positive. Here the most probable value of the estimated mutual information is 0.3 bits/roll, which, if we did not know better, might seem significant considering that the total uncertainty in one die roll is log2 6 ≈ 2.585 bits. The true significance of a measured value I can only be determined by knowing the distribution of I(X; Y) for independent dice (solid line, Figure 1). Knowing this distribution, we would not regard a measurement of I = 0.3 as significant, since values of I around 0.3 are the most probable to occur when X and Y are independent. The significance threshold for rejection is entirely up to the investigator to decide.
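The dice example can be reproduced numerically. The sketch below is not the exact test described in the abstract: it is a plain Monte Carlo approximation that assumes iid rolls and ignores any Markov structure. The function names (`mutual_information`, `null_distribution`, `p_value`) and the sample size `N_ROLLS` are illustrative assumptions; the summary above does not state how many rolls were used, so the resulting distribution need not peak at 0.3 bits/roll.

```python
import numpy as np


def mutual_information(x, y, k=6):
    """Plug-in estimate of I(X;Y) in bits for paired samples taking values 0..k-1."""
    joint = np.zeros((k, k))
    np.add.at(joint, (x, y), 1)           # count co-occurrences of (x_i, y_i)
    joint /= joint.sum()                  # joint relative frequencies
    px = joint.sum(axis=1, keepdims=True) # marginal of X, shape (k, 1)
    py = joint.sum(axis=0, keepdims=True) # marginal of Y, shape (1, k)
    nz = joint > 0                        # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))


def null_distribution(n_rolls, n_trials, rng, k=6):
    """Estimated I(X;Y) for many simulated pairs of independent fair dice."""
    return np.array([
        mutual_information(rng.integers(0, k, n_rolls),
                           rng.integers(0, k, n_rolls), k)
        for _ in range(n_trials)
    ])


def p_value(observed, null_samples):
    """Monte Carlo p-value: share of null values at least as large as observed."""
    return (1 + np.sum(null_samples >= observed)) / (1 + len(null_samples))


rng = np.random.default_rng(0)
N_ROLLS = 50  # hypothetical sample size; the source text does not give one
null = null_distribution(N_ROLLS, 2000, rng)

print("mean estimated I under independence:", null.mean())
print("p-value for an observed I of 0.3 bits:", p_value(0.3, null))
```

The estimated values are positive even though the dice are independent, which is why an observed I has to be compared against the null distribution rather than against zero. For time series with sequential dependence, the iid simulation step would need to be replaced by surrogates that preserve the transition counts, as the abstract describes.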