We present the NetSci program, an open-source scientific software package designed for estimating mutual information (MI) between data sets using GPU acceleration and a k-nearest-neighbor algorithm. This approach significantly enhances calculation speed, achieving improvements of several orders of magnitude over traditional CPU-based methods, with data set size limits dictated only by available hardware. To validate NetSci, we accurately compute MI for an analytically verifiable two-dimensional Gaussian distribution and replicate the generalized correlation (GC) analysis previously conducted on the B1 domain of protein G. We also apply NetSci to molecular dynamics simulations of the Sarcoendoplasmic Reticulum Calcium-ATPase (SERCA) pump, exploring the allosteric mechanisms and pathways influenced by ATP and 2'-deoxy-ATP (dATP) binding. Our analysis reveals distinct allosteric effects induced by ATP compared to dATP, with predicted information pathways from the bound nucleotide to the calcium-binding domain differing based on the nucleotide involved. NetSci proves to be a valuable tool for estimating MI and GC in various data sets and is particularly effective for analyzing intraprotein communication and information transfer.
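The Gaussian validation described above can be illustrated with a small, CPU-only sketch. It is not NetSci's code or API. It assumes that the k-nearest-neighbor algorithm is of the Kraskov-Stögbauer-Grassberger type, which the abstract does not state, and all function names are hypothetical. For a bivariate Gaussian with correlation coefficient rho, the exact MI is -0.5 ln(1 - rho^2) in nats, which gives a reference value for the estimate.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def gaussian_mi(rho):
    """Exact MI (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


def knn_mi(xs, ys, k=4):
    """Brute-force kNN MI estimate (assumed KSG-style, O(N^2) on the CPU)."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to all other points in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count neighbors strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def sample_correlated_gaussian(n, rho, seed=0):
    """Draw n pairs (x, y) from a standard bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * z)
    return xs, ys
```

Calling `knn_mi(*sample_correlated_gaussian(500, 0.8))` and comparing the result with `gaussian_mi(0.8)` reproduces the spirit of the validation on a small sample. The pairwise distance loop is the quadratic-cost step that a GPU implementation would parallelize.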