Abstract

Tree-shaped datasets have arisen in various research and industrial fields, such as gene expression data measured on a cell lineage tree and information spreading on tree-shaped paths. Certain correlation measure between two tree-shaped datasets, i.e., how the values increase or decrease together along corresponding paths of the two trees, is desired; but the tree topology prohibits the use of classical vector-based correlation measures such as Pearson correlation coefficient. To this end, a statistical framework for measuring such tree correlation is proposed. As a specific model in this framework, a parametric model based on bivariate Gaussian distributions is provided, and a Bayesian approach for parameter estimation is introduced. The model allows the coupling degree of corresponding nodes to change with the depth of the tree. It provides an intuitive mapping of the trend similarity of the values along two trees to the classical Pearson correlation. A Metropolis-within-Gibbs algorithm is used to obtain the posterior estimates. Extensive simulations and in-depth sensitivity analyses are performed to demonstrate the validity and robustness of the method. Furthermore, an application to embryonic gene expression datasets shows that this tree similarity measure aligns well with the biological properties.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.