Abstract
How do we integratively profile large-scale multi-platform genomic data that are high dimensional and sparse? Furthermore, how can we systematically incorporate prior knowledge, such as associations between genes, into the analysis to find better latent relationships? To solve this problem, we propose a Scalable Network Constrained Tucker decomposition method (SNeCT). SNeCT applies a parallel stochastic gradient descent approach to the proposed parallelizable network-constrained optimization function. SNeCT decomposition is applied to a tensor constructed from a large-scale multi-platform, multi-cohort cancer dataset, PanCan12, constrained on a network built from the PathwayCommons database. The decomposed factor matrices are used to stratify cancers and to search for the top-k similar patients given a new patient. In the stratification test, the combined twelve-cohort data are clustered into thirteen subclasses. The similarity of the top-k patients to the query was high for 23 clinical features, including the estrogen and progesterone receptor statuses of BRCA patients, with average precision values ranging from 0.72 to 0.86 and from 0.68 to 0.86, respectively. We also illustrate how the factor matrices can be used to identify significant genomic patterns in each patient. Resources are available at: https://github.com/leesael/SNeCT.
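To make the top-k patient search concrete, the following is a minimal sketch rather than the SNeCT implementation. It assumes that each row of a patient factor matrix is a latent vector for one patient, that the new patient has already been projected into the same latent space, and that cosine similarity is the ranking measure. The abstract does not state the similarity measure, so that choice, the function names `cosine` and `top_k_similar`, and the toy values are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(patient_factors, query, k):
    """Return the row indices of the k patients most similar to the query.

    patient_factors: list of latent vectors, one per known patient (assumed
        to be the rows of a patient factor matrix).
    query: latent vector of the new patient (assumed already projected).
    Ties are broken by the lower row index.
    """
    scored = [(cosine(row, query), i) for i, row in enumerate(patient_factors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:k]]


# Toy data: four patients in a two-dimensional latent space (made-up values).
patient_factors = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
]
query = [1.0, 0.05]
nearest = top_k_similar(patient_factors, query, k=2)
```

In this toy case the two patients whose latent vectors point in nearly the same direction as the query are returned first. Clinical labels of the returned patients could then be compared with those of the query, which is the kind of comparison the reported average precision values summarize.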
More From: IEEE/ACM Transactions on Computational Biology and Bioinformatics