Abstract
In this paper, we introduce an efficient algorithm for aligning multiple large-scale biological networks. In the proposed scheme, we first compute a probabilistic similarity measure between nodes that belong to different networks using a semi-Markov random walk model. The estimated probabilities are then refined by incorporating local and cross-species network similarity information through two different types of probabilistic consistency transformations. The transformed alignment probabilities are used to predict the alignment of multiple networks based on a greedy approach. We demonstrate that the proposed algorithm, called SMETANA, outperforms many state-of-the-art network alignment techniques in terms of computational efficiency, alignment accuracy, and scalability. Our experiments show that SMETANA can align tens of genome-scale networks with thousands of nodes on a personal computer. The source code of SMETANA can be downloaded from http://www.ece.tamu.edu/~bjyoon/SMETANA/.
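The final step described above, predicting an alignment from node-pair probabilities with a greedy approach, can be illustrated with a minimal sketch. This is not the SMETANA implementation. It assumes two networks only, a precomputed dictionary of alignment probabilities, and a one-to-one mapping, all of which are simplifications chosen for illustration. The function name `greedy_align`, the `threshold` parameter, and the example node names are hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_align(
    probs: Dict[Tuple[str, str], float],
    threshold: float = 0.0,
) -> List[Tuple[str, str]]:
    """Greedily select node pairs in order of decreasing probability.

    Assumption (not from the paper): each node is matched at most once,
    and pairs with probability <= threshold are never selected.
    """
    used_a = set()  # nodes of the first network that are already aligned
    used_b = set()  # nodes of the second network that are already aligned
    alignment: List[Tuple[str, str]] = []

    # Sort by descending probability; break ties by node names so the
    # result is deterministic.
    ranked = sorted(probs.items(), key=lambda item: (-item[1], item[0]))

    for (u, v), p in ranked:
        if p <= threshold:
            break  # all remaining pairs are at or below the cutoff
        if u in used_a or v in used_b:
            continue  # one endpoint is already aligned; skip this pair
        alignment.append((u, v))
        used_a.add(u)
        used_b.add(v)
    return alignment


# Hypothetical alignment probabilities between nodes of two small networks.
example_probs = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.4,
    ("a2", "b1"): 0.7,
    ("a2", "b2"): 0.6,
    ("a3", "b2"): 0.1,
}
example_alignment = greedy_align(example_probs)
```

In this toy input the pair ("a2", "b1") is skipped despite its high score because "b1" is already matched to "a1". This shows the local, order-dependent nature of a greedy selection compared with a globally optimal matching.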
Highlights
The complicated interactions among numerous cellular constituents – such as DNAs, RNAs, and proteins – govern numerous complex cellular functions
We propose a novel method, called SMETANA (Semi-Markov random walk scores Enhanced by consistency Transformation for Accurate Network Alignment), for finding the maximum expected accuracy (MEA) alignment of large-scale biological networks
We compared the performance of SMETANA against seven well-known network alignment algorithms: IsoRankN [18], NetworkBLAST-M (NBM) [21], Græmlin 2.0 [17], MI-GRAAL [35], C-GRAAL [36], AlignNemo [37], and PINALOG [38]
Summary
The complicated interactions among numerous cellular constituents – such as DNAs, RNAs, and proteins – govern numerous complex cellular functions. Thanks to recent advances in high-throughput interaction measurement techniques, along with many text-mining tools developed to search the biomedical research literature for known molecular interactions, large-scale protein-protein interaction (PPI) networks are currently available for a number of model organisms, and biological network databases are still undergoing rapid expansion [4,5,6,7,8]. The availability of such large-scale interaction data has expedited comprehensive studies of biological networks, and the development of accurate and efficient computational techniques for network analysis is expected to lead to the discovery of novel biological knowledge. As demonstrated in many comparative genome studies, a comparative approach to network analysis can provide an effective computational framework for identifying functional modules (e.g., signaling pathways or protein complexes) that are conserved across different networks [9].