Abstract

Background

High-throughput experiments have generated a large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more about healthy and disease states than studying proteins in isolation. Similarly, a comparative study of protein–protein interaction (PPI) networks of different species reveals important insights which may help in disease analysis and drug design. PPI network alignment also helps in understanding the biological systems of different species and in transferring knowledge across species. Several aligners have been introduced in the last decade, but developing an accurate and scalable global alignment algorithm that ensures biologically significant alignments is still challenging.

Results

This paper presents a novel global pairwise network alignment algorithm, SAlign, which uses topological and biological information in the alignment process. The proposed algorithm incorporates sequence and structural information for computing biological scores, whereas previous algorithms use only sequence information. Alignments produced by the proposed technique show that combining structure and sequence results in significantly better pairwise alignments. We have compared SAlign with state-of-the-art algorithms on the basis of the semantic similarity of the alignment and the number of aligned nodes on multiple PPI network pairs. On network pairs with a high percentage of proteins with available structure, the results of SAlign are 3–63% semantically better than all existing techniques. Furthermore, it aligns 5–14% more nodes of these network pairs than existing aligners. The results of SAlign on other PPI network pairs are comparable to or better than all existing techniques.

We also introduce $\mathrm{SAlign}^{\mathrm{mc}}$, a Monte Carlo based alignment algorithm that produces multiple network alignments with similar semantic similarity. This helps the user to pick biologically meaningful alignments.

Conclusion

The proposed algorithm is able to find alignments that are more biologically significant than the alignments of existing aligners. Furthermore, the proposed method is able to generate alternate alignments that help in studying different genes and proteins of the species.
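As a rough illustration of the idea of scoring node pairs with both biological (sequence plus structure) and topological information, the following minimal Python sketch blends the three similarities and then builds a one-to-one alignment greedily. It is not the SAlign implementation: the function names `combined_score` and `greedy_align`, the linear weighting, the weights `alpha` and `beta`, and the greedy matching step are all assumptions made for illustration only.

```python
def combined_score(seq_sim, struct_sim, topo_sim, beta=0.5, alpha=0.5):
    """Blend similarities for one candidate node pair (hypothetical weighting).

    beta  weights sequence against structure inside the biological score.
    alpha weights topology against the biological score.
    All inputs are assumed to be normalised to the range [0, 1].
    """
    bio = beta * seq_sim + (1 - beta) * struct_sim
    return alpha * topo_sim + (1 - alpha) * bio


def greedy_align(scores):
    """Build a one-to-one alignment from pair scores.

    scores maps (node_in_network_1, node_in_network_2) to a combined score.
    Pairs are visited from highest to lowest score, and a pair is accepted
    only if neither node has already been aligned.
    """
    aligned = {}
    used_targets = set()
    for (u, v), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if u not in aligned and v not in used_targets:
            aligned[u] = v
            used_targets.add(v)
    return aligned


# Toy example with made-up similarity values for two proteins per network.
pair_scores = {
    ("a", "x"): combined_score(0.9, 0.9, 0.9),
    ("a", "y"): combined_score(0.8, 0.8, 0.8),
    ("b", "x"): combined_score(0.7, 0.7, 0.7),
    ("b", "y"): combined_score(0.1, 0.1, 0.1),
}
alignment = greedy_align(pair_scores)
```

In this toy example, "a" is matched to "x" first because that pair has the highest score, which leaves "y" as the only remaining partner for "b". A real aligner would also need to handle proteins without an available structure, which this sketch ignores.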

Highlights

  • High-throughput experiments have generated a large amount of protein interaction data, which is being used to study protein networks

  • Information Content (IC) based methods depend on the annotation database, which is biased towards the proteins or genes that researchers have studied most [23]

  • GoSemSim is used by most recent studies for semantic similarity calculation as it uses the latest version of Gene Ontology (GO) database [31,32,33]

Summary

Introduction

High-throughput experiments have generated a large amount of protein interaction data, which is being used to study protein networks. A comparative study of protein–protein interaction (PPI) networks of different species reveals important insights which may help in disease analysis and drug design. The study of PPI network alignment can help in understanding the biological systems of different species. PPI networks of two species can be compared to detect evolutionarily conserved interactions. This comparison highlights the structurally and functionally conserved parts of the two networks. It can be helpful in finding unidentified interactions [1, 2] and in drug design [3, 4]. It is crucial that the methods used by researchers to align PPI networks are precise and accurate.
