Abstract

Identifying protein complexes within a protein-protein interaction (PPI) network is a crucial task in computational biology that supports a better understanding of cellular mechanisms in various organisms. Datasets of predicted PPIs are obtained using high-throughput experimental technology, but they typically contain many spurious interactions. The interactions observed in a dataset therefore need to be validated before they are used to predict protein complexes. This paper describes the identification of missing interactome links in a PPI network as a way of improving the detection of protein complexes. Several topological features are extracted from the network and used with a two-class boosted decision-tree classifier to build a machine-learning model that distinguishes between existing and non-existing interactome links. The model was trained on a PPI network of 1,622 proteins and 9,074 interactions and tested on a second PPI network of 1,430 proteins and 6,531 interactions. All 6,531 interactions were identified, with a precision of 0.994 and a recall of 1. The model also detected 37 novel interactions, which were then validated against the STRING database of known and predicted PPIs. Including these 37 novel interactions improved the detection of protein complexes using ClusterONE.
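
As a minimal sketch of what extracting topological features for candidate links can look like, the Python example below scores every unconnected protein pair in a toy network. The abstract does not name the features used, so the four computed here (common neighbours, Jaccard coefficient, Adamic-Adar index, preferential attachment) are assumptions drawn from common link-prediction practice, and the protein names and helper functions are hypothetical. The classifier itself is not shown; feature vectors like these would be the input to a two-class model.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {protein: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Topological features for one protein pair (assumed feature set)."""
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbours with low degree count for more.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1
        ),
        "preferential_attachment": len(nu) * len(nv),
    }


def candidate_pairs(adj):
    """All unordered protein pairs that are not yet linked in the network."""
    return [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]


# Toy PPI network with hypothetical protein names.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
adj = build_adjacency(edges)
features = {pair: pair_features(adj, *pair) for pair in candidate_pairs(adj)}

for pair, feats in features.items():
    print(pair, feats)
```

In this toy network the only unlinked pair is (A, D), which shares both of its neighbours and is therefore a plausible candidate for a missing link. To train a classifier, the same features would also be computed for known interactions (positive class) and for a sample of unlinked pairs (negative class).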
