Probability Weighted Ensemble Transfer Learning for Predicting Interactions between HIV-1 and Human Proteins

Suyu Mei

doi:10.1371/journal.pone.0079606

Abstract

Reconstruction of host-pathogen protein interaction networks is of great significance for revealing the underlying microbial pathogenesis. However, the current experimentally derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems in reconstructing host-pathogen protein interaction networks. In this work, we address these three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), in which a support vector machine (SVM) is adopted as the individual classifier of the ensemble. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we validate the assumption that homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. The model is thus more robust against data unavailability and imposes less demanding data constraints. As regards negative data construction, experiments show that exclusion of subcellularly co-localized proteins is unbiased and more reliable than random sampling. Last, we analyse the predictions that overlap between our model and existing models, and apply the model to the recognition of novel host-pathogen PPIs for further biological research.
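The weighting step described in the abstract can be illustrated with a minimal sketch. It assumes each individual classifier (an SVM in the paper) has already produced a probability of interaction for every protein pair, and that each classifier's weight is its ROC-AUC on held-out data, normalized so the weights sum to one. The normalization and the function names `roc_auc`, `auc_weights` and `weighted_ensemble` are assumptions made for illustration; the abstract does not state the exact weighting formula.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based ROC-AUC: fraction of (positive, negative) pairs in which
    the positive is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute ROC-AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(aucs: Sequence[float]) -> List[float]:
    """Assumed weighting scheme: normalize the AUC values to sum to one."""
    total = sum(aucs)
    return [a / total for a in aucs]


def weighted_ensemble(prob_lists: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """prob_lists[k][i] is classifier k's probability that pair i interacts.
    Returns the weighted average probability for each pair."""
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        for i in range(n_samples)
    ]


# Toy validation data (illustrative only, not from the paper).
val_labels = [1, 1, 0, 0]
clf_a_val = [0.9, 0.8, 0.3, 0.2]   # e.g. a classifier trained on target data
clf_b_val = [0.6, 0.4, 0.5, 0.3]   # e.g. a classifier trained on homolog data

weights = auc_weights([roc_auc(val_labels, clf_a_val),
                       roc_auc(val_labels, clf_b_val)])

# Combine the two classifiers' probabilities for one unseen protein pair.
final_prob = weighted_ensemble([[0.9], [0.6]], weights)[0]
is_interaction = final_prob >= 0.5
```

In this sketch the classifier with the higher validation AUC contributes more to the final probability, which is one simple way to let the measured usefulness of the homolog knowledge control its influence on the decision.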

Highlights

Accurate mapping of the protein interactome is essential to reveal protein functions, biological processes and signal transduction pathways.
The work [14] explained why the gene ontology (GO) feature outperformed the other feature information, based on two observations: (1) proteins localized in identical cellular compartments are more likely to interact than proteins that reside in spatially distant compartments; (2) proteins that participate in similar biological processes or perform similar molecular functions are likely to interact.
Data scarcity, data unavailability and negative data sampling are the three major concerns to be addressed for the computational reconstruction of HIV-human protein-protein interaction (PPI) networks.

Summary

Introduction

Accurate mapping of the protein interactome is essential to reveal protein functions, biological processes and signal transduction pathways. Wuchty [10] combined sequence k-mer, interlog, gene ontology and signal transduction pathways to predict and validate the protein interactions between Plasmodium falciparum and Homo sapiens. In the latter two models, the validation information (gene co-expression, signal transduction pathways, gene ontology) was used to manually filter the predicted PPIs. It has been claimed that, among the catalog of feature information, gene ontology (GO) is one of the strongest indicators for host-pathogen PPI prediction [6] and intra-species PPI prediction [3,4,11,12,13,14,15,16,17]. The three aspects of gene ontology (cellular compartments, biological processes and molecular functions) are informative indicators of PPI.
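The cellular compartment aspect of GO also underlies the negative data construction mentioned in the abstract. A minimal sketch follows, under the assumption that a viral-human protein pair is accepted as a negative example only when the two proteins share no annotated subcellular compartment and the pair is not a known interaction. The function name `sample_negatives`, the protein identifiers and the compartment annotations are hypothetical placeholders, not data from the paper.

```python
import random
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def sample_negatives(viral_loc: Dict[str, Set[str]],
                     human_loc: Dict[str, Set[str]],
                     known_positives: Set[Pair],
                     n: int,
                     seed: int = 0) -> List[Pair]:
    """Pick up to n (viral, human) pairs that are not known interactions
    and whose proteins have no subcellular compartment in common."""
    candidates = [
        (v, h)
        for v in sorted(viral_loc)
        for h in sorted(human_loc)
        if (v, h) not in known_positives
        and not (viral_loc[v] & human_loc[h])   # empty intersection only
    ]
    rng = random.Random(seed)   # fixed seed keeps the sample reproducible
    rng.shuffle(candidates)
    return candidates[:n]


# Hypothetical annotations for illustration only.
viral_loc = {
    "viral_A": {"nucleus"},
    "viral_B": {"plasma membrane", "cytoplasm"},
}
human_loc = {
    "human_X": {"nucleus", "cytoplasm"},
    "human_Y": {"mitochondrion"},
    "human_Z": {"plasma membrane"},
}
known_positives = {("viral_A", "human_X")}

negatives = sample_negatives(viral_loc, human_loc, known_positives, n=3)
```

Compared with drawing random pairs from all non-interacting candidates, this filter removes pairs that could plausibly meet in the same compartment, so fewer undiscovered true interactions are likely to be mislabeled as negatives.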


Journal: PLoS ONE	Publication Date: Nov 18, 2013
Citations: 89	License type: CC BY 4.0
