Abstract

Background: Viral infections cause significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, predicting protein–protein interactions between a new virus and human cells is extremely challenging because data on virus-human interactions are scarce and most viruses mutate quickly.

Results: We developed a multitask transfer learning approach that exploits information from around 24 million protein sequences and the interaction patterns of the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. We also employ an additional objective that aims to maximize the probability of observing human protein–protein interactions. This additional task objective acts as a regularizer and allows domain knowledge to be incorporated into the virus-human protein–protein interaction prediction model.

Conclusions: Our approach achieved competitive results on 13 benchmark datasets and in the case study for the SARS-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein–protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer.
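The role of the additional objective as a regularizer can be illustrated with a minimal sketch of a combined training loss. This is not the authors' implementation: the use of binary cross-entropy for both tasks, the function names, and the weight `alpha` are assumptions made only for illustration. The main term scores pathogen-human interaction predictions, and the auxiliary term scores human protein–protein interaction predictions.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy of predicted probabilities against 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def multitask_loss(main_probs, main_labels, aux_probs, aux_labels, alpha=1.0):
    """Main-task loss plus an alpha-weighted auxiliary loss.

    main_*: predictions/labels for pathogen-human PPI (the target task).
    aux_*:  predictions/labels for human PPI (the auxiliary task).
    alpha:  hypothetical weight controlling the auxiliary term (an assumption).
    """
    main = binary_cross_entropy(main_probs, main_labels)
    aux = binary_cross_entropy(aux_probs, aux_labels)
    return main + alpha * aux


# Toy example: two pathogen-human pairs and one human-human pair.
loss = multitask_loss([0.9, 0.1], [1, 0], [0.5], [1], alpha=1.0)
print(loss)
```

With `alpha = 0` the auxiliary term vanishes and the sketch reduces to training on the single pathogen-human objective. A larger `alpha` pulls the shared representation toward also explaining the human interactome, which is the sense in which the extra objective regularizes the main task.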

Highlights

  • Virus infections cause an enormous and ever-increasing burden on healthcare systems worldwide.

  • Bacteria-human protein–protein interaction (PPI) prediction task: we evaluate our method on three datasets for three human pathogenic bacteria, Bacillus anthracis (B1), Yersinia pestis (B2), and Francisella tularensis (B3), which were shared by Fatma et al. [22].

  • Two simpler variants of MultiTask Transfer (MTT): as an ablation study, we evaluate two simpler variants: (i) SingleTask Transfer (STT), which is trained on the single objective of predicting pathogen-human PPI


Summary

Introduction

Virus infections cause an enormous and ever-increasing burden on healthcare systems worldwide. [...] between the virus and its host. These interactions include the initial attachment of virus coat or envelope proteins to host membrane receptors and the hijacking of the host translation and intracellular transport machineries, resulting in replication, assembly and subsequent release of virus particles [2,3,4]. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. Predicting protein–protein interactions between a new virus and human cells is extremely challenging because data on virus-human interactions are scarce and most viruses mutate quickly.

Dong et al. BMC Bioinformatics (2021) 22:572

