Abstract

Pathogens use specialized proteins that interact with host proteins to invade and infect the host. Despite several experimental studies, knowledge about host-pathogen interactions (HPIs) remains limited, yet this knowledge is essential for understanding disease mechanisms and identifying potential targets for disease prevention and intervention. In this study, we propose a pipeline for identifying host-pathogen protein-protein interactions. The pipeline consists of a biological knowledge-based filter and a domain-based statistical filter, followed by a sequence-signature-based machine learning method. Interspecies protein-protein interaction data between eukaryotic pathogens and human were used to build the domain-based statistical model. To train the machine learning model, known host-pathogen interactions of all eukaryotic pathogens from HPIDB and non-interacting protein data from Negatome were used as the positive and negative training sets, respectively. We applied our pipeline to predict HPIs between human and the malaria parasite Plasmodium falciparum. Several biologically relevant features, such as tissue specificity, protein annotations and functions, were used to construct a primary list of possible HPIs. Next, the statistical and machine learning-based models were applied as filters to this initial list to predict novel protein-protein interactions between human and P. falciparum during the intra-erythrocytic stages. We predicted several HPIs that are involved in host erythrocyte cytoskeleton remodeling, signaling and immune response. The proposed method can be used to find novel HPIs between human cells and any eukaryotic pathogen.
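
The staged filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `CandidatePair` fields, the protein identifiers, the scores and both cutoff values are hypothetical placeholders chosen for illustration, and in practice the domain score and the classifier probability would come from models built on the interspecies interaction data, HPIDB and Negatome.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class CandidatePair:
    """One candidate host-pathogen protein pair (all fields are hypothetical)."""
    host_protein: str
    pathogen_protein: str
    knowledge_ok: bool     # assumed result of tissue/annotation/function checks
    domain_score: float    # assumed output of a domain-based statistical model
    ml_probability: float  # assumed output of a sequence-signature classifier


Filter = Callable[[CandidatePair], bool]


def run_pipeline(candidates: Iterable[CandidatePair],
                 filters: List[Filter]) -> List[CandidatePair]:
    """Apply each filter in order, keeping only pairs that pass every stage."""
    survivors = list(candidates)
    for keep in filters:
        survivors = [pair for pair in survivors if keep(pair)]
    return survivors


# Illustrative cutoffs only; the abstract does not state any threshold values.
DOMAIN_CUTOFF = 0.5
ML_CUTOFF = 0.5

filters: List[Filter] = [
    lambda pair: pair.knowledge_ok,                   # knowledge-based filter
    lambda pair: pair.domain_score >= DOMAIN_CUTOFF,  # domain-based filter
    lambda pair: pair.ml_probability >= ML_CUTOFF,    # machine learning filter
]

# Made-up candidates, each designed to fail at a different stage (or none).
candidates = [
    CandidatePair("HOST_A", "PATH_1", True, 0.9, 0.8),
    CandidatePair("HOST_B", "PATH_2", False, 0.9, 0.9),
    CandidatePair("HOST_C", "PATH_3", True, 0.2, 0.9),
    CandidatePair("HOST_D", "PATH_4", True, 0.7, 0.3),
]

predicted = run_pipeline(candidates, filters)
for pair in predicted:
    print(pair.host_protein, pair.pathogen_protein)
```

Ordering the stages from the cheapest check to the most expensive model means that later stages only score pairs that earlier stages have already accepted.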
