Abstract
Background
Cellular activities are governed by the physical and functional interactions among proteins involved in various biological pathways. With the availability of sequenced genomes and high-throughput experimental data, genome-wide protein-protein interactions can be identified using various computational techniques. Comparative assessments of how well these techniques predict protein interactions have been reported frequently in the literature, but their ability to elucidate a particular biological pathway has not.
Methods
To understand how well interactions among the proteins of specific biological pathways can be predicted, we analyzed 14 biological pathways of Escherichia coli catalogued in the KEGG database using five protein-protein functional linkage prediction methods: phylogenetic profiling, gene neighborhood, co-presence of orthologous genes in the same gene clusters, a mirrortree variant, and expression similarity.
Conclusions
Our results reveal that predicting metabolic pathway protein interactions remains a challenging task for all methods, which possibly reflects the flexible or independent evolutionary histories of these proteins. The methods predicted functional associations of proteins involved in amino acid, nucleotide, glycan, and vitamin and cofactor pathways slightly better than their random-level performance on carbohydrate, lipid and energy metabolism. We make similar observations for interactions among the environmental information processing proteins. In contrast, protein-protein linkages in genetic information processing, or in specialized processes such as motility that occur in only a subset of organisms, are predicted with comparable accuracy. Metabolic pathways are best predicted using the neighborhood of orthologous genes, whereas phyletic patterns are good enough to reconstruct the protein interactions of central dogma pathways.
We have also shown that the effective use of a particular prediction method depends on the pathway under investigation. When the focus is not on a specific pathway, the gene expression similarity method is the best option.
Highlights
Proteins are responsible for almost every cellular function of an organism such as behavior, metabolic activities and other phenotypic traits
We evaluated 969,528 pairs among 1,393 E. coli proteins for which pathway memberships were recorded in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [35]
For Receiver Operating Characteristic (ROC) analysis, we considered a protein pair a true positive (TP) if both proteins belonged to the pathway under consideration; all other pathway pairs were treated as true negatives (Table 1)
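The pair count and the labelling rule in the highlights can be illustrated with a minimal sketch. It assumes that the evaluated set is every unordered pair of the catalogued proteins and that each method assigns one score per pair; the function names `label_pairs` and `roc_auc` are hypothetical, and the rank-sum AUC shown here is a standard formulation, not necessarily the procedure the authors used.

```python
from itertools import combinations
from math import comb

# Unordered pairs among 1,393 proteins: C(1393, 2) = 969,528,
# which matches the number of evaluated pairs stated in the text.
N_PAIRS = comb(1393, 2)

def label_pairs(proteins, pathway_members):
    """Label each unordered pair 1 if both proteins are in the pathway, else 0.

    Assumption: every pair not fully inside the pathway counts as a negative.
    """
    members = set(pathway_members)
    return {
        (a, b): int(a in members and b in members)
        for a, b in combinations(sorted(proteins), 2)
    }

def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    `scores` and `labels` are dicts keyed by the same protein pairs.
    Tied scores receive the average rank of their tie group.
    """
    keys = sorted(labels, key=lambda k: scores[k])
    ranks = {}
    i = 0
    while i < len(keys):
        j = i
        while j + 1 < len(keys) and scores[keys[j + 1]] == scores[keys[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of positions i..j
        for k in keys[i:j + 1]:
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(labels.values())
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative pair")
    rank_sum = sum(ranks[k] for k, y in labels.items() if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC near 0.5 corresponds to random performance for a pathway, and values closer to 1 mean that within-pathway pairs tend to receive higher scores than the remaining pairs.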
Summary
Proteins are responsible for almost every cellular function of an organism such as behavior, metabolic activities and other phenotypic traits. The Gene Neighbor (GN) method identifies rearranged genes based on the genomic neighborhood of orthologous genes independent of directionality (gene order), while the Gene Cluster (GC) method considers the co-directional proximity of orthologous genes [17,18,19,20,21,22]. Another class of methods, the mirrortree-type methods, is based on the similarity of the phylogenetic trees of two interacting protein families [23,24,25]. With the availability of sequenced genomes and high-throughput experimental data, genome-wide protein-protein interactions can be identified using various computational techniques. Comparative assessments of how well these techniques predict protein interactions have been reported frequently in the literature, but their ability to elucidate a particular biological pathway has not.
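The tree-similarity idea behind mirrortree-type methods can be sketched as follows. In its basic form, the similarity of two protein families' trees is approximated by the Pearson correlation between their inter-species distance matrices. The text does not specify which mirrortree variant was used, so this is only the basic form under that assumption; the function names are hypothetical, and both matrices are assumed to be symmetric and to list the same species in the same order.

```python
from math import sqrt

def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)

def mirrortree_score(dist_a, dist_b):
    """Correlate the pairwise species distances of two protein families.

    Assumption: dist_a[i][j] and dist_b[i][j] refer to the same species pair.
    """
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))
```

A score close to 1 indicates that the two families have accumulated similar patterns of divergence across species, which such methods interpret as evidence of a functional linkage.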