Abstract

Background: Understanding cellular systems requires knowledge of a protein's subcellular localization (SCL). Although experimental and predicted data for protein SCL are archived in various databases, SCL prediction remains a non-trivial problem in genome annotation. Current SCL prediction tools use amino-acid sequence features and text mining approaches. A comprehensive analysis of protein SCL in human PPI and metabolic networks for various subcellular compartments is necessary for developing a robust SCL prediction methodology.

Results: Using protein-protein interaction (PPI) and metabolite-linked protein interaction (MLPI) networks, we compared and analysed statistical properties across different subcellular compartments. We integrated PPI and metabolic datasets with SCL information for human proteins from LOCATE and GOA (Gene Ontology Annotation) and evaluated three statistical measures: the chi-square (χ²) test, the Paired Localisation Correlation Profile (PLCP) and network topological measures. For the PPI network, Pearson's chi-square test shows that interacting protein pairs share the same SCL category twice as often as expected, relative to non-interacting protein pairs (χ² = 1270.19, P-value < 2.2 × 10^-16). For the MLPI network, metabolite-linked protein pairs with the same SCL are observed 20% more often than expected, relative to non-metabolite-linked protein pairs (χ² = 110.02, P-value < 2.2 × 10^-16). To address the issue of proteins with multiple SCLs, we specifically used the PLCP measure. PLCP analysis revealed that protein interactions are largely restricted to the same SCL, though significant cross-compartment interactions are seen for nuclear proteins. Metabolite-linked protein pairs are restricted to specific compartments such as the mitochondrion (P-value < 6.0e-07), the lysosome (P-value < 4.7e-05) and the Golgi apparatus (P-value < 1.0e-15). These findings indicate that the metabolic network adds value to the information in the PPI network for the localisation process of proteins in human subcellular compartments.

Conclusions: The MLPI network differs significantly from the PPI network in its SCL distribution. The PPI network shows passive protein interactions across different subcellular compartments, possibly owing to its high false-positive rate. Such interactions appear to be absent from the MLPI network, which has evolved to maintain high substrate specificity for proteins.
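
The chi-square comparison reported in the Results can be illustrated with a minimal sketch. It assumes a 2x2 contingency table whose rows are pair type (interacting, non-interacting) and whose columns are localisation (same SCL, different SCL). The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the paper's data; the paper's exact table layout is also an assumption here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    table[i][j]: rows = pair type (interacting, non-interacting),
                 cols = localisation (same SCL, different SCL).
    Returns (statistic, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return stat, expected


# Hypothetical counts for illustration only (NOT from the paper).
observed = [
    [200, 800],    # interacting pairs: same SCL, different SCL
    [1000, 8000],  # non-interacting pairs: same SCL, different SCL
]

stat, expected = chi_square_2x2(observed)

# Observed-to-expected ratio for interacting pairs sharing an SCL;
# a ratio of 2.0 would correspond to "twice as many as expected".
enrichment = observed[0][0] / expected[0][0]

# For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(f"chi2 = {stat:.2f}, enrichment = {enrichment:.2f}, p = {p_value:.3g}")
```

The observed-to-expected ratio is the quantity behind statements such as "twice as often as expected" or "20% more often than expected"; the chi-square statistic then measures whether that deviation from independence is larger than chance would explain.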

Highlights

  • Understanding cellular systems requires the knowledge of a protein’s subcellular localization (SCL)

  • We will refer to the metabolite-linked protein interaction network as the metabolic network or MLPI, and the gene ontology annotation as GOA

  • The network analysis showed a significant difference between the topological properties of the human protein-protein interaction (PPI) and metabolic networks


Summary

Introduction

Understanding cellular systems requires knowledge of a protein’s subcellular localization (SCL). A comprehensive analysis of protein SCL in human PPI and metabolic networks for various subcellular compartments is necessary for developing a robust SCL prediction methodology. Most of the cellular functions critical to the cell’s survival are performed by proteins inside the cell. The functions of proteins are determined by the specific physico-chemical environment present inside the various compartments or organelles. Identifying the SCL of each protein is therefore important for understanding its functional and cellular role. The number is slightly better, with 34.1% having SCL annotations (Table 1). There is a huge gap between the numbers of protein sequences with and without SCL annotation, necessitating computational approaches to predict the SCL from sequence information.
