Abstract

Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed Structural Analysis of PTM Hotspots (SAPH-ire), a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank-order comparisons within or between protein families. SAPH-ire uses low-level data fusion and integration of derived data from several disparate public repositories (Figure 1). We have previously demonstrated that SAPH-ire analysis is predictive for PTMs with biological function [1]. Here, we applied SAPH-ire to the study of experimentally verified PTMs across 2,861 protein families. A total of 75,966 experimentally verified PTMs were aligned into 51,663 unique hotspots, over 3,000 of which have a known and cited biological function or response. The hotspots were analyzed in the context of family-representative unique protein structures found in the Protein Data Bank (PDB), revealing thousands of high-ranking hotspots for which a functional impact has not yet been determined; these represent putative regulatory elements. The scoring metrics employed by SAPH-ire reveal a 2- to 6-fold enrichment in function potential ranking for PTM sites with demonstrated biological function. These results further demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data.
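
The step of collapsing aligned PTM observations into family-level hotspots, and the notion of fold enrichment, can be illustrated with a minimal sketch. This is not the SAPH-ire implementation: the function names, the tuple layout, and the toy data are hypothetical, and the sketch assumes that each PTM has already been mapped to a column of its family's multiple sequence alignment.

```python
from collections import defaultdict


def group_into_hotspots(ptm_sites):
    """Group PTM observations that share a family and alignment column.

    ptm_sites: iterable of (family_id, protein_id, alignment_column) tuples
    (a hypothetical layout chosen for illustration).
    Returns a dict mapping (family_id, alignment_column) to the sorted list
    of protein IDs with a PTM observed at that aligned position.
    """
    hotspots = defaultdict(set)
    for family_id, protein_id, column in ptm_sites:
        hotspots[(family_id, column)].add(protein_id)
    return {key: sorted(members) for key, members in hotspots.items()}


def fold_enrichment(hits_in_subset, subset_size, hits_total, total_size):
    """Fraction of known-function hotspots in a subset relative to the overall fraction."""
    return (hits_in_subset / subset_size) / (hits_total / total_size)


# Toy data (hypothetical): four PTM observations across two families.
sites = [
    ("FAM1", "P1", 42),
    ("FAM1", "P2", 42),  # same family and column as above: same hotspot
    ("FAM1", "P1", 77),
    ("FAM2", "P3", 42),  # same column number, different family: separate hotspot
]
hotspots = group_into_hotspots(sites)
```

In this toy example the four observations collapse into three hotspots, in the same way that the 75,966 PTMs in the study collapse into 51,663 hotspots. The ranking itself, which combines conservation, structure, and interaction features, is not reproduced here.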
