The availability of genome sequences, affordable mass spectrometers and high-resolution two-dimensional gels has made possible the identification of hundreds of proteins from many organisms by peptide mass fingerprinting. However, little attention has been paid to how information generated by these means can be utilised for detailed protein characterisation. Here we present an approach for the systematic characterisation of proteins using mass spectrometry and the software tool FindMod. This tool, available on the internet at http://www.expasy.ch/sprot/findmod.html, examines peptide mass fingerprinting data for mass differences between empirical and theoretical peptides. Where a mass difference corresponds to a post-translational modification, intelligent rules are applied to predict which amino acids in the peptide, if any, might carry the modification. FindMod rules were constructed by examining 5153 instances of post-translational modifications documented in the SWISS-PROT database. For the 22 post-translational modifications currently considered (acetylation, amidation, biotinylation, C-mannosylation, deamidation, flavinylation, farnesylation, formylation, geranyl-geranylation, gamma-carboxyglutamic acids, hydroxylation, lipoylation, methylation, myristoylation, N-acyl diglyceride (tripalmitate), O-GlcNAc, palmitoylation, phosphorylation, pyridoxal phosphate, phospho-pantetheine, pyrrolidone carboxylic acid, sulphation), a total of 29 different rules were made. These consider which amino acids can carry a modification, whether the modification occurs on N-terminal, C-terminal or internal amino acids, and the types of organism in which the modification can be found. We illustrate the utility of the approach with proteins from 2-D gels of Escherichia coli and sheep wool, where post-translational modifications predicted by FindMod were confirmed by MALDI post-source decay peptide fragmentation. As the approach is amenable to automation, it presents a potentially large-scale means of protein characterisation in proteome projects.
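The mass-difference matching described above can be illustrated with a minimal Python sketch. It is not FindMod's implementation: the names `ModRule`, `RULES` and `find_mods`, the 0.05 Da tolerance, and the three simplified residue rules are assumptions made for illustration, and the rule set covers only a fraction of the 22 modifications and 29 rules mentioned. The monoisotopic mass shifts are standard values; the peptide masses in the usage note are placeholders, not masses computed from the sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModRule:
    """Hypothetical, simplified rule: a mass shift plus the residues allowed to carry it."""
    name: str
    delta: float          # monoisotopic mass shift in Da
    residues: frozenset   # one-letter codes of residues that may be modified


# Illustrative subset only; the real rules also consider terminal position and taxonomy.
RULES = [
    ModRule("phosphorylation", 79.96633, frozenset("STY")),
    ModRule("acetylation", 42.01057, frozenset("K")),
    ModRule("methylation", 14.01565, frozenset("KR")),
]


def find_mods(observed_mass, peptides, tolerance=0.05):
    """Return (sequence, modification, candidate positions) for each plausible match.

    peptides maps a peptide sequence to its theoretical (unmodified) mass in Da.
    A match needs a mass difference within `tolerance` of a rule's shift AND
    at least one residue in the peptide that the rule allows.
    """
    hits = []
    for seq, theoretical in peptides.items():
        diff = observed_mass - theoretical
        for rule in RULES:
            if abs(diff - rule.delta) <= tolerance:
                # 1-based positions of residues that could carry the modification
                sites = [i + 1 for i, aa in enumerate(seq) if aa in rule.residues]
                if sites:
                    hits.append((seq, rule.name, sites))
    return hits
```

For example, with placeholder theoretical masses `{"AASPEK": 1000.0, "GGLLVR": 1200.0}`, an observed mass of 1079.966 Da matches the phosphorylation shift for `AASPEK` and points to the serine at position 3. An observed mass of 1279.966 Da yields no prediction, because `GGLLVR` contains no residue that this simplified rule allows to be phosphorylated.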