Mass spectrometry (MS)-based methods such as covalent labeling MS are increasingly used to obtain information about protein structure. In contrast to high-resolution structure determination methods, however, this information is not sufficient to deduce all atom coordinates and can only inform on certain elements of structure, such as solvent exposure of individual residues or the location of protein-protein interfaces. Translating this information easily and reliably into accurate structural models of proteins and protein complexes remains particularly challenging because standardized and automated methods are lacking. Computational methods are therefore needed to predict high-resolution protein structures from MS data. My group develops algorithms within the Rosetta software package that use MS data to guide protein structure prediction. We have implemented the use of MS covalent labeling data in Rosetta to predict the structures of proteins and protein complexes guided by the experimental data, with algorithms for both hydroxyl radical and diethylpyrocarbonate (DEPC) labeling data. To evaluate the effectiveness of our covalent labeling-guided structure prediction algorithms, we benchmarked their performance using proteins of known structure for which experimental covalent labeling data exists. For all benchmark proteins, higher-quality models were built when MS data was used in the scoring. For modeling with hydroxyl radical labeling data, we measured the exposure of labeled residues, while in the case of DEPC labeling we further assessed the local hydrophobic microenvironment of serine, threonine, and tyrosine residues. We are now extending this work to use MS data in deep learning-based predictions of protein structure.