Abstract

Background: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization.

Results: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results.

Conclusions: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.

Highlights

  • Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations

  • We present a method for proteoform identification by top-down mass spectrometry (MS) using homologous protein sequences when the species being studied lacks a proteome database

  • Two top-down tandem mass spectrometry (MS/MS) data sets were used to evaluate the performance of TopPIC and to assess how mutations in database protein sequences affect the sensitivity and accuracy of proteoform identification: the first was from Escherichia coli (EC) and the second was from MCF-7 cells


Summary

Introduction

Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. Over the past two decades, the dominant technology in proteomics studies has been bottom-up MS, in which long proteins are proteolytically digested during sample preparation. Database search is routinely used for spectral identification in top-down tandem mass spectrometry (MS/MS). In this approach, experimental MS/MS spectra are searched against theoretical spectra generated from database protein sequences to find high-scoring proteoform spectrum matches (PrSMs). A top-down MS/MS spectrum is difficult to identify by database search if the proteoform that produced it contains many alterations compared with the database sequence.
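The database search idea described above can be illustrated with a small sketch. It is not TopPIC's algorithm or scoring function. It is a toy model that assumes the spectrum has already been deconvoluted to neutral monoisotopic fragment masses, scores a candidate sequence simply by counting matched prefix and suffix masses, and ignores modifications and mass shifts. The function names (`prefix_masses`, `suffix_masses`, `count_matches`, `best_match`), the tolerance, and the example sequences are hypothetical and chosen only for illustration.

```python
# Toy database search: count fragment masses shared between an observed
# spectrum and the theoretical fragments of each database sequence.
# Assumption: masses are neutral monoisotopic values (deconvoluted spectrum).

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def prefix_masses(sequence):
    """Masses of all proper N-terminal prefixes (sum of residue masses)."""
    masses, total = [], 0.0
    for aa in sequence[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def suffix_masses(sequence):
    """Masses of all proper C-terminal suffixes (residues plus one water)."""
    masses, total = [], WATER
    for aa in reversed(sequence[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def count_matches(observed, sequence, tol=0.01):
    """Number of observed masses within `tol` Da of any theoretical mass."""
    theoretical = prefix_masses(sequence) + suffix_masses(sequence)
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_match(observed, database, tol=0.01):
    """Name of the database entry with the most matched fragment masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))


# Hypothetical example: an ideal spectrum of PEPTIDEK searched against the
# exact sequence and a homolog carrying one substitution (T -> S).
true_sequence = "PEPTIDEK"
observed = prefix_masses(true_sequence) + suffix_masses(true_sequence)
database = {"exact": "PEPTIDEK", "homolog": "PEPSIDEK"}

exact_score = count_matches(observed, database["exact"])
homolog_score = count_matches(observed, database["homolog"])
```

In this toy setting a single substitution in the database sequence removes every fragment match that spans the mutated residue, which is one way to see why mutations in a homologous database reduce identification sensitivity. Tools that allow unknown mass shifts can recover some of these matches by treating the substitution as a mass difference.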

