Abstract

Protein identification and proteome mapping rely mostly on the combination of tandem mass spectrometry and sequence database searching. Despite constant improvements in instrumentation, search algorithms, and genome annotations, little effort has been invested in estimating the impact of different genome annotation releases on the final results of a proteome study. We used a large dataset of mass spectra obtained with an Orbitrap LTQ XL instrument, covering different growth conditions of the model species Chlamydomonas reinhardtii. More than one million spectra were analyzed using the SEQUEST algorithm and four different databases corresponding to the major Chlamydomonas genome assemblies. In total, more than 3000 proteins and about 11,000 peptides were identified. A total of 238 proteins were detected exclusively with assembly 3.0, whereas 1222 proteins were missed with it and detectable only with other databases. The comparison demonstrates that database selection affects not only the number of identified proteins but also label-free quantitation and the biological interpretation of the results. Lists of protein accessions exclusively assigned to individual C. reinhardtii genome assemblies and annotations are provided as a resource for proteogenomic studies.
