Abstract

Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective, as the databases lack the sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. Here we present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. In an analysis of three related Cyanothece species, model organisms for biohydrogen production, spectral networks identified peptides with highly divergent sequences in networks containing dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected many novel putative peptides even in genomically characterized species, suggesting possible gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
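
As a rough illustration of step (1), the sketch below counts peaks shared between two spectra when some peaks are offset by a fixed mass difference, as might be expected for two peptides that differ by a single substitution or modification. It is a toy score under stated assumptions and is not the Align-GF algorithm; the function name `shared_peaks`, the 0.5 Da tolerance, and the peak lists are all hypothetical.

```python
def shared_peaks(spec_a, spec_b, shift, tol=0.5):
    """Count peaks of spec_a that have a partner in spec_b.

    A peak matches either at the same m/z or at m/z + shift,
    each within the tolerance `tol` (hypothetical default, in Da).
    """
    matched = 0
    for mz in spec_a:
        direct = any(abs(mz - other) <= tol for other in spec_b)
        shifted = any(abs(mz + shift - other) <= tol for other in spec_b)
        if direct or shifted:
            matched += 1
    return matched


# Hypothetical peak lists: the last two peaks of spec_b are offset by 14 Da.
spec_a = [100.0, 200.0, 300.0, 400.0]
spec_b = [100.0, 200.0, 314.0, 414.0]

with_shift = shared_peaks(spec_a, spec_b, shift=14.0)
without_shift = shared_peaks(spec_a, spec_b, shift=0.0)
```

In this toy case, allowing the offset recovers the peaks that a plain identical-peak comparison would miss, which is the intuition behind pairing spectra of variant peptides.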

Highlights

  • Microorganisms have evolved their cellular metabolism to generate energy for life in unusual environments [1], and their

  • We show that spectral networks can improve peptide identification by up to 38% compared with mainstream approaches, including many polymorphic and modified peptides

  • As is generally the case with organisms with poorly annotated genomes, proteomics analysis for various environmental isolates is challenging because of the lack of accurate and complete proteomes. We show how this proteomics challenge can be addressed for Cyanothece 51472 by using spectral networks of MS/MS data of related species (Cyanothece 8801 and Cyanothece 51142)


Summary

Technological Resources and Innovation

Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks

Poorly annotated or rapidly evolving genomes render tandem mass spectrometry (MS/MS) database search algorithms ineffective, as the databases lack the sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and (2) propagate peptide annotations from identified to unidentified spectra. This approach does not require advance knowledge of the genomes of all species, and it enables the identification of novel, polymorphic peptides via interspecies propagation. The diversity of biologically important protein families could be studied by comparing closely and more remotely related species.
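
The propagation step can be pictured as a graph traversal: spectra are nodes, significant spectral pairs are edges, and identified spectra act as seeds. The following is a minimal sketch assuming a breadth-first traversal. The function `propagate_annotations`, the node names, and the choice to record only the seed and the hop count are illustrative assumptions and not the published method, which also adjusts the peptide sequence for the mass difference along each edge.

```python
from collections import deque


def propagate_annotations(edges, seeds):
    """Breadth-first propagation through an undirected network.

    edges: iterable of (node_a, node_b) spectral pairs (hypothetical input).
    seeds: dict mapping identified nodes to their peptide strings.
    Returns a dict mapping each reachable node to (seed_node, hops).
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Seeds are their own origin at distance zero.
    origin = {node: (node, 0) for node in seeds}
    queue = deque(sorted(seeds))
    while queue:
        node = queue.popleft()
        source, hops = origin[node]
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in origin:  # the first (shortest) path wins
                origin[nxt] = (source, hops + 1)
                queue.append(nxt)
    return origin


# Hypothetical network: s1 is identified; s4 and s5 form a separate component.
edges = [("s1", "s2"), ("s2", "s3"), ("s4", "s5")]
seeds = {"s1": "PEPTIDEK"}
result = propagate_annotations(edges, seeds)
```

Here `s2` and `s3` inherit their annotation from `s1` at one and two hops, while `s4` and `s5` stay unannotated because no identified spectrum is connected to them. Tracking the hop count is one simple way to flag longer propagation chains, which would plausibly carry more risk of error.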
