Evaluation of bottom-up and top-down mass spectrum identifications with different customized protein sequences databases

Ziwei Li, Weixing Feng, Bo He

doi:10.1093/bioinformatics/btz733

Open Access

Journal: Bioinformatics	Publication Date: Oct 4, 2019
Citations: 5

Affiliation: Harbin Engineering University

Abstract

Bottom-up and top-down mass spectrometry (MS) are two complementary approaches to proteoform identification. Inferring proteoforms relies on searching mass spectra against an accurate proteoform sequence database. A customized protein sequence database derived from RNA-Seq data can better identify the proteoforms present in a studied species. However, the quality of the sequences in a customized database depends on the strategy used to construct it, and this affects the performance of MS identification. In addition, the identification performance of bottom-up and top-down MS with customized databases also needs to be evaluated. Three customized databases were constructed, each with a different strategy. Two were built by translating assembled transcripts, with or without genomic annotation, and the third is a variant-extending protein database. When tested separately with bottom-up and top-down MS data, the variant-extending protein database identified the largest number of spectra and also identified alleles expressed at the same time in diploid cells. An assembled database could identify spectra missed by the reference database and amino acid (AA) alterations present in the studied species. Experimental results demonstrated that the proteoform sequences in an annotated database are more suitable for identifying AA alterations and peptide sequences missing from the reference database. An unannotated database, used in place of a reference proteome database, achieves sufficiently high sensitivity in identifying mass spectra. The variant-extending reference database is the most sensitive for identifying mass spectra and single AA variants. Supplementary data are available at Bioinformatics online.
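
The idea behind a variant-extending protein database can be illustrated with a minimal sketch. This is not the authors' pipeline: the input format (protein ID, 1-based position, reference AA, alternate AA), the function names and the entry-naming scheme are all assumptions made for illustration. Each single AA variant is added as an extra entry beside the unchanged reference sequence, so that spectra from either allele can be matched.

```python
from typing import Dict, List, Tuple

# Hypothetical input format (an assumption, not taken from the paper):
# (protein_id, 1-based position, reference AA, alternate AA)
Variant = Tuple[str, int, str, str]


def extend_with_variants(reference: Dict[str, str],
                         variants: List[Variant]) -> Dict[str, str]:
    """Return the reference entries plus one extra entry per single-AA variant."""
    database = dict(reference)  # keep every reference sequence unchanged
    for protein_id, position, ref_aa, alt_aa in variants:
        sequence = reference.get(protein_id)
        if sequence is None or not 1 <= position <= len(sequence):
            continue  # variant does not map onto a known protein
        if sequence[position - 1] != ref_aa:
            continue  # reference residue disagrees; skip rather than guess
        variant_id = f"{protein_id}_{ref_aa}{position}{alt_aa}"
        database[variant_id] = sequence[:position - 1] + alt_aa + sequence[position:]
    return database


def to_fasta(database: Dict[str, str]) -> str:
    """Serialize the database as FASTA text, one record per entry."""
    return "".join(f">{name}\n{seq}\n" for name, seq in database.items())


# Toy data, invented for the example.
reference = {"P1": "MKTAYIAK"}
variants = [("P1", 4, "A", "V"),   # matches the reference residue, so it is added
            ("P1", 2, "R", "Q")]   # residue 2 is K, not R, so it is skipped
db = extend_with_variants(reference, variants)
print(to_fasta(db))
```

Keeping the reference entry next to each variant entry is what allows both alleles of a heterozygous site to be identified in the same search; a real database would also need to handle insertions, deletions and multiple variants per protein, which this sketch ignores.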

Full Text