The impact of noise and missing fragmentation cleavages on de novo peptide identification algorithms

Kevin Mcdonnell, Enda Howley, Florence Abram

doi:10.1016/j.csbj.2022.03.008

Abstract

Proteomics aims to characterise system-wide protein expression and typically relies on mass spectrometry and peptide fragmentation, followed by a database search for protein identification. It has wide-ranging applications, from clinical to environmental settings, and impacts virtually every area of biology. In that context, de novo peptide sequencing is becoming increasingly popular. Historically, its performance lagged behind database search methods, but with the integration of machine learning this field of research is gaining momentum. To enable de novo peptide sequencing to realise its full potential, it is critical to explore the mass spectrometry data underpinning peptide identification. In this research, we investigate the characteristics of tandem mass spectra using 8 published datasets. We then evaluate two state-of-the-art de novo peptide sequencing algorithms, Novor and DeepNovo, with a particular focus on their performance with regard to missing fragmentation cleavage sites and noise. DeepNovo was found to perform better than Novor overall. However, Novor recalled more correct amino acids when 6 or more cleavage sites were missing. Furthermore, less than 11% of each algorithm's correct peptide predictions emanate from data with more than one missing cleavage site, highlighting the issues that missing cleavages pose. We further investigate how the algorithms manage to correctly identify peptides with many of these missing fragmentation cleavages. We show how noise negatively impacts the performance of both algorithms when high-intensity peaks are considered. Finally, we provide recommendations for further algorithmic improvements and offer potential avenues to overcome current inherent data limitations.
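To make the notion of a missing fragmentation cleavage site concrete, the following is a minimal Python sketch, not the authors' method. It assumes that a cleavage site counts as observed when either its singly charged b ion or y ion matches a peak within a fixed tolerance. The function names, the 0.02 Da tolerance, and the restriction to singly charged b/y ions are all illustrative assumptions; the criteria used in the paper may differ.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_mz(peptide):
    """Return one (b_mz, y_mz) pair per cleavage site, singly charged ions only."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    sites = []
    prefix = 0.0
    for m in masses[:-1]:  # a peptide of length n has n - 1 cleavage sites
        prefix += m
        b_mz = prefix + PROTON
        y_mz = (total - prefix) + WATER + PROTON
        sites.append((b_mz, y_mz))
    return sites


def missing_cleavages(peptide, peaks_mz, tol=0.02):
    """Return 1-based cleavage sites with neither a b nor a y ion among the peaks.

    Assumption (hypothetical criterion): a site is 'observed' if any peak lies
    within `tol` Da of its b or y ion m/z.
    """
    def observed(mz):
        return any(abs(mz - p) <= tol for p in peaks_mz)

    return [
        site
        for site, (b_mz, y_mz) in enumerate(fragment_mz(peptide), start=1)
        if not (observed(b_mz) or observed(y_mz))
    ]


# Usage: simulate a spectrum for PEPTIDE that lacks both ions for site 3
# and contains one unrelated noise peak.
peptide = "PEPTIDE"
peaks = [500.0]  # noise peak that matches no theoretical ion
for site, (b_mz, y_mz) in enumerate(fragment_mz(peptide), start=1):
    if site != 3:
        peaks.extend([b_mz, y_mz])

missing = missing_cleavages(peptide, peaks)
print(missing)
```

In this toy spectrum only site 3 is reported as missing. A de novo algorithm facing such a gap must infer two adjacent residues from their combined mass alone, which is why spectra with several missing sites are harder to sequence.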

Journal: Computational and Structural Biotechnology Journal
Publication Date: Jan 1, 2022
Citations: 9
License type: cc-by-nc-nd
