Amino acid sequence assignment from single molecule peptide sequencing data using a two-stage classifier.

Matthew Beauregard Smith, Zack Booth Simpson, Edward M Marcotte

doi:10.1371/journal.pcbi.1011157

Open Access

https://doi.org/10.1371/journal.pcbi.1011157

Journal: PLOS Computational Biology
Publication Date: May 30, 2023
Citations: 5
License type: CC BY 4.0

Affiliation: The University of Texas at Austin

Abstract

We present a machine learning-based interpretive framework (whatprot) for analyzing single molecule protein sequencing data produced by fluorosequencing, a recently developed proteomics technology that determines sparse amino acid sequences for many individual peptide molecules in a highly parallelized fashion. Whatprot uses Hidden Markov Models (HMMs) to represent the states of each peptide as it undergoes the chemical processes of fluorosequencing, and applies these models in a Bayesian classifier, with pre-filtering by a k-Nearest Neighbors (kNN) classifier trained on large volumes of simulated fluorosequencing data. Combining the HMM-based Bayesian classifier with the kNN pre-filter retains the benefits of both: it achieves tractable runtimes together with acceptable precision and recall for identifying peptides and their parent proteins from complex mixtures, outperforming either classifier on its own. Whatprot's hybrid kNN-HMM approach enables efficient interpretation of fluorosequencing data against a full proteome reference database and should also enable improved estimates of sequencing error rates.
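
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not whatprot's implementation. The peptide names, dye positions, error parameters, and every function name below are hypothetical, and the HMM is heavily simplified: it models only Edman cycle failure with Gaussian intensity noise, whereas the abstract refers more broadly to "the various chemical processes" of fluorosequencing. The sketch assumes a uniform prior over candidates. A kNN search over simulated tracks proposes a short candidate list, and the HMM forward algorithm then scores only those candidates.

```python
import math
import random

# Hypothetical reference set: peptide name -> 0-based positions carrying a dye.
PEPTIDES = {
    "pepA": [0, 2],
    "pepB": [1, 3],
    "pepC": [2],
    "pepD": [0, 1, 3],
}
N_CYCLES = 4   # Edman cycles after the initial image (assumed)
P_FAIL = 0.1   # assumed probability that one Edman cycle removes nothing
SIGMA = 0.3    # assumed intensity noise, in units of one dye


def dyes_left(positions, removed):
    """Number of dyes still attached after `removed` residues are cleaved."""
    return sum(1 for p in positions if p >= removed)


def simulate(positions, rng):
    """Draw one noisy intensity track: one value per image."""
    removed, track = 0, []
    for t in range(N_CYCLES + 1):
        if t > 0 and rng.random() >= P_FAIL:
            removed += 1  # this Edman cycle succeeded
        track.append(rng.gauss(dyes_left(positions, removed), SIGMA))
    return track


def gauss_pdf(x, mu):
    z = (x - mu) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2 * math.pi))


def hmm_likelihood(track, positions):
    """Forward algorithm; the hidden state is the number of residues removed."""
    n_states = N_CYCLES + 1
    alpha = [0.0] * n_states
    alpha[0] = gauss_pdf(track[0], dyes_left(positions, 0))
    for obs in track[1:]:
        new = [0.0] * n_states
        for s in range(n_states):
            stay = alpha[s] * P_FAIL
            move = alpha[s - 1] * (1 - P_FAIL) if s > 0 else 0.0
            new[s] = (stay + move) * gauss_pdf(obs, dyes_left(positions, s))
        alpha = new
    return sum(alpha)


def knn_candidates(track, training, k):
    """Stage 1: labels of the k simulated tracks closest to the observation."""
    nearest = sorted(training, key=lambda ex: math.dist(ex[0], track))[:k]
    return sorted({label for _, label in nearest})


def classify(track, training, k=10):
    """Stage 2: Bayesian posterior (uniform prior) over the kNN candidates."""
    candidates = knn_candidates(track, training, k)
    scores = {c: hmm_likelihood(track, PEPTIDES[c]) for c in candidates}
    total = sum(scores.values())
    if total == 0.0:
        return None, 0.0  # no candidate explains the track
    best = max(scores, key=scores.get)
    return best, scores[best] / total


rng = random.Random(0)
training = [
    (simulate(pos, rng), name)
    for name, pos in PEPTIDES.items()
    for _ in range(200)
]
observed = simulate(PEPTIDES["pepB"], rng)
best, prob = classify(observed, training)
print(best, round(prob, 3))
```

In this sketch the pre-filter limits how many HMM evaluations each observation needs, which is the runtime benefit the abstract attributes to the hybrid approach. The posterior is normalized over the candidates only, so a true peptide that the kNN stage misses cannot be recovered by the second stage.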
