Accurately Assigning Peptides to Spectra When Only a Subset of Peptides Are Relevant.

Andy Lin, Deanna L Plubell, William S Noble, Uri Keich

doi:10.1021/acs.jproteome.1c00483

Open Access

Abstract

The standard proteomics database search strategy involves searching spectra against a peptide database and estimating the false discovery rate (FDR) of the resulting set of peptide-spectrum matches. This protocol assumes that all peptides in the database are relevant to the hypothesis being investigated. However, in settings where researchers are interested in only a subset of peptides, alternative search and FDR control strategies are needed. Recently, two methods were proposed to address this problem: subset-search and all-sub. We show that both methods fail to control the FDR. For subset-search, this failure is due to the presence of "neighbor" peptides, which are defined as irrelevant peptides whose precursor mass and fragmentation spectrum are similar to those of a relevant peptide. Not considering neighbors compromises the FDR estimate because a spectrum generated by an irrelevant peptide can incorrectly match well to a relevant peptide. Therefore, we have developed a new method, "subset-neighbor search" (SNS), that accounts for neighbor peptides. We provide evidence that SNS controls the FDR when neighbors are present and that SNS outperforms group-FDR, the only other method that appears to control the FDR relative to a subset of relevant peptides.
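
To make the notion of a "neighbor" concrete, the following is a minimal sketch of how one might flag an irrelevant peptide as a neighbor of a relevant one. It is not the procedure used in the paper. The function names, the 10 ppm precursor tolerance, the fragment bin width, and the 0.25 shared-fragment threshold are all hypothetical choices made for illustration, and only singly charged b and y ions of unmodified peptides are considered.

```python
# Illustrative sketch only: names, tolerances, and thresholds are assumptions,
# not values taken from the paper.

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_bins(seq, bin_width=1.0005):
    """Binned m/z values of the singly charged b and y ions."""
    total = sum(RESIDUE_MASS[aa] for aa in seq)
    bins = set()
    prefix = 0.0
    for aa in seq[:-1]:  # one b/y pair per cleavage site
        prefix += RESIDUE_MASS[aa]
        b_mz = prefix + PROTON
        y_mz = (total - prefix) + WATER + PROTON
        bins.add(int(b_mz / bin_width))
        bins.add(int(y_mz / bin_width))
    return bins


def is_neighbor(relevant, irrelevant, mass_tol_ppm=10.0, min_shared=0.25):
    """True if the irrelevant peptide has a similar precursor mass and
    shares a sufficient fraction of binned fragment ions (hypothetical rule)."""
    m_rel = peptide_mass(relevant)
    m_irr = peptide_mass(irrelevant)
    if abs(m_rel - m_irr) / m_rel * 1e6 > mass_tol_ppm:
        return False
    f_rel = fragment_bins(relevant)
    f_irr = fragment_bins(irrelevant)
    if not f_rel or not f_irr:
        return False
    shared = len(f_rel & f_irr) / min(len(f_rel), len(f_irr))
    return shared >= min_shared
```

Under these assumptions, two peptides that differ only by an isoleucine/leucine substitution have identical precursor masses and fragment ions, so `is_neighbor("PEPTIDEK", "PEPTLDEK")` flags them as neighbors, whereas peptides with clearly different precursor masses are rejected at the first check.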