Endogenous peptides are an abundant and versatile class of biomolecules with vital roles in the nervous, endocrine, immune, and other systems. Mass spectrometry is a premier technique for identifying endogenous peptides, yet the field still lacks optimized computational resources for reliable analysis and interpretation of raw mass spectra. Current database search programs can produce discrepant results because the unique properties of endogenous peptides typically require specialized search considerations. Herein, we present a novel, high-throughput scoring algorithm for extracting and ranking conserved amino acid sequence motifs within any endogenous peptide database. Motifs are patterns conserved across organisms that represent sequence moieties crucial for biological functions, including the maintenance of homeostasis. MotifQuest, our novel motif database generation algorithm, is designed to work in partnership with EndoGenius, a program optimized for database searching of endogenous peptides that uses a motif database to capitalize on biological context when producing identifications. MotifQuest aims to rapidly build motif databases without any prior knowledge, a laborious task that is not possible with traditional sequence alignment resources. In this work, we illustrate how MotifQuest can extend EndoGenius' identification capabilities to other classes of endogenous peptides by showcasing its ability to identify antimicrobial peptides. Additionally, we discuss the potential of MotifQuest to extract motifs from a FASTA database file that can be further validated as new peptide drug candidates.