Abstract

Accurate genome annotation, the foundation of life science research in the genome era, is hampered by limited known gene models, nonstandard start codons, and the limited homology of annotated genes in other organisms. LysargiNase mirrors trypsin at the cleavage sites, providing the opportunity to identify peptides other than tryptic peptides. In this study, we used an in-house developed acetylated LysargiNase (Ac-LysargiNase), which has higher activity and stability, to supplement the widely used trypsin in a proteomic study of the non-pathogenic Mycolicibacterium smegmatis MC2 155. We identified 27,582 peptides from 3,844 annotated proteins and 332 novel genome search-specific peptides (GSSPs). Among these GSSPs, 88 peptides were annotated in another M. smegmatis genome database, and 41 were verified as novel peptides by predicted theoretical spectra and their corresponding 15N-labeling spectra. Further analysis revealed that 17 verified GSSPs corrected the N-termini of 13 annotated genes. The other 24 verified GSSPs helped identify 17 novel open reading frames (ORFs) missed in previously annotated M. smegmatis genomes. Among these novel ORFs, four encode relatively small proteins of fewer than 100 amino acid residues, and three were precisely identified with C-terminal peptides. Ac-LysargiNase helps with genome reannotation by identifying new genes and events in proteogenomic studies.

Significance

Correct genomic annotation is vital in the life sciences. Nonstandard start codons seriously hamper confirmation of the translation initiation sites (TISs) of an open reading frame (ORF), and unknown structural genes are easily missed in automated gene prediction. Although proteogenomics presents new avenues for validating gene expression and refining gene structure based on conventional tryptic peptides, determining the TISs and potential encoding genes remains complicated. Validation of TISs and encoding ORFs is therefore crucial and urgent. We recommend Ac-LysargiNase, a mirror enzyme of trypsin that can identify additional novel peptides for N-terminal correction and ORF identification.
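To illustrate what "mirror enzyme" means at the sequence level, the following minimal Python sketch performs an idealized in silico digestion. It assumes the textbook cleavage rules (trypsin cuts on the C-terminal side of K/R, LysargiNase on the N-terminal side), allows no missed cleavages, and ignores exceptions such as the proline rule for trypsin. The function name `digest` and the example sequence are hypothetical and are not taken from the study.

```python
import re


def digest(protein: str, enzyme: str) -> list[str]:
    """Idealized in silico digestion at every K or R (no missed cleavages).

    Simplifying assumptions (not from the study):
      trypsin     cleaves C-terminal to K/R, so peptides end with K/R
      lysarginase cleaves N-terminal to K/R, so peptides start with K/R
    """
    if enzyme == "trypsin":
        pattern = r"(?<=[KR])"   # zero-width split point just after K or R
    elif enzyme == "lysarginase":
        pattern = r"(?=[KR])"    # zero-width split point just before K or R
    else:
        raise ValueError(f"unknown enzyme: {enzyme}")
    # Splitting on an empty match requires Python 3.7+; drop empty edge pieces.
    return [pep for pep in re.split(pattern, protein) if pep]


# Hypothetical toy sequence, for illustration only.
toy = "MAKLSRGTKDE"
tryptic = digest(toy, "trypsin")
mirrored = digest(toy, "lysarginase")
print(tryptic)
print(mirrored)
```

In this toy case the two digests share no peptide, which is the property that lets the mirror enzyme contribute peptides that a trypsin-only search would not cover. In particular, the protein's N-terminal peptide differs between the two digests, which is relevant to locating translation initiation sites.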
