Abstract

Background: Although technical advances in genomics and proteomics research have yielded a better understanding of the coding capacity of a genome, one major challenge remaining is the identification of all expressed proteins, especially those less than 100 amino acids in length. Such information can be particularly relevant to human pathogens, such as Trypanosoma brucei, the causative agent of African trypanosomiasis, since it will provide further insight into the parasite biology and life cycle.

Results: Starting with 993 T. brucei transcripts, previously shown by RNA-Sequencing not to coincide with annotated coding sequences (CDS), homology searches revealed that 173 predicted short open reading frames in these transcripts are conserved across kinetoplastids, with 13 also conserved in representative eukaryotes. Mining mass spectrometry data sets revealed 42 transcripts encoding at least one matching peptide. RNAi-induced down-regulation of these 42 transcripts revealed seven to be essential in insect-form trypanosomes, with two also required for the bloodstream life cycle stage. To validate the specificity of the RNAi results, each lethal phenotype was rescued by co-expressing an RNAi-resistant construct of each corresponding CDS. These previously non-annotated essential small proteins localized to a variety of cell compartments, including the cell surface, mitochondria, nucleus and cytoplasm, suggesting the diverse biological roles they are likely to play in T. brucei. We also provide evidence that one of these small proteins is required for replicating the kinetoplast (mitochondrial) DNA.

Conclusions: Our studies highlight the presence and significance of small proteins in a protist and expose potential new targets to block the survival of trypanosomes in the insect vector and/or the mammalian host.

Highlights

  • Although technical advances in genomics and proteomics research have yielded a better understanding of the coding capacity of a genome, one major challenge remaining is the identification of all expressed proteins, especially those less than 100 amino acids in length

  • T. brucei transcripts encoding evolutionarily conserved potential small proteins: We previously published a single-nucleotide resolution genomic map of the T. brucei transcriptome, which included 1,114 transcripts not originating from annotated coding sequences (CDS) ([28]; original RNA-Seq data have been submitted to the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) at [32] under accession no

  • After a reexamination of this data set using the latest T. brucei genome annotation (GeneDB version 5, [34]), we excluded 39 and 10 transcripts coding for small nucleolar RNA (snoRNA) and annotated proteins larger than 300 amino acids, respectively, and added two novel transcripts coding for proteins identified by mass spectrometry (MS) data (Figure 1)


Summary

Introduction

Although technical advances in genomics and proteomics research have yielded a better understanding of the coding capacity of a genome, one major challenge remaining is the identification of all expressed proteins, especially those less than 100 amino acids in length. Such information can be relevant to human pathogens, such as Trypanosoma brucei, the causative agent of African trypanosomiasis, since it will provide further insight into the parasite biology and life cycle. A report on the mammalian small proteome by Frith et al. used FANTOM cDNA data to identify 1,240 potential sORFs using the CRITICA gene-detection program [17]. In Saccharomyces cerevisiae, 140 small proteins were tested by generating gene deletions, and 22 had an effect on growth under various conditions [19], whereas overexpression of 473 small proteins in Arabidopsis resulted in 49 recognizable phenotypes [20].

