Abstract
Over the last few years, evidence has accumulated for the de novo emergence of protein-coding genes, i.e. their origin from non-coding DNA. Here, we review the current literature and summarize the state of the field. We focus specifically on open questions and challenges in the study of de novo protein-coding genes, such as the identification and verification of de novo-emerged genes. The greatest obstacle to date is the lack of high-quality genomic data from species with very short divergence times, which could help to pin down precisely where a de novo gene originated. We conclude that, while there is plenty of evidence from a genetics perspective, there is a lack of functional studies of bona fide de novo genes and almost no knowledge about protein structures and how they come about during the emergence of de novo protein-coding genes. We suggest that future studies should concentrate on the functional and structural characterization of de novo protein-coding genes as well as on the detailed study of how functional de novo protein-coding genes emerge.
Highlights
The question of how new genes come about has been a major research theme in evolutionary biology since the discovery that different species’ genomes contain varying numbers of genes
In recent years, an increasing number of studies have confirmed a major role of de novo gene emergence in the evolution of new protein-coding genes
The functional description of de novo-emerged genes is still lacking, but more general findings for orphan genes suggest that novel genes have a broad functional potential
Summary
The question of how new genes come about has been a major research theme in evolutionary biology since the discovery that different species’ genomes contain varying numbers of genes. If most confirmed de novo genes are folding, but most intergenic ORFs do not possess folding potential, folding potential would be a bottleneck of de novo protein-coding gene emergence and retention. Another unsolved problem is how to find specific annotation thresholds for orphans/de novo genes[4]. Recent research has already shown that small ORFs (smORFs) can play a functional role[62,63], and it seems quite likely that very short novel ORFs could be functional. This question touches upon the problem of differentiating lncRNAs from protein-coding genes, which is often performed via an ORF length cutoff[17,32]. Two closely related questions are how and when de novo proteins gain their function: are de novo genes usually functional from the time point of their emergence, or do they gain a cellular task only after a period of drift?
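To make the ORF length cutoff concrete, the following is a minimal Python sketch, not a method taken from the reviewed studies. It scans the three forward reading frames of a DNA sequence for ORFs (ATG to the next in-frame stop codon) and keeps only those with at least a given number of codons. The function name `find_orfs`, the parameter `min_codons`, and the toy sequence are all hypothetical and chosen for illustration; the source does not state a specific cutoff value.

```python
# Illustrative sketch only: names and thresholds are assumptions, not from the source.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=0):
    """Return (start, end, n_codons) for forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon. `end` is exclusive and includes the stop codon;
    `n_codons` counts codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence with a 2-codon ORF (frame 0) and a 6-codon ORF (frame 2).
toy = "ATGAAATAG" + "CC" + "ATG" + "GCT" * 5 + "TGA"
all_orfs = find_orfs(toy)
long_orfs = find_orfs(toy, min_codons=5)  # hypothetical cutoff
```

With a cutoff of this kind, the shorter ORF is discarded even though it is a complete reading frame, which illustrates why a fixed length threshold can cause short but potentially functional novel ORFs to be classified as non-coding.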