Abstract
A small, phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time: the protein genes are the earliest, as they are present in the zebrafish and/or mouse genomes, and the long noncoding RNA genes and pseudogenes are the most recent, as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at the evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically in the function of ubiquitin-specific protease 18 (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and the genesis of the genes that contain it is presented.
Highlights
The genesis of genes has been a major topic of interest for several decades [1,2].
Duplication of existing genes is considered one of the major processes in protein gene development, but gene birth from noncoding DNA via de novo processes has also been shown to be prevalent [4,5,6,7]; this pathway contributes significantly to new protein gene formation [4,7].
Working with genomic segments of the yeast Saccharomyces cerevisiae, Carvunis et al. [4] formulated an evolutionary model for the de novo development of protein genes in genetic regions that have no annotated genes but where small open reading frames are translated.
Summary
The genesis of genes has been a major topic of interest for several decades [1,2]. One mechanism of gene formation is the duplication of existing genes [1,3]. With respect to long noncoding RNA (lncRNA) genes, Ulitsky and Bartel [8] have provided a comprehensive background on lncRNA transcripts and genes, including a discussion of the mechanisms of lncRNA gene origin, where some gene birth processes may be similar to those that operate in protein gene formation. In this treatise, we analyze the development of long intergenic noncoding RNA (lincRNA) genes and pseudogenes from an evolutionarily conserved ancestral sequence. We previously proposed that FAM247 carries the information to form a nucleation site for gene development [9]. This is best exemplified by the formation of pseudogenes through the addition of other, unrelated genomic sequence to form the final gene sequence. Both the pseudogenes and the FAM247A-D appear to be human-specific.