Abstract

Pathways leading to the formation of non-coding RNA and protein genes are varied and complex. We report finding a conserved repeat sequence present in human and chimpanzee genomes that appears to have originated from a common primate ancestor. This sequence is repeatedly copied in human chromosome 22 (chr22) low copy repeats (LCR22), or segmental duplications, and forms twenty-one different genes, which include the human long intergenic non-coding RNA (lincRNA) family FAM230, a newly discovered lincRNA gene family termed conserved long intergenic non-coding RNAs (clincRNA), pseudogene families, as well as the gamma-glutamyltransferase (GGT) protein gene family and the RNA pseudogenes that originate from GGT sequences. Of particular interest are the GGT5 and USP18 protein genes, which appear to have formed from a homologous repeat sequence that also forms the clincRNA gene family. The data point to ancestral DNA sequences, conserved through evolution and duplicated in humans by chromosomal repeat sequences, that may serve as functional genomic elements in the development of diverse genes.

Highlights

  • Models presented for the pathways in formation of genes are diverse [1]. These include formation of long non-coding RNA genes from protein genes [2,3,4,5], with one study based on similarities in open reading frames [5], and the reverse pathway of human protein gene formation from lncRNA genes that are found in rhesus macaque and chimpanzee [6]

  • The DNA repeat sequence was detected in human chr22 segmental duplications LCR22A and LCR22D while analyzing the FAM230 long intergenic non-coding RNA (lincRNA) family genes [19]

  • The proposed ancestral proto-gene-forming element is based on the findings that GGT, ubiquitin specific peptidase (USP), and the three distinct linked DNA sequences (FAM230, clincRNA, and spacer) are conserved among humans, chimpanzees, and other primates, and that different genes have formed from these sequences



Introduction

Models presented for the pathways in formation of genes are diverse [1]. These include formation of long non-coding RNA (lncRNA) genes from protein genes [2,3,4,5], with one study based on similarities in open reading frames [5], and the reverse pathway of human protein gene formation from lncRNA genes that are found in rhesus macaque and chimpanzee [6]. Chr22 has the largest number of segmental duplications per unit chromosomal length of any human chromosome [7]. These duplications are dynamic [8]. Several may have arisen after the separation of human and macaque lineages [9] and they are continuously evolving, as shown

