Abstract

FAM230C, a long intergenic non-coding RNA (lincRNA) gene on human chromosome 13 (chr13), is a member of a group of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine the properties of FAM230C and its related genes. The results reveal that a DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence that was duplicated in chr22, and they reside specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes.

Highlights

  • Long non-coding RNA genes make up a major portion of the human genome [1] and tens of thousands of lncRNA transcripts have been detected [2, 3]

  • We analyzed a family of lncRNA genes related to the long intergenic non-coding RNA gene FAM230C and address gene composition and origins

  • Central to variability and expansion of AT-rich sequences in chromosome 22 is the repeat element Translocation Breakpoint Type A (NCBI GenBank accession AB261997.1) [13, 16]. This element carries a complex combination of diverse motifs: two partial copies of a human satellite I (HSATI) sequence, two copies of a fragment of an Alu sequence similar to subspecies AluYm, two redundant AT-rich sequences termed 1 and 2, and a palindromic AT-rich repeat (PATRR), the palindromic translocation breakpoint hot spot sequence (Fig 1)

Introduction

Long non-coding RNA (lncRNA) genes make up a major portion of the human genome [1], and tens of thousands of lncRNA transcripts have been detected [2, 3]. There has been a major effort to characterize and understand the origin and historical lineage of this genetic information. Characterizing this large number of genes and transcripts is daunting, but significant progress has been made (see references [4,5,6,7] for a partial list). We analyzed a family of lncRNA genes related to the long intergenic non-coding RNA (lincRNA) gene FAM230C and addressed their composition and origins.
