Abstract

This study investigated the characteristics of human and mouse orthologous gene sequences with large variations in G+C content. Orthologous gene pairs were classified into two groups according to the deviation between human and mouse G+C content at the third codon position (GC3): in one group, mouse genes had higher GC3 than the corresponding human genes, and in the other, human genes had higher GC3 than the corresponding mouse genes. The pairs were further separated according to the deviation between human or mouse GC3 and the G+C content at the third codon position of identical codons (IC3), to examine the effect of increased or decreased G+C content in the human or mouse sequences. The nucleotide substitution patterns between human and mouse sequences differed markedly between the two groups and were consistent with the G+C-rich or G+C-poor state of the sequences. By contrast, the effect of an increase or decrease in G+C content in human or mouse sequences was not clear in the nucleotide substitution patterns. The chromosomal locations of the orthologous gene pairs also differed between the two groups. Genes located on the same syntenic segment tended to have similar G+C content. Moreover, some genes kept the same gene order on different chromosomes in the two species, indicating gene rearrangements between human and mouse. Our results indicate that chromosomal location and rearrangement are associated with GC3 variation between human and mouse sequences.
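
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the group labels, and the `min_diff` threshold are hypothetical, and the sketch assumes each orthologous pair is supplied as in-frame, gap-free, codon-aligned coding sequences of equal length.

```python
def codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc3(cds):
    """Fraction of G or C at the third position of each codon."""
    thirds = [c[2] for c in codons(cds)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def ic3(human_cds, mouse_cds):
    """GC3 restricted to codons that are identical in both sequences.

    Assumption: codon i of the human sequence is aligned to codon i of
    the mouse sequence.
    """
    same = [h for h, m in zip(codons(human_cds), codons(mouse_cds)) if h == m]
    if not same:
        raise ValueError("no identical codons in this pair")
    return sum(c[2] in "GC" for c in same) / len(same)


def classify(human_cds, mouse_cds, min_diff=0.0):
    """Assign a pair to a group by the sign of the GC3 deviation.

    min_diff is a hypothetical cutoff for a "large" deviation; the
    abstract does not state the value used in the study.
    """
    diff = gc3(mouse_cds) - gc3(human_cds)
    if diff > min_diff:
        return "mouse_higher"
    if diff < -min_diff:
        return "human_higher"
    return "unclassified"


# Toy sequences for illustration only (not real genes).
human = "ATGGCAAAATTT"
mouse = "ATGGCCAAGTTT"
print(gc3(human), gc3(mouse), ic3(human, mouse), classify(human, mouse))
```

In this toy pair the two sequences share the codons ATG and TTT, so IC3 is computed from those two codons alone, while GC3 uses all four codons of each sequence.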

Highlights

  • Mammalian genomes consist of long DNA stretches with varying G+C content known as isochores [1]

  • By the definition of the classification, group 1.1, for example, admits two possible cases: (1) mouse GC3 is higher than IC3 and human GC3 is lower than IC3; (2) both mouse GC3 and human GC3 are higher than IC3

  • In both groups 1.1 and 1.2, most orthologous pairs showed mouse GC3 higher than IC3 and human GC3 lower than IC3

Summary

Introduction

Mammalian genomes consist of long DNA stretches (over hundreds of kilobases) with varying G+C content, known as isochores [1]. Studies have indicated that several homologous mammalian genes occupying different chromosomal positions differ considerably in their base composition and codon usage [1,2,9]. Genes with a high G+C content at the third codon position, or high GC3 (>80% G+C), are almost always surrounded by long G+C-rich (e.g. 55-65% G+C) genomic sequences, while those with a low GC3 […]. Many studies of the G+C content variations among mammals have been reported [12,13,14,15,16].

