Abstract
Pseudogenes are dead copies of genes. Owing to the absence of functional constraint, all nucleotide substitutions that occur in these sequences are selectively neutral, and thus represent the spontaneous pattern of substitution within a genome. Here, we analysed the patterns of nucleotide substitutions in Vitis vinifera processed pseudogenes. In total, 259 processed pseudogenes were used to compile two datasets of nucleotide substitutions. The ancestral states of polymorphic sites were determined based on either parsimony or site functional constraints. An overall tendency towards an increase in the pseudogene A:T content was suggested by all of the datasets analysed. Only a weak association was seen between the patterns and rates of substitutions and the compositional background of the region where the pseudogene was inserted. The flanking nucleotide significantly influenced the substitution rates. In particular, we noted that the G→A transition was influenced by the presence of C at the adjacent 5′ base. This finding is in agreement with the targeting of cytosine for methylation, and the consequent methyl-cytosine deamination. These data will be useful to interpret the roles of selection in shaping the genetic diversity of grape cultivars.
Highlights
Sequence diversity is generated by mutations that are transmitted across generations due to evolutionary forces
The extra nucleotides are important for the separation of duplicated from processed pseudogenes, as they allow the queries to align to the pseudo-intron–exon boundaries of duplicated pseudogenes
As processed pseudogenes are generated by genomic integration of a cDNA that is retro-transcribed from a spliced transcript, they are believed to be intronless
Summary
Sequence diversity is generated by mutations that are transmitted across generations due to evolutionary forces. The availability of sequence polymorphism data for a large number of individuals of one species can provide a wealth of information on the spectra and dynamics of nucleotide substitutions. Most such studies are conducted without any assessment of the selective constraints on the sites analysed, so it can be difficult to deduce how an identified substitution spectrum deviates from expectations constructed under the assumption of neutrality. Datasets of nucleotide polymorphisms generated by next-generation sequencing usually do not report the identity of the neighbouring (unchanged) nucleotides, and information on the nucleotide context in which a mutation has occurred is not readily accessible.
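As an illustration of what recording the nucleotide context involves, the following is a minimal Python sketch, not the authors' pipeline. It assumes that an inferred ancestral sequence and a derived sequence are already aligned to the same length, and it tallies each substitution together with the base at its 5′ side. The function names `substitution_spectrum` and `at_bias` and the toy sequences are hypothetical. Reading the 5′ neighbour from the ancestral sequence is itself a simplifying assumption.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def substitution_spectrum(ancestral, derived):
    """Tally substitutions between two aligned, equal-length sequences.

    Keys are (5' neighbour in the ancestral sequence, ancestral base,
    derived base). The first position is skipped because it has no
    5' neighbour within the alignment.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(1, len(ancestral)):
        flank, anc, der = ancestral[i - 1], ancestral[i], derived[i]
        # Skip unchanged sites, gaps and ambiguous bases.
        if anc == der or not {flank, anc, der} <= VALID_BASES:
            continue
        counts[(flank, anc, der)] += 1
    return counts


def at_bias(counts):
    """Return (number of G/C -> A/T changes, number of A/T -> G/C changes)."""
    to_at = sum(n for (_, anc, der), n in counts.items()
                if anc in "GC" and der in "AT")
    to_gc = sum(n for (_, anc, der), n in counts.items()
                if anc in "AT" and der in "GC")
    return to_at, to_gc


# Toy sequences, made up for illustration only.
spectrum = substitution_spectrum("ACGTCGGA", "ACATCAGA")
print(spectrum)
print(at_bias(spectrum))
```

In this toy alignment both changes are G→A transitions with C as the 5′ neighbour, which is the kind of context-dependent signal described in the abstract. A real analysis would also need to normalise the counts by how often each context occurs in the ancestral sequence.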