The near-complete genome assembly of allotetraploid Pennisetum purpureum ‘Purple’ reveals the genetic and epigenetic landscape of centromeres
Abstract
Drastic karyotype changes are a major evolutionary force, potentially involving centromere position, number, distribution, or strength alterations. Yet, the genetic and epigenetic landscape of centromeres, especially in allopolyploid plants during subgenome reshuffling, remains poorly understood. Here, we present a near-complete chromosome-scale genome assembly of the allotetraploid Pennisetum purpureum ‘Purple’, resolving all 14 centromeres. We find that subgenome-biased expansion of six LTR retrotransposons (LTR-RTs) drives architectural divergence between subgenomes. Centromeric satellite repeats (CentPs) show rapid sequence divergence across subgenomes and chromosomes, with CENH3 preferentially binding conserved higher-order repeats (HORs). Intriguingly, centromeric retrotransposons in Pennisetum (CRPs) are evolutionarily younger compared to their non-centromeric counterparts, coupled with marked subgenome B-biased amplification. Notably, CRP insertions flanking CentP satellites correlate with elevated satellite DNA polymorphism, supporting a model wherein CentP homogenization processes actively purge retrotransposons from centromeric arrays. Despite rapid sequence diversification of centromeric repeats, the epigenetic landscapes remain evolutionarily conserved in the centromeres of two subgenomes. Additionally, comparative analyses across Pennisetum species demonstrate rapid species- and chromosome-level turnover of CentPs and CRPs. Overall, our study illuminates the genetic and epigenetic plasticity of centromeres in allopolyploids, revealing how centromeric repeats adapt post-subgenome reshuffling.
- Research Article
41
- 10.1111/tpj.13309
- Nov 1, 2016
- The Plant Journal
Centromeric chromatin in most eukaryotes is composed of highly repetitive centromeric retrotransposons and satellite repeats that are highly variable even among closely related species. The evolutionary mechanisms that underlie the rapid evolution of centromeric repeats remain unknown. To obtain insight into the evolution of centromeric repeats following polyploidy, we studied a model diploid progenitor (Gossypium raimondii, D-genome) of the allopolyploid (AD-genome) cottons, G. hirsutum and G. barbadense. Sequence analysis of chromatin-immunoprecipitated DNA showed that the G. raimondii centromeric repeats originated from retrotransposon-related sequences. Comparative analysis showed that nine of the 10 analyzed centromeric repeats were absent from the centromeres in the A-genome and related diploid species (B-, F- and G-genomes), indicating that they colonized the centromeres of the D-genome lineage after the divergence of the A- and D- ancestral species or that they were ancestrally retained prior to the origin of Gossypium. Notably, six of the nine repeats were present in both the A- and D-subgenomes in tetraploid G. hirsutum, and increased in abundance in both subgenomes. This finding suggests that centromeric repeats may spread and proliferate between genomes subsequent to polyploidization. Two repeats, Gr334 and Gr359, occurred in both the centromeres and nucleolar organizer regions (NORs) in D- and AD-genome species, yet localized to just the NORs in A-, B-, F-, and G-genome species. Contained within is a story of an established centromeric repeat that is eliminated and allopolyploidization provides an opportunity for reinvasion and reestablishment, which broadens our evolutionary understanding behind the cycles of centromeric repeat establishment and targeting.
- Research Article
183
- 10.1371/journal.pgen.0010079
- Dec 1, 2005
- PLoS Genetics
Centromeres interact with the spindle apparatus to enable chromosome disjunction and typically contain thousands of tandemly arranged satellite repeats interspersed with retrotransposons. While their role has been obscure, centromeric repeats are epigenetically modified and centromere specification has a strong epigenetic component. In the yeast Schizosaccharomyces pombe, long heterochromatic repeats are transcribed and contribute to centromere function via RNA interference (RNAi). In the higher plant Arabidopsis thaliana, as in mammalian cells, centromeric satellite repeats are short (180 base pairs), are found in thousands of tandem copies, and are methylated. We have found transcripts from both strands of canonical, bulk Arabidopsis repeats. At least one subfamily of 180–base pair repeats is transcribed from only one strand and regulated by RNAi and histone modification. A second subfamily of repeats is also silenced, but silencing is lost on both strands in mutants in the CpG DNA methyltransferase MET1, the histone deacetylase HDA6/SIL1, or the chromatin remodeling ATPase DDM1. This regulation is due to transcription from Athila2 retrotransposons, which integrate in both orientations relative to the repeats, and differs between strains of Arabidopsis. Silencing lost in met1 or hda6 is reestablished in backcrosses to wild-type, but silencing lost in RNAi mutants and ddm1 is not. Twenty-four–nucleotide small interfering RNAs from centromeric repeats are retained in met1 and hda6, but not in ddm1, and may have a role in this epigenetic inheritance. Histone H3 lysine-9 dimethylation is associated with both classes of repeats. We propose roles for transcribed repeats in the epigenetic inheritance and evolution of centromeres.
- Research Article
- 10.1186/s13059-025-03924-9
- Jan 5, 2026
- Genome Biology
Centromere function is fundamental and conserved across eukaryotes, despite highly divergent DNA sequences, even among closely related species. These regions often contain rapidly evolving repeats and retrotransposons, yet play a crucial role in chromosome segregation. Soybean, which harbors two distinct types of centromeric satellite repeats, is an ideal model for studying centromeric repeat organization and function. Here we generate the complete map of centromeric satellite repeats revealing the organizational patterns of different types of centromeric satellite repeats within centromeres. These maps are constructed using three recently available telomere-to-telomere soybean genomes. We find that certain centromeric satellite repeats exhibit chromosome-specific evolutionary trajectories and may serve distinct functional roles in centromere activity. We further analyze the potential relationship between centromere-specific histones H3 (CENH3) and centromeric satellite repeats, identifying consensus motifs associated with CENH3-binding sites. We also analyze the higher-order tandem repeats of the centromere and propose a hypothetical model of centromeric DNA replication. We conclude that CentGm-1 and CentGm-4 evolve independently. The observation that completely identical CentGm-4 sequences consistently appear on the same chromosome across different soybean varieties indicates a stronger chromosome-specific preference for CentGm-4. We propose a model in which replication templates within the centromere region originate from multiple CENH3-nucleosome complexes bound to CentGm sequences. Both CentGm-1 and CentGm-4 contain similar motifs with the potential to bind CENH3 protein. The findings provide a new insight into the mechanisms behind centromere diversity and dynamics.
- Research Article
180
- 10.1073/pnas.0503863102
- Jul 22, 2005
- Proceedings of the National Academy of Sciences
The functional centromeres of rice (Oryza sativa, AA genome) chromosomes contain two key DNA components: the CRR centromeric retrotransposons and a 155-bp satellite repeat, CentO. However, several wild Oryza species lack the CentO repeat. We developed a chromatin immunoprecipitation-based technique to clone DNA fragments derived from chromatin containing the centromeric histone H3 variant CenH3. Chromatin immunoprecipitation cloning was carried out in the CentO-less species Oryza rhizomatis (CC genome) and Oryza brachyantha (FF genome). Three previously uncharacterized genome-specific satellite repeats, CentO-C1, CentO-C2, and CentO-F, were discovered in the centromeres of these two species. An 80-bp DNA region was found to be conserved in CentO-C1, CentO, and centromeric satellite repeats from maize and pearl millet, species which diverged from rice many millions of years ago. In contrast, the CentO-F repeat shows no sequence similarity to other centromeric repeats but has almost completely replaced other centromeric sequences in O. brachyantha, including the CRR-related sequences that normally constitute a significant fraction of the centromeric DNA in grass species.
- Research Article
44
- 10.1016/j.ygeno.2010.12.002
- Dec 13, 2010
- Genomics
Rapid divergence of repetitive DNAs in Brassica relatives
- Research Article
4
- 10.1089/aid.2022.0161
- Jun 7, 2023
- AIDS Research and Human Retroviruses
Decades of effort have yielded highly effective antiviral agents to treat HIV, but viral strains have evolved resistance to each inhibitor type, focusing attention on the importance of developing new inhibitor classes. A particularly promising new target is the HIV capsid, the function of which can be disrupted by highly potent inhibitors that persist long term in treated subjects. Studies with such inhibitors have contributed to an evolving picture of the role of capsid itself: the inhibitors, like certain capsid protein (CA) amino acid substitutions, can disrupt intracellular trafficking to alter the selection of target sites for HIV DNA integration in cellular chromosomes. In this study, we compare effects on HIV integration targeting for two potent inhibitors: a new molecule targeting CA, GSK878, and the previously studied lenacapavir (LEN, formerly known as GS-6207). We find that both inhibitors reduce integration in active transcription units and near epigenetic marks associated with active transcription. A careful study of integration near repeated sequences indicated frequencies were also altered for integration within multiple repeat classes. One notable finding was increased integration in centromeric satellite repeats in the presence of LEN and GSK878, which is of interest because proviruses integrated in centromeric repeats have been associated with transcriptional repression, inducibility, and latency. These data add to the picture that CA protein remains associated with preintegration complexes through the point in infection during which target sites for integration are selected, and specify new aspects of the consequences of disrupting this mechanism.
- Research Article
107
- 10.1186/1471-2164-14-142
- Mar 4, 2013
- BMC Genomics
Background: Tandem repeats are ubiquitous and abundant in higher eukaryotic genomes and constitute, along with transposable elements, much of DNA underlying centromeres and other heterochromatic domains. In maize, centromeric satellite repeat (CentC) and centromeric retrotransposons (CR), a class of Ty3/gypsy retrotransposons, are enriched at centromeres. Some satellite repeats have homology to retrotransposons and several mechanisms have been proposed to explain the expansion, contraction as well as homogenization of tandem repeats. However, the origin and evolution of tandem repeat loci remain largely unknown. Results: CRM1TR and CRM4TR are novel tandem repeats that we show to be entirely derived from CR elements belonging to two different subfamilies, CRM1 and CRM4. Although these tandem repeats clearly originated in at least two separate events, they are derived from similar regions of their respective parent element, namely the long terminal repeat (LTR) and untranslated region (UTR). The 5′ ends of the monomer repeat units of CRM1TR and CRM4TR map to different locations within their respective LTRs, while their 3′ ends map to the same relative position within a conserved region of their UTRs. Based on the insertion times of heterologous retrotransposons that have inserted into these tandem repeats, amplification of the repeats is estimated to have begun at least ~4 (CRM1TR) and ~1 (CRM4TR) million years ago. Distinct CRM1TR sequence variants occupy the two CRM1TR loci, indicating that there is little or no movement of repeats between loci, even though they are separated by only ~1.4 Mb. Conclusions: The discovery of two novel retrotransposon derived tandem repeats supports the conclusions from earlier studies that retrotransposons can give rise to tandem repeats in eukaryotic genomes. Analysis of monomers from two different CRM1TR loci shows that gene conversion is the major cause of sequence variation. We propose that successive intrastrand deletions generated the initial repeat structure, and gene conversions increased the size of each tandem repeat locus.
- Research Article
13
- 10.1038/s41467-022-29097-8
- Mar 18, 2022
- Nature Communications
Retroviruses utilize the viral integrase (IN) protein to integrate a DNA copy of their genome into host chromosomal DNA. HIV-1 integration sites are highly biased towards actively transcribed genes, likely mediated by binding of the IN protein to specific host factors, particularly LEDGF, located at these gene regions. We here report a substantial redirection of integration site distribution induced by a single point mutation in HIV-1 IN. Viruses carrying the K258R IN mutation exhibit a high frequency of integrations into centromeric alpha satellite repeat sequences, as assessed by deep sequencing, a more than 10-fold increase over wild-type. Quantitative PCR and in situ immunofluorescence assays confirm this bias of the K258R mutant virus for integration into centromeric DNA. Immunoprecipitation studies identify host factors binding to IN that may account for the observed bias for integration into centromeres. Centromeric integration events are known to be enriched in the latent reservoir of infected memory T cells, as well as in elite controllers who limit viral replication without intervention. The K258R point mutation in HIV-1 IN is also present in databases of latent proviruses found in patients, and may reflect an unappreciated aspect of the establishment of viral latency.
- Research Article
89
- 10.1038/nature09608
- Dec 12, 2010
- Nature
Centromere-binding protein B (CENP-B) is a widely conserved DNA binding factor associated with heterochromatin and centromeric satellite repeats [1]. In fission yeast, CENP-B homologs have been shown to silence Long Terminal Repeat (LTR) retrotransposons by recruiting histone deacetylases [2]. However, CENP-B factors also have unexplained roles in DNA replication [3, 4]. Here, we show that a molecular function of CENP-B is to promote replication fork progression through the LTR. Mutants have increased genomic instability caused by replication fork blockage that depends on the DNA binding factor Switch Activating Protein 1 (Sap1), which is directly recruited by the LTR. The loss of Sap1-dependent barrier activity allows the unhindered progression of the replication fork, but results in rearrangements deleterious to the retrotransposon. We conclude that retrotransposons influence replication polarity through recruitment of Sap1 and transposition near replication fork blocks, while CENP-B counteracts this activity and promotes fork stability. Our results may account for the role of LTR in fragile sites, and for the association of CENP-B with pericentromeric heterochromatin and tandem satellite repeats.
- Research Article
12
- 10.1007/s003359901048
- Jun 1, 1999
- Mammalian Genome
The centromeric region of swine chromosomes is comprised of tandemly repeated, divergent DNA monomer units. Here we report that these divergent DNA monomer sequences are organized into higher-order repeats, analogous to the hierarchical organization of alpha-satellite monomers in human centromeres. In this study, a centromeric cosmid clone was shown to be comprised entirely of a 3.3-kb higher-order repeat, with independent copies of this higher-order repeat more than 99% identical to each other. This higher-order repeat is composed of ten divergent monomer units of approximately 340 bp. The ten monomers are on average 79% identical, and all ten monomers are arranged in the same 5' to 3' orientation. In FISH analysis, a cloned 3.3-kb higher-order repeat hybridized to the centromere of Chromosome (Chr) 9 in metaphase spreads and detected two discrete foci in interphase nuclei, demonstrating that this swine higher-order repeat is chromosome-specific. The Chr 9 centromeric array spanned approximately 2.2 Mb as determined by pulsed-field gel electrophoresis. Moreover, the swine Chr 9 centromere is highly polymorphic, because an EcoRI restriction site polymorphism was detected. Thus, the assembly of divergent satellite sequences into chromosome-specific higher-order repeats appears to be a common organizational feature of both the human and swine centromere and suggests that the evolutionary mechanism(s) that create and maintain higher-order repeats is conserved between their genomes.
- Research Article
27
- 10.1534/g3.115.024984
- Feb 9, 2016
- G3: Genes|Genomes|Genetics
Fluorescence in situ hybridization (FISH)-based karyotyping is a powerful cytogenetics tool to study chromosome organization, behavior, and chromosome evolution. Here, we developed a FISH-based karyotyping system using a probe mixture comprised of centromeric and subtelomeric satellite repeats, 5S rDNA, and chromosome-specific BAC clones in common bean, which enables one to unambiguously distinguish all 11 chromosome pairs. Furthermore, we applied the karyotyping system to several wild relatives and landraces of common bean from two distinct gene pools, as well as other related Phaseolus species, to investigate repeat evolution in the genus Phaseolus. Comparison of karyotype maps within common bean indicates that chromosomal distribution of the centromeric and subtelomeric satellite repeats is stable, whereas the copy number of the repeats was variable, indicating rapid amplification/reduction of the repeats in specific genomic regions. In Phaseolus species that diverged approximately 2–4 million yr ago, copy numbers of centromeric repeats were largely reduced or diverged, and chromosomal distributions have changed, suggesting rapid evolution of centromeric repeats. We also detected variation in the distribution pattern of subtelomeric repeats in Phaseolus species. The FISH-based karyotyping system revealed that satellite repeats are actively and rapidly evolving, forming genomic features unique to individual common bean accessions and Phaseolus species.
- Research Article
62
- 10.1093/molbev/msl127
- Sep 20, 2006
- Molecular Biology and Evolution
Satellite DNA is a major component of centromeric heterochromatin in most multicellular eukaryotes, where it is typically organized into megabase-sized tandem arrays. It has recently been demonstrated that small interfering RNAs (siRNAs) processed from centromeric satellite repeats can be involved in epigenetic chromatin modifications which appear to underpin centromere function. However, the structural organization and evolution of the centromeric satellite DNA is still poorly understood. We analyzed the centromeric satellite repeat arrays from rice chromosomes 1 and 8 and identified higher order structures and local homogenization of the CentO repeats in these 2 centromeres. We also cloned the CentO repeats from the CENH3-associated nucleosomes by a chromatin immunoprecipitation (ChIP)-based method. Sequence variability analysis of the ChIPed CentO repeats revealed a single variable domain within the repeat. We detected transcripts derived from both strands of the CentO repeats. The CentO transcripts are processed into siRNA, suggesting a potential role of this satellite repeat family in epigenetic chromatin modification.
- Research Article
1
- 10.1093/hr/uhaf244
- Sep 15, 2025
- Horticulture Research
Centromeres are essential for centromere-specific histone H3 (CENH3) recruitment and kinetochore assembly, ensuring accurate chromosome segregation and maintaining genome stability in plants. Although extensively studied in model species, the structural organization of centromeres in nonmodel plants, such as fruit trees, remains poorly explored. Our previous study revealed that jujube centromeres lack the typical tandem repeat (TR)-rich structure, complicating their precise identification. In this study, we updated the genome assembly of jujube (Ziziphus jujuba Mill. ‘Dongzao’) to a haplotype-resolved T2T version, enabling accurate mapping and comparison of centromeres between haplotypes using CENH3 ChIP-seq. These centromeres, ranging from 0.75 to 1.40 Mb, are largely conserved between haplotypes, except for a localized inversion on chromosome 10. Unlike the TR-rich centromeres found in many plant species, jujube centromeres are predominantly composed of Gypsy-type long-terminal repeat retrotransposons (LTR-RTs). Among these, we identified a centromere-enriched LTR family, centromeric retrotransposons of jujube (CRJ), which is particularly abundant in terminal LTRs compared to the internal transposon regions. Comparative analysis across plant species revealed that centromeric retrotransposons primarily fall into three subfamilies—CRM, Tekay, and Athila—highlighting strong subfamily specificity. Notably, early insertions of CRJ-derived LTR segments contributed to the formation of TR-like structures, suggesting a mechanistic link between transposable elements and the evolution of centromeric tandem repeats. This work provides the first in-depth characterization of a TE-dominated centromere architecture in a fruit tree, offering new insights into the diversity and evolution of plant centromeres.
- Book Chapter
- 10.1007/978-3-030-31005-9_5
- Jan 1, 2021
The release of the Brassica oleracea draft genome sequence opens numerous opportunities to understand its genome structure and evolution. A 515-Mb (82% of the total genome) high-quality draft assembly was made up of 56% repetitive elements (REs). Although the contribution of REs to genome structures, organization and evolution is relatively poorly understood, advances in bioinformatics have enabled genome-wide quantification and downstream analyses of REs in plant genomes. This chapter provides an overview of the classification, abundance, and genomic organization of the major types of REs that make up the main repeat component in the B. oleracea genome. Eight elements, namely 5S and 45S nrDNA, centromeric and subtelomeric tandem repeats (CentBo1, CentBo2, BoSTRa/b, and BoSTRc), a centromeric retrotransposon (BoCRB), and a Ty1/copia LTR retrotransposon (BoCopia-1), were classified into this repeat component. Whole-genome shotgun (WGS) mapping and molecular cytogenetic analyses provided an in-depth view of the abundance and distribution of these repeats both in the in silico generated draft assembly and mitotic metaphase chromosomes. The information not only validates the abundance of repeat elements in draft genomes, but also provides an avenue for understanding overall genome structure.
- Research Article
109
- 10.1073/pnas.1512255112
- Oct 21, 2015
- Proceedings of the National Academy of Sciences
Holocentric chromosomes lack a primary constriction, in contrast to monocentrics. They form kinetochores distributed along almost the entire poleward surface of the chromatids, to which spindle fibers attach. No centromere-specific DNA sequence has been found for any holocentric organism studied so far. It was proposed that centromeric repeats, typical for many monocentric species, could not occur in holocentrics, most likely because of differences in the centromere organization. Here we show that the holokinetic centromeres of the Cyperaceae Rhynchospora pubera are highly enriched by a centromeric histone H3 variant-interacting centromere-specific satellite family designated "Tyba" and by centromeric retrotransposons (i.e., CRRh) occurring as genome-wide interspersed arrays. Centromeric arrays vary in length from 3 to 16 kb and are intermingled with gene-coding sequences and transposable elements. We show that holocentromeres of metaphase chromosomes are composed of multiple centromeric units rather than possessing a diffuse organization, thus favoring the polycentric model. A cell-cycle-dependent shuffling of multiple centromeric units results in the formation of functional (poly)centromeres during mitosis. The genome-wide distribution of centromeric repeat arrays interspersing the euchromatin provides a previously unidentified type of centromeric chromatin organization among eukaryotes. Thus, different types of holocentromeres exist in different species, namely with and without centromeric repetitive sequences.