Evolutionary conservation and enhanced basal immunity of the ZmNBS gene family in maize
The nucleotide-binding site (NBS) gene family is central to plant innate immunity. However, a comprehensive understanding of its evolutionary dynamics and functional diversity in maize, particularly within a pan-genomic context, remains limited. We conducted a systematic pan-genomic analysis of the ZmNBS gene family across 26 representative maize inbred lines. Our approach integrated evolutionary genetics, structural variation analysis, and expression profiling to investigate presence-absence variation (PAV), duplication modes, evolutionary rates, and the impact of structural variants (SVs). We observed extensive PAV, distinguishing conserved “core” subgroups (ZmNBS31 and ZmNBS17-19) from highly variable ones (ZmNBS1-10 and ZmNBS43-60), thereby supporting a “core-adaptive” model of resistance gene evolution. Duplication mode analysis revealed subtype-specific preferences: canonical CNL/CN genes largely originated from dispersed duplications, while N-type genes were enriched in tandem duplications. Evolutionary rate analysis showed that whole-genome duplication (WGD)-derived genes exhibited strong purifying selection (low Ka/Ks), whereas tandem and proximal duplications (TD/PD) showed signs of relaxed or positive selection. SVs were associated with altered motif structures and significantly impacted gene expression. Notably, ZmNBS31 emerged as a conserved, highly expressed gene under both stressed and control conditions, underscoring its potential role in basal immunity. Our findings demonstrate how duplication mechanisms, structural variations, and differential selection pressures collectively shape the evolution of the ZmNBS gene family. The identification of ZmNBS31 as a candidate for basal immunity, along with our established “core-adaptive” framework, provides valuable insights and a conceptual foundation for identifying and improving broad-spectrum resistance genes in maize breeding programs.
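The core-versus-variable distinction drawn from PAV can be illustrated with a minimal sketch. It assumes a presence/absence call per gene per inbred line and labels a gene "core" only when it is present in every line; the function name, the threshold, and the toy gene names are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def classify_by_pav(presence: Dict[str, List[bool]],
                    core_threshold: float = 1.0) -> Dict[str, str]:
    """Label each gene 'core' or 'variable' from its presence calls.

    presence maps a gene name to one boolean per genome (True = present).
    core_threshold is the minimum fraction of genomes that must carry the
    gene for it to count as core (an assumed cutoff, 1.0 = all genomes).
    """
    labels: Dict[str, str] = {}
    for gene, calls in presence.items():
        if not calls:
            raise ValueError(f"no presence calls for {gene}")
        frequency = sum(calls) / len(calls)
        labels[gene] = "core" if frequency >= core_threshold else "variable"
    return labels


# Toy data for 26 genomes; names and patterns are invented for illustration.
toy_presence = {
    "example_gene_core": [True] * 26,
    "example_gene_variable": [True] * 9 + [False] * 17,
}
print(classify_by_pav(toy_presence))
```

A looser cutoff (for example, present in most but not all lines) is a common alternative; the choice changes which genes fall into each class and would need to match whatever definition the analysis actually used.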
- Research Article
1
- 10.1093/plcell/koaf243
- Oct 8, 2025
- The Plant cell
Gene and genome duplications expand genetic repertoires and facilitate functional innovation. Segmental or whole-genome duplications generate duplicates with similar and somewhat redundant expression profiles across multiple tissues, while other modes of duplication create genes that show increased divergence, leading to functional innovations. How duplicates diverge in expression across cell types in a single tissue remains elusive. Here, we used high-resolution spatial transcriptomic data from Arabidopsis thaliana, Glycine max, Phalaenopsis aphrodite, Zea mays, and Hordeum vulgare to investigate the evolution of gene expression following gene duplication. We found that genes originating from segmental or whole-genome duplications display increased expression levels, expression breadths, spatial variability, and number of coexpression partners. Duplication mechanisms that preserve cis-regulatory landscapes typically generate paralogs with more preserved expression profiles, but such differences generated by mode of duplication fade or disappear over time. Paralogs originating from large-scale (including whole-genome) duplications display redundant or overlapping expression profiles, indicating functional redundancy or subfunctionalization, while most small-scale duplicates diverge asymmetrically, consistent with neofunctionalization. Expression divergence also depends on gene functions, with dosage-sensitive genes displaying highly preserved expression profiles and genes involved in more specialized processes diverging more rapidly. Our findings offer a spatially resolved view of expression divergence following duplication, elucidating the tempo and mode of gene expression evolution, and helping understand how gene and genome duplications shape cell identities.
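Two of the quantities mentioned above, expression breadth and the similarity of paralog expression profiles, can be sketched as follows. This is only an illustration under stated assumptions: breadth is taken as the number of cell types in which a gene reaches a minimum expression value, and profile similarity as the Pearson correlation between two paralogs. The threshold, function names, and toy values are hypothetical, and the study's actual metrics may differ.

```python
from math import sqrt
from typing import Sequence


def expression_breadth(profile: Sequence[float], min_expr: float = 1.0) -> int:
    """Count the cell types in which expression is at least min_expr."""
    return sum(1 for value in profile if value >= min_expr)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Invented mean expression per cell type for one paralog pair.
paralog_a = [10.0, 8.0, 0.0, 0.0]
paralog_b = [9.0, 7.0, 1.0, 0.0]
print(expression_breadth(paralog_a), expression_breadth(paralog_b))
print(round(pearson(paralog_a, paralog_b), 3))
```

A high correlation with similar breadth would be in line with preserved or redundant profiles, while a low correlation with one copy much narrower than the other would be in line with asymmetric divergence.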
- Research Article
792
- 10.1186/s13059-019-1650-2
- Feb 21, 2019
- Genome Biology
Background: The sharp increase of plant genome and transcriptome data provides valuable resources to investigate evolutionary consequences of gene duplication in a range of taxa, and unravel common principles underlying duplicate gene retention. Results: We survey 141 sequenced plant genomes to elucidate consequences of gene and genome duplication, processes central to the evolution of biodiversity. We develop a pipeline named DupGen_finder to identify different modes of gene duplication in plants. Genes derived from whole-genome, tandem, proximal, transposed, or dispersed duplication differ in abundance, selection pressure, expression divergence, and gene conversion rate among genomes. The number of WGD-derived duplicate genes decreases exponentially with increasing age of duplication events; transposed duplication- and dispersed duplication-derived genes declined in parallel. In contrast, the frequency of tandem and proximal duplications showed no significant decrease over time, providing a continuous supply of variants available for adaptation to continuously changing environments. Moreover, tandem and proximal duplicates experienced stronger selective pressure than genes formed by other modes and evolved toward biased functional roles involved in plant self-defense. The rate of gene conversion among WGD-derived gene pairs declined over time, peaking shortly after polyploidization. To provide a platform for accessing duplicated gene pairs in different plants, we constructed the Plant Duplicate Gene Database. Conclusions: We identify a comprehensive landscape of different modes of gene duplication across the plant kingdom by comparing 141 genomes, which provides a solid foundation for further investigation of the dynamic evolution of duplicate genes.
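The positional part of such a classification (tandem versus proximal) can be sketched in a few lines. This is not the DupGen_finder implementation: it assumes that a homologous pair is "tandem" when the two genes are adjacent on the same chromosome and "proximal" when they are separated by a small number of intervening genes, with the 10-gene cutoff and all names below chosen for illustration. Identifying WGD, transposed, or dispersed duplicates needs synteny and outgroup information that this sketch does not model, so everything else is returned as "other".

```python
from typing import NamedTuple


class Gene(NamedTuple):
    chrom: str  # chromosome or scaffold name
    rank: int   # ordinal position of the gene along that chromosome


def classify_pair(a: Gene, b: Gene, max_proximal_gap: int = 10) -> str:
    """Classify a homologous gene pair by relative position only."""
    if a.chrom != b.chrom:
        return "other"
    intervening = abs(a.rank - b.rank) - 1  # genes lying between the pair
    if intervening < 0:
        raise ValueError("a and b refer to the same gene position")
    if intervening == 0:
        return "tandem"
    if intervening <= max_proximal_gap:
        return "proximal"
    return "other"


print(classify_pair(Gene("chr1", 5), Gene("chr1", 6)))    # adjacent genes
print(classify_pair(Gene("chr1", 5), Gene("chr1", 10)))   # 4 genes between
print(classify_pair(Gene("chr1", 5), Gene("chr2", 6)))    # different chromosomes
```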
- Research Article
20
- 10.1371/journal.pone.0155637
- May 19, 2016
- PLOS ONE
Different modes of gene duplication including whole-genome duplication (WGD), and tandem, proximal and dispersed duplications are widespread in angiosperm genomes. Small-scale, stochastic gene relocations and transposed gene duplications are widely accepted to be the primary mechanisms for the creation of dispersed duplicates. However, here we show that most surviving ancient dispersed duplicates in core eudicots originated from large-scale gene relocations within a narrow window of time following a genome triplication (γ) event that occurred in the stem lineage of core eudicots. We name these surviving ancient dispersed duplicates as relocated γ duplicates. In Arabidopsis thaliana, relocated γ, WGD and single-gene duplicates have distinct features with regard to gene functions, essentiality, and protein interactions. Relative to γ duplicates, relocated γ duplicates have higher non-synonymous substitution rates, but comparable levels of expression and regulation divergence. Thus, relocated γ duplicates should be distinguished from WGD and single-gene duplicates for evolutionary investigations. Our results suggest large-scale gene relocations following the γ event were associated with the diversification of core eudicots.
- Research Article
145
- 10.1371/journal.pone.0028150
- Dec 2, 2011
- PLoS ONE
Background: Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored. Results: In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution. Conclusion: Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution.
- Research Article
98
- 10.1093/molbev/mss162
- Jul 13, 2012
- Molecular Biology and Evolution
Gene duplicates are a major source of evolutionary novelties in the form of new or specialized functions and play a key role in speciation. Gene duplicates are generated through whole genome duplications (WGD) or small-scale genome duplications (SSD). Although WGD preserves the stoichiometric relationships between duplicates, those arising from SSD are usually unbalanced and are expected to follow different evolutionary dynamics than those formed by WGD. To dissect the role of the mechanism of duplication in these differential dynamics and determine whether this role was shared across species, we performed a genome wide evolutionary analysis of gene duplications arising from the most recent WGD events and contemporary episodes of SSD in four model species representing distinct plant evolutionary lineages. We found an excess of relaxed purifying selection after duplication in SSD paralogs compared with WGD, most of which may have been the result of functional divergence events between gene copies as estimated by measures of genetic distances. These differences were significant in three angiosperm genomes but not in the moss species Physcomitrella patens. Although the comparison of models of evolution does not attribute a relevant role to the mechanism of duplication in the evolution of duplicates, the distribution of retained genes among Gene Ontology functional categories supports the conclusion that the evolution of gene duplicates depends on their origin of duplication (WGD and SSD) but, most importantly, on the species. Similar lineage-specific biases were also observed in protein network connectivity, translational efficiency, and selective constraints acting on synonymous codon usage.
Although the mechanism of duplication may determine gene retention, our results attribute a dominant role to the species in determining the ultimate pattern of duplicate gene retention and reveal an unanticipated complexity in the evolutionary dynamics and functional specialization of duplicated genes in plants.
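Selection on duplicate pairs is commonly summarized by the ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates. The sketch below shows the conventional reading of that ratio: well below 1 suggests purifying selection, near 1 suggests neutral or relaxed evolution, and above 1 suggests positive selection. The tolerance band around 1 and the function names are assumptions for illustration; real analyses rely on likelihood-based tests rather than a fixed cutoff.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ka < 0 or ks < 0:
        raise ValueError("substitution rates cannot be negative")
    if ks == 0:
        return None
    return ka / ks


def interpret(omega: Optional[float], tolerance: float = 0.1) -> str:
    """Map a Ka/Ks value to a coarse label (tolerance is an assumed band)."""
    if omega is None:
        return "undefined"
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


# Invented rates for two hypothetical gene pairs.
print(interpret(ka_ks_ratio(0.02, 0.20)))
print(interpret(ka_ks_ratio(0.30, 0.20)))
```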
- Research Article
59
- 10.1016/j.celrep.2012.09.034
- Nov 1, 2012
- Cell Reports
On the Expansion of “Dangerous” Gene Repertoires by Whole-Genome Duplications in Early Vertebrates
- Research Article
30
- 10.1186/s12864-016-3423-6
- Jan 6, 2017
- BMC Genomics
Background: All extant seed plants are successful paleopolyploids, whose genomes carry duplicate genes that have survived repeated episodes of diploidization. However, the survival of gene duplicates is biased with respect to gene function and mechanism of duplication. Transcription factors, in particular, are reported to be preferentially retained following whole-genome duplications (WGDs), but disproportionately lost when duplicated by tandem events. An explanation for this pattern is provided by the Gene Balance Hypothesis (GBH), which posits that duplicates of highly connected genes are retained following WGDs to maintain optimal stoichiometry among gene products; but such connected gene duplicates are disfavored following tandem duplications. Results: We used genomic data from 25 taxonomically diverse plant species to investigate the roles of duplication mechanism, gene function, and age of duplication in the retention of duplicate genes. Enrichment analyses were conducted to identify Gene Ontology (GO) functional categories that were overrepresented in either WGD or tandem duplications, or across ranges of divergence times. Tandem paralogs were much younger, on average, than WGD paralogs and the most frequently overrepresented GO categories were not shared between tandem and WGD paralogs. Transcription factors were overrepresented among ancient paralogs regardless of mechanism of origin or presence of a WGD. Also, in many cases, there was no bias toward transcription factor retention following recent WGDs. Conclusions: Both the fixation and the retention of duplicated genes in plant genomes are context-dependent events. The strong bias toward ancient transcription factor duplicates can be reconciled with the GBH if selection for optimal stoichiometry among gene products is strongest following the earliest polyploidization events and becomes increasingly relaxed as gene families expand.
- Preprint Article
- 10.1101/2025.05.27.656262
- May 27, 2025
Gene and genome duplications are key drivers of plant genome evolution, expanding genetic repertoires and facilitating functional innovation. However, genes originating from different duplication mechanisms undergo different outcomes, especially at the expression level. Previous studies have demonstrated that segmental or whole-genome duplications generate duplicates with similar and somewhat redundant expression profiles across multiple tissues, while other duplicates display increased divergence, ultimately leading to functional innovations. However, little is known about how duplicates diverge in expression across cell types in a single tissue. Here, we used high-resolution spatial transcriptomic data from five species (Arabidopsis thaliana, Glycine max, Phalaenopsis aphrodite, Zea mays, and Hordeum vulgare) to investigate the evolution of gene expression following gene duplications. We found that genes originating from segmental or whole-genome duplications display increased expression levels, expression breadths, spatial variability, and number of coexpression partners. Duplication mechanisms that preserve cis-regulatory landscapes typically generate paralogs with more preserved expression profiles, but such differences by duplication mode disappear over time. Expression divergence also depends on gene functions, with dosage-sensitive gene families displaying highly preserved expression profiles, while families involved in more specialized processes (e.g., flowering and phytohormone biosynthesis) display increased divergence. Paralogs originating from large-scale (including whole-genome) duplications display redundant and/or overlapping expression profiles, indicating functional redundancy and/or subfunctionalization, while small-scale duplicates diverge asymmetrically, indicating neofunctionalization.
Collectively, our findings provide new insights into the tempo and mode of gene expression evolution, helping understand how gene and genome duplications shape cell identities.
- Research Article
10
- 10.3389/fgene.2020.601003
- Dec 8, 2020
- Frontiers in Genetics
Regulatory changes include divergence in both cis-elements and trans-factors, which play roles in organismal evolution. Whole genome duplications (WGD) followed by diploidization are a recurrent feature in the evolutionary history of angiosperms. Prior studies have shown that duplicated genes have different evolutionary fates due to variable selection constraints, resulting in genomic compositions with hallmarks of paleopolyploidy. The recent sequential WGDs and post-WGD evolution in the common ancestor of cultivated soybean (Glycine max) and wild soybean (Glycine soja), together with other modes of gene duplication, have resulted in a highly duplicated genome. In this study, we investigated the transcriptional changes in G. soja and G. max. We identified a sizable proportion of interspecific differentially expressed genes (DEGs) and found parental expression level dominance of G. max in their F1 hybrids. By classifying genes into different regulatory divergence types, we found that trans-regulatory changes played a predominant role in transcriptional divergence between wild and cultivated soybean. The same gene ontology (GO) and protein family (Pfam) terms were found to be over-represented in DEGs and in cis-only genes between JY47 and GS, suggesting a substantial contribution of cis-regulatory divergence to the evolution of wild and cultivated soybeans. By further dissecting genes into five different duplication modes, we found that genes in different duplication modes tend to accumulate different types of regulatory differences. A relatively higher proportion of cis-only regulatory divergence was detected in singleton, dispersed, proximal, and tandem duplicates than in WGD duplicates and at the genome-wide level, which is in line with the prediction of the gene balance hypothesis for the differential fates of duplicated genes post-WGD.
The numbers of cis-only and trans-only regulated genes were similar for singletons, whereas trans-only genes outnumbered cis-only genes in the remaining duplication types, especially WGD, in which there were twice as many trans-only genes as cis-only genes. Tandem duplicates showed the highest proportion of trans-only genes, probably due to some special features of this class. In summary, our results demonstrate that genes in different duplication modes have different fates in transcriptional evolution underpinned by cis- or trans-regulatory divergences in soybean and likely in other paleopolyploid higher organisms.
- Research Article
38
- 10.1093/molbev/msaa309
- Nov 28, 2020
- Molecular Biology and Evolution
Genomic variation in the model plant Arabidopsis thaliana has been extensively used to understand evolutionary processes in natural populations, mainly focusing on single-nucleotide polymorphisms. Conversely, structural variation has been largely ignored in spite of its potential to dramatically affect phenotype. Here, we identify 155,440 indels and structural variants ranging in size from 1 bp to 10 kb, including presence/absence variants (PAVs), inversions, and tandem duplications in 1,301 A. thaliana natural accessions from Morocco, Madeira, Europe, Asia, and North America. We show evidence for strong purifying selection on PAVs in genes, in particular for housekeeping genes and homeobox genes, and we find that PAVs are concentrated in defense-related genes (R-genes, secondary metabolites) and F-box genes. This implies the presence of a “core” genome underlying basic cellular processes and a “flexible” genome that includes genes that may be important in spatially or temporally varying selection. Further, we find an excess of intermediate frequency PAVs in defense response genes in nearly all populations studied, consistent with a history of balancing selection on this class of genes. Finally, we find that PAVs in genes involved in the cold requirement for flowering (vernalization) and drought response are strongly associated with temperature at the sites of origin.
- Research Article
8
- 10.1093/bioinformatics/btaf043
- Jan 25, 2025
- Bioinformatics (Oxford, England)
Gene and genome duplications are major evolutionary forces that shape the diversity and complexity of life. However, different duplication modes have distinct impacts on gene function, expression, and regulation. Existing tools for identifying and classifying duplicated genes are either outdated or not user-friendly. Here, we present doubletrouble, an R/Bioconductor package that provides a comprehensive and robust framework for analyzing duplicated genes from genomic data. doubletrouble can detect and classify gene pairs as derived from six duplication modes (segmental, tandem, proximal, retrotransposon-derived, DNA transposon-derived, and dispersed duplications), calculate substitution rates, detect signatures of putative whole-genome duplication events, and visualize results as publication-ready figures. We applied doubletrouble to classify the duplicated gene repertoire in 822 eukaryotic genomes, and results were made available through a user-friendly web interface. doubletrouble is available on Bioconductor (https://bioconductor.org/packages/doubletrouble), and the source code is available in a GitHub repository (https://github.com/almeidasilvaf/doubletrouble). doubletroubledb is available online at https://almeidasilvaf.github.io/doubletroubledb/.
- Research Article
22
- 10.1093/molbev/msad239
- Nov 3, 2023
- Molecular Biology and Evolution
Gene duplication generates new genetic material that can contribute to the evolution of gene regulatory networks and phenotypes. Duplicated genes can undergo subfunctionalization to partition ancestral functions and/or neofunctionalization to assume a new function. We previously found there had been a whole genome duplication (WGD) in an ancestor of arachnopulmonates, the lineage including spiders and scorpions but excluding other arachnids like mites, ticks, and harvestmen. This WGD was evidenced by many duplicated homeobox genes, including two Hox clusters, in spiders. However, it was unclear which homeobox paralogues originated by WGD versus smaller-scale events such as tandem duplications. Understanding this is a key to determining the contribution of the WGD to arachnopulmonate genome evolution. Here we characterized the distribution of duplicated homeobox genes across eight chromosome-level spider genomes. We found that most duplicated homeobox genes in spiders are consistent with an origin by WGD. We also found two copies of conserved homeobox gene clusters, including the Hox, NK, HRO, Irx, and SINE clusters, in all eight species. Consistently, we observed one copy of each cluster was degenerated in terms of gene content and organization while the other remained more intact. Focussing on the NK cluster, we found evidence for regulatory subfunctionalization between the duplicated NK genes in the spider Parasteatoda tepidariorum compared to their single-copy orthologues in the harvestman Phalangium opilio. Our study provides new insights into the relative contributions of multiple modes of duplication to the homeobox gene repertoire during the evolution of spiders and the function of NK genes.
- Research Article
- 10.3389/fpls.2024.1477383
- Oct 28, 2024
- Frontiers in plant science
Casparian strip membrane domain protein-like (CASPL) proteins are closely associated with root development, stress responsiveness, and mineral element uptake in plants. Nonetheless, a comprehensive bioinformatics analysis of the ZmCASPL gene family in maize remains unreported. In this study, we identified 47 ZmCASPL members at the whole-genome level and systematically classified them into six distinct groups. Our analysis revealed that members of the same ZmCASPL group share similar gene structures and conserved motifs. Duplication analysis showed that whole genome duplication (WGD) and tandem duplication (TD) both contributed to the generation of the ZmCASPL gene family in maize, with WGD playing the more prominent role. Furthermore, we observed that most ZmCASPL genes contain MYB-binding sites (CAACCA), which are associated with the Casparian strip. Utilizing RNA-seq data, we found that ZmCASPL21 and ZmCASPL47 are specifically and highly expressed in the roots. This finding implies that ZmCASPL21 and ZmCASPL47 may be involved in Casparian strip development. Additionally, RNA-seq analysis showed that drought, salt, heat, and cold stresses, low nitrogen and phosphorus conditions, as well as pathogen infection, significantly impact the expression patterns of ZmCASPL genes. RT-qPCR revealed that ZmCASPL5, ZmCASPL13, ZmCASPL25, and ZmCASPL44 showed different expression patterns under PEG and NaCl treatments. Collectively, these findings provide a robust theoretical foundation for further investigations into the functional roles of the ZmCASPL gene family in maize.
- Research Article
8
- 10.1016/j.cell.2007.10.001
- Oct 1, 2007
- Cell
When Two Is Better Than One
- Peer Review Report
- 10.7554/elife.81224.sa1
- Oct 28, 2022
Decision letter: Pan-cancer association of DNA repair deficiencies with whole-genome mutational patterns