Evaluating the discriminatory ability and informativeness of DArTseq markers in a comprehensive set of contemporary European potato varieties
The high-throughput molecular technology DArTseq generates markers for potential use in the molecular breeding of crops. Using DArTseq, we analysed a comprehensive set of 333 European potato varieties reflecting the outcomes of long-term breeding history and representing potential germplasm for future potato breeding in the Central European region. The varieties were classified according to four factors that may reflect their genetic structure and for which complete data were publicly available: region of origin, breeder, earliness and utilisation mode. The DArTseq analysis was performed by the service centre Diversity Arrays Technology (University of Canberra), which generated approximately 38 000 silicoDArT and 64 000 SNP (single nucleotide polymorphism) polymorphic markers. The discriminatory ability of the markers in relation to the factors was confirmed using neighbour-joining and principal coordinate analysis (PCoA), while the informativeness was assessed using the discriminant analysis of principal components (DAPC). The analyses identified the 50 SNPs most strongly associated with each factor, along with their highly probable chromosomal localisation. The research presented here contributes to the evaluation of potato genetic resources by adding novel molecular data on active germplasm and points to their future use in genome-wide association studies and marker-assisted selection.
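Both neighbour-joining and PCoA start from a pairwise distance matrix between varieties. The following is a minimal sketch of that first step only, assuming silicoDArT markers scored as 1 (present), 0 (absent) or `None` (missing) and a simple-matching distance. The abstract does not state which distance was used, and the variety names and scores are invented for illustration.

```python
def mismatch_distance(a, b):
    """Share of markers scored in both varieties at which the two scores differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return float("nan")  # no marker was scored in both varieties
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(scores):
    """Return variety names and the symmetric matrix of pairwise distances."""
    names = list(scores)
    matrix = [[mismatch_distance(scores[r], scores[c]) for c in names] for r in names]
    return names, matrix


# Hypothetical presence/absence scores (not data from the study).
toy_scores = {
    "variety_A": [1, 1, 0, 0, 1, 0],
    "variety_B": [1, 1, 0, None, 0, 0],
    "variety_C": [0, 0, 1, 1, 0, 1],
}

names, dist = distance_matrix(toy_scores)
for name, row in zip(names, dist):
    print(name, [round(d, 2) for d in row])
```

The resulting matrix could then be passed to any neighbour-joining or PCoA implementation. Markers missing in either variety are skipped pair by pair, which is one of several reasonable ways to handle missing scores.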
- Research Article
8
- 10.3390/agronomy14050985
- May 8, 2024
- Agronomy
An in-depth understanding of the extent and pattern of genetic diversity and population structure in crop populations is of paramount importance for any crop improvement program to efficiently promote the translation of genetic diversity into genetic gain. A reference collection of 150 common bean genotypes selected from the International Center for Tropical Agriculture’s global core collection was evaluated using single-nucleotide polymorphism (SNP) markers to quantify the amount of genetic diversity, linkage disequilibrium, and population structure. The cultivars and landraces of the collection were diverse and originated from 14 countries, and wild accessions were used as controls for each gene pool. The collection was genotyped using an SNP array, generating a total of 5398 locus calls distributed across the entire bean genome. The SNP data quality was checked, and two datasets were generated. The first dataset (Dataset_1) comprised a set of 5108 SNPs and 150 genotypes after filtering for 10% missing alleles and an MAF < 0.05. The second dataset (Dataset_2) comprised a set of 2300 SNPs that remained after removing any null-allele SNPs and LD pruning for a criterion of r² < 0.2. Dataset_1 was used for a principal coordinate analysis (PCoA), phylogenetic relationship determination, an analysis of molecular variance (AMOVA), and a discriminant analysis of principal components. Dataset_2 was used for a population structure analysis using STRUCTURE software and is proposed for a genome-wide association study (GWAS). The population structure analysis split the reference collection into two subpopulations according to an Andean or Mesoamerican gene pool. The Mesoamerican populations displayed higher genetic differentiation and tended to split into more groups that were somewhat aligned with common bean races. Andean beans were characterized by a larger average LD but lower LD percentage, a small average genetic distance between members of the population, and a higher major allele frequency, which suggested narrower genetic diversity compared to the Mesoamerican gene pool. In conclusion, the results indicated the presence of high genetic diversity, which is useful for a GWAS. However, the presence of significant linkage disequilibrium requires that genetic distance be considered as a co-factor for any further genetic studies. Overall, the molecular variation observed in the genotypes shows that this reference collection is valuable as a genebank-derived diversity panel which is useful for marker trait association studies.
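The quality filter described for Dataset_1 (missing data and minor allele frequency) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: genotypes are assumed to be coded as 0/1/2 copies of the alternative allele with `None` for missing calls, SNPs are kept when missingness is at most 10% and MAF is at least 0.05, and the SNP names and calls are invented.

```python
def snp_stats(calls):
    """Return (missing rate, minor allele frequency) for one SNP."""
    called = [c for c in calls if c is not None]
    missing = 1 - len(called) / len(calls)
    if not called:
        return missing, 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid call
    return missing, min(alt_freq, 1 - alt_freq)


def filter_snps(genotypes, max_missing=0.10, min_maf=0.05):
    """Keep SNP ids that pass both the missingness and the MAF threshold."""
    kept = []
    for snp_id, calls in genotypes.items():
        missing, maf = snp_stats(calls)
        if missing <= max_missing and maf >= min_maf:
            kept.append(snp_id)
    return kept


# Hypothetical calls for ten genotypes (not data from the study).
toy_genotypes = {
    "snp_1": [0, 1, 2, 0, 1, 0, 1, 0, 1, 0],        # passes both filters
    "snp_2": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic, MAF = 0
    "snp_3": [None, None, 1, 1, 0, 2, 0, 1, 0, 1],  # 20% missing
    "snp_4": [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],        # MAF = 0.10
}

kept_snps = filter_snps(toy_genotypes)
print(kept_snps)
```

Whether the thresholds are inclusive or exclusive is not specified in the abstract, so the comparison operators here are a choice made for the sketch.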
- Research Article
24
- 10.1002/tpg2.20113
- Jul 18, 2021
- The plant genome
Cowpea [Vigna unguiculata (L.) Walp] is a globally important food security crop. However, it is susceptible to pest and disease; hence, constant breeding efforts based on its diversity are required for its improvement. The present study aims to investigate the genetic diversity, population structure, and linkage disequilibrium (LD) among 274 cowpea accessions from different origins. A total of 3,127 single nucleotide polymorphism (SNP) markers generated using diversity array technology (DArT) was used. Population structure, neighbor-joining clustering, and principal component analyses indicated three subpopulations within the germplasm. Results of STRUCTURE analysis and discriminant analysis of principal components (DAPC) were complementary in assessing the structuration of the diversity among the germplasm, with the grouping of the accessions improved in DAPC. Genetic distances of 0.005-0.44 were observed among accessions. Accessions from western and central Africa, eastern and central Africa, and Asia were predominant and distributed across all subpopulations. The subpopulations had fixation indexes of 0.48-0.56. Analysis of molecular variance revealed that within subpopulation variation accounted for 81% of observed genetic variation in the germplasm. The subpopulations mainly consisted of inbred lines (inbreeding coefficient = 1) with common alleles, although they were from different geographical regions. This reflects considerable seed movement and germplasm exchange between regions. The LD was characterized by low decay for great physical distances between markers. The LD decay distance varied among chromosomes with the average distance of 80-100 kb across the genome. Thus, crop improvement is possible, and the LD will facilitate genome-wide association studies on quality attributes and critical agronomic traits in cowpea.
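LD between two markers is often summarised as r². As a rough sketch, and assuming unphased genotypes coded 0/1/2 with `None` for missing calls, r² can be approximated by the squared Pearson correlation of the two genotype vectors. The abstract does not say how LD was estimated, so this estimator and the toy vectors are assumptions made for illustration.

```python
def ld_r2(a, b):
    """Squared Pearson correlation of two genotype vectors (0/1/2, None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # r2 is undefined for a monomorphic marker
    return (sxy * sxy) / (sxx * syy)


# Hypothetical inbred-line genotypes (not data from the study).
print(ld_r2([0, 2, 0, 2], [0, 2, 0, 2]))  # markers in complete LD
print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # markers with no association
```

Plotting such pairwise values against the physical distance between markers gives the LD decay curve from which a decay distance is read.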
- Peer Review Report
- 10.7554/elife.80009.sa1
- Jun 22, 2022
Decision letter: Data-driven, participatory characterization of farmer varieties discloses teff breeding potential under current and future climates
- Research Article
1
- 10.1002/agj2.21153
- Aug 29, 2022
- Agronomy Journal
The inclusion of high‐oil maize germplasm into breeding programs may be an excellent alternative for increasing grain nutritional quality. Knowing the germplasm genetic diversity is crucial for assisting breeding programs. Here, the genetic diversity and population structure of four high‐oil‐content maize (Zea mays L.) populations (Bajio yellow population [BYP], northwestern yellow population [NYP], Bajio white population [BWP], and northwestern white population [NWP]) were analyzed by Diversity Arrays Technology sequencing. Three hundred and ten double‐haploid (DH) lines were genotyped, and 19,078 single‐nucleotide polymorphism (SNP) markers were uniformly detected among populations after filtering by missing data >20% and minor allele frequency ≥0.05. Genetic diversity indexes showed polymorphic information content (PIC) values of 0.346, 0.352, 0.353, and 0.353; observed heterozygosity values of 0.221, 0.194, 0.284, and 0.177; and expected heterozygosity values of 0.188, 0.165, 0.219, and 0.152, for BYP, NYP, BWP, and NWP, respectively. Genetic structure results showed variations in pairwise genetic distance comparisons among the 310 DH lines, ranging from 0.119 to 0.385. Multidimensional scaling analysis and discriminant analysis of principal components grouped the DH lines into three and five clusters, respectively, based on their origin region and grain color. On the other hand, STRUCTURE analysis revealed the presence of two different groups unrelated to grain color or origin region. The wide genetic variability among the analyzed DH lines highlights their potential to contribute new beneficial alleles into subtropical maize breeding programs and will facilitate the selection of parental lines and the identification of heterotic groups to generate high‐oil maize hybrids.
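The diversity indexes named above can be illustrated for a single biallelic SNP. This sketch assumes the textbook definitions, expected heterozygosity He = 1 - p² - q² and PIC = He - 2p²q², with genotypes coded 0/1/2 and `None` for missing calls. The study's own values may have been computed differently, and the input calls are invented.

```python
def diversity_stats(calls):
    """Return allele frequency, Ho, He and PIC for one biallelic SNP."""
    called = [c for c in calls if c is not None]
    n = len(called)
    p = sum(called) / (2 * n)  # frequency of the counted allele
    q = 1 - p
    he = 1 - p * p - q * q                       # expected heterozygosity
    pic = he - 2 * p * p * q * q                 # polymorphic information content
    ho = sum(1 for c in called if c == 1) / n    # observed heterozygosity
    return {"p": p, "ho": ho, "he": he, "pic": pic}


# Hypothetical calls for four lines (not data from the study).
stats = diversity_stats([0, 1, 2, 1])
print(stats)
```

Population-level figures such as those reported per population would then be averages of these per-marker values over all retained SNPs.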
- Research Article
18
- 10.1007/s13353-014-0214-0
- May 1, 2014
- Journal of Applied Genetics
A set of about 100 winter barley (Hordeum vulgare L.) cultivars, comprising diverse and economically important German barley elite germplasm released during the last six decades, was previously genotypically characterized by single nucleotide polymorphism (SNP) markers using the Illumina GoldenGate BeadArray Technology to detect associations with phenotypic data estimated in three-year field trials at 12 locations. In order to identify further associations and to obtain information on whether the marker type influences the outcome of association genetics studies, the set of winter barley cultivars was re-analyzed using Diversity Arrays Technology (DArT) markers. As with the analysis of the SNPs, only polymorphic markers present at an allele frequency >5% were included to detect associations in a mixed linear model (MLM) approach using the TASSEL software (P ≤ 0.001). The population structure and kinship matrix were estimated on 72 simple sequence repeats (SSRs) covering the whole barley genome. The respective average linkage disequilibrium (LD) analyzed with DArT markers was estimated at 5.73 cM. A total of 52 markers gave significant associations with at least one of the traits estimated which, therefore, may be suitable for marker-assisted breeding. In addition, by comparing the results to those generated using the Illumina GoldenGate BeadArray Technology, it turned out that a different number of associations for respective traits is detected, depending on the marker system. However, as only a few of the respective DArT and Illumina markers are present in a common map, no comprehensive comparison of the detected associations was feasible, but some were probably detected in the same chromosomal regions. Because of the identification of additional marker-trait associations, it may be recommended to use both marker techniques in genome-wide association studies.
- Research Article
9
- 10.3389/fsufs.2023.1202015
- Jun 30, 2023
- Frontiers in Sustainable Food Systems
Cassava adaptation to climate change and its resistance to diseases are essential prerequisites for achieving food security in sub-Saharan Africa. The accessions collected from farmers’ fields are very important because they can provide new sources of genetic variability that are essential to achieve this goal. In this study, a panel of 184 accessions collected in Burkina Faso was genotyped using 36 single nucleotide polymorphism (SNP) markers. The accessions and markers that presented with more than 6% missing data were removed from the dataset and the remaining 34 markers and 166 accessions were retained for genetic diversity and population structure assessment. The average values of expected heterozygosity (0.46), observed heterozygosity (0.58), and polymorphic information content (0.36) indicated high genetic diversity within accessions. A complex genetic structure of 166 accessions was observed through the formation of 17 clusters using discriminant analysis of principal components (DAPC) and two clusters using Bayesian analysis. Out of the 166 accessions, 79 were unique multilocus genotypes (MLGs) and 87 were potentially duplicates. From the 79 MLGs, DAPC suggested eight clusters while the Bayesian analysis suggested seven clusters. Clusters shaped by DAPC appeared to be more consistent with a higher probability of assignment of the accessions within the clusters. Principal Coordinate Analysis (PCoA) showed a lack of clustering according to geographical origin. Information related to breeding patterns and geographic origin did not allow for a clear differentiation between the clusters according to the analysis of molecular variance (AMOVA). The results of this study will be useful for cassava germplasm conservation and breeding programs.
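Identifying unique multilocus genotypes (MLGs) amounts to grouping accessions whose marker profiles are identical. A minimal sketch, assuming complete profiles with no missing calls and exact matching (the abstract does not describe how near-identical profiles or missing data were treated); accession names and genotype calls are invented.

```python
from collections import defaultdict


def group_by_mlg(profiles):
    """Group accession names that share exactly the same multilocus genotype."""
    groups = defaultdict(list)
    for accession, calls in profiles.items():
        groups[tuple(calls)].append(accession)
    return list(groups.values())


# Hypothetical three-marker profiles (not data from the study).
toy_profiles = {
    "acc_1": ["AA", "AG", "GG"],
    "acc_2": ["AA", "AG", "GG"],
    "acc_3": ["AG", "AG", "GG"],
    "acc_4": ["AA", "AG", "GG"],
}

mlg_groups = group_by_mlg(toy_profiles)
print(len(mlg_groups), "distinct MLGs:", mlg_groups)
```

Groups with more than one member are the candidate duplicates; with only a few dozen markers, identical profiles suggest but do not prove that two accessions are the same clone.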
- Research Article
63
- 10.1186/s12864-017-4173-9
- Oct 12, 2017
- BMC Genomics
Background: Molecular characterization is important for efficient utilization of germplasm and development of improved varieties. In the present study, we investigated the genetic purity, relatedness and population structure of 265 maize inbred lines from the Ethiopian Institute of Agricultural Research (EIAR), the International Maize and Wheat Improvement Centre (CIMMYT) and the International Institute of Tropical Agriculture (IITA) using 220,878 single nucleotide polymorphic (SNP) markers obtained using genotyping by sequencing (GBS). Results: Only 22% of the inbred lines were considered pure with <5% heterogeneity, while the remaining 78% of the inbred lines had a heterogeneity ranging from 5.1 to 31.5%. Pairwise genetic distances among the 265 inbred lines varied from 0.011 to 0.345, with 89% of the pairs falling between 0.301 and 0.345. Only <1% of the pairs had a genetic distance lower than 0.200, which included 14 pairs of sister lines that were nearly identical. Relative kinship analysis showed that the kinship coefficients for 59% of the pairs of lines were close to zero, which agrees with the genetic distance estimates. Principal coordinate analysis, discriminant analysis of principal components (DAPC) and the model-based population structure analysis consistently suggested the presence of three groups, which generally agreed with pedigree information (genetic background). Although not distinct enough, the SNP markers showed some level of separation between the two CIMMYT heterotic groups A and B established based on pedigree and combining ability information. Conclusions: The high level of heterogeneity detected in most of the inbred lines suggested the requirement for purification or further inbreeding except those deliberately maintained at early inbreeding level. The genetic distance and relative kinship analysis clearly indicated the uniqueness of most of the inbred lines in the maize germplasm available for breeders in the mid-altitude maize breeding program of Ethiopia. Results from the present study facilitate the maize breeding work in Ethiopia and germplasm exchange among breeding programs in Africa. We suggest the incorporation of high density molecular marker information in future heterotic group assignments.
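The purity classification can be sketched by treating heterogeneity as the share of heterozygous SNP calls within a line. That definition, the 0/1/2 coding with `None` for missing calls, and the line names are assumptions made for illustration; only the 5% threshold comes from the abstract.

```python
def heterogeneity(calls):
    """Share of called SNPs that are heterozygous (coded 1) in one inbred line."""
    called = [c for c in calls if c is not None]
    if not called:
        return float("nan")
    return sum(1 for c in called if c == 1) / len(called)


def classify_lines(lines, purity_threshold=0.05):
    """Label each line 'pure' when its heterogeneity is below the threshold."""
    return {
        name: "pure" if heterogeneity(calls) < purity_threshold else "heterogeneous"
        for name, calls in lines.items()
    }


# Hypothetical calls at 40 SNPs per line (not data from the study).
toy_lines = {
    "line_1": [0] * 20 + [2] * 19 + [1],      # 1 of 40 heterozygous = 2.5%
    "line_2": [0] * 17 + [2] * 17 + [1] * 6,  # 6 of 40 heterozygous = 15%
}

labels = classify_lines(toy_lines)
print(labels)
```

A line with no called SNPs yields `nan` and is therefore never labelled pure in this sketch.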
- Front Matter
3
- 10.4065/mcp.2011.0337
- Jul 1, 2011
- Mayo Clinic Proceedings
Genome-Wide Association Studies Go Green: Novel and Cost-Effective Opportunities for Identifying Genetic Associations
- Research Article
12
- 10.1186/s12864-023-09768-6
- Nov 14, 2023
- BMC Genomics
Background: Durum wheat is one of the most important crops, especially in the Mediterranean region. Insight into the genetic diversity of germplasm can improve the management of breeding programs for various traits. This study used single nucleotide polymorphism (SNP) markers to characterize the genetic distinctiveness and differentiation of tetraploid wheat landraces collected from nine European and Asian countries. A total of 23,334 polymorphic SNPs was detected in 126 tetraploid wheat landraces in relation to the reference genome. Results: The number of identified SNPs was 11,613 and 11,721 in the A and B genomes, respectively. The highest and lowest diversity was on chromosomes 6B and 6A, respectively. Structure analysis classified the landraces into two distinct subpopulations (K = 2). The principal coordinate analysis (PCoA) and weighted pair-group method using arithmetic averages (WPGMA) clustering results demonstrated that the landraces (99.2%) are categorized into one of the two chief subpopulations. Therefore, the grouping did not show a clear relationship between genetic diversity and geographical origin. Part of this result could be due to the historical exchange between different germplasms. Although the result did not separate landraces based on their region of origin, the landraces collected from Iran were classified into the same group and cluster. Analysis of molecular variance (AMOVA) also confirmed the results of population structure. Finally, durum wheat landraces from some countries, including Turkey, Russia, Ukraine, and Afghanistan, were highly diverse, while those from others, including Iran and China, showed low diversity. Conclusion: The present study concluded that the 126 tetraploid wheat genotypes and their GBS-SNP markers are very appropriate for quantitative trait loci (QTLs) mapping and genome-wide association studies (GWAS). The core collection comprises two distinct subpopulations. Subpopulation II genotypes are the most diverse genotypes, and if they possess desired traits, they may be used in future breeding programs. The degree of diversity in the landraces of countries can provide the ground for the improvement of new cultivars with international cooperation. Linkage disequilibrium (LD) hotspot distribution across the genome was investigated, which provides useful information about the genomic regions that contain intriguing genes.
- Research Article
54
- 10.1007/s11032-011-9678-3
- Dec 22, 2011
- Molecular Breeding
Diversity arrays technology (DArT) and simple sequence repeat (SSR) markers were applied to investigate population structure, extent of linkage disequilibrium and genetic diversity (kinship) on a genome-wide level in European barley (Hordeum vulgare L.) cultivars. A set of 183 varieties could be clearly distinguished into spring and winter types and was classified into five subgroups based on 253 DArT or 22 SSR markers. Although both marker types revealed the same number of groups, the grouping was more distinct for the SSRs than for the DArTs when assigned to a Q-matrix by STRUCTURE. This was supported by the findings from principal coordinate analysis, where the SSRs showed a better resolution according to seasonal habit and row number than the DArTs. The marker type had a considerable influence on the rate of significant associations with malting and kernel quality parameters in this genome-wide association study, which used general and mixed linear models considering population structure. Fewer spurious associations were observed when population structure was based on SSR rather than on DArT markers. We therefore conclude that it is advisable to use independent marker datasets for calculating population structure and for performing the association analysis.
- Research Article
20
- 10.1016/j.bbmt.2008.11.020
- Jan 1, 2009
- Biology of Blood and Marrow Transplantation
Exploration of the Genetic Basis of GVHD by Genetic Association Studies
- Research Article
8
- 10.3389/fgene.2023.1231027
- Oct 25, 2023
- Frontiers in Genetics
Background: Tunisia harbors a rich collection of unexploited durum wheat landraces (Triticum durum ssp. durum) that have been gradually replaced by elite cultivars since the 1970s. These landraces represent an important potential source for broadening the genetic background of elite durum wheat cultivars and for the introgression of novel genes for key traits, including disease resistance, into these cultivars. Methods: In this study, single nucleotide polymorphism (SNP) markers were used to investigate the genetic diversity and population structure of a core collection of 235 durum wheat accessions consisting mainly of landraces. The high phenotypic and genetic diversity of the fungal pathogen Pyrenophora tritici-repentis (cause of tan spot disease of wheat) in Tunisia allowed the assessment of the accessions for tan spot resistance at the adult plant stage under field conditions over three cropping seasons. A genome-wide association study (GWAS) was performed using a 90k SNP array. Results: Bayesian population structure analysis with 9191 polymorphic SNP markers classified the accessions into two groups, where groups 1 and 2 included 49.79% and 31.49% of the accessions, respectively, while the remaining 18.72% were admixtures. Principal coordinate analysis, the unweighted pair group method with arithmetic mean and the neighbor-joining method clustered the accessions into three to five groups. Analysis of molecular variance indicated that 76% of the genetic variation was among individuals and 23% was between individuals. Genome-wide association analyses identified 26 SNPs associated with tan spot resistance, which explained between 8.1% and 20.2% of the phenotypic variation. The SNPs were located on chromosomes 1B (1 SNP), 2B (4 SNPs), 3A (2 SNPs), 3B (2 SNPs), 4A (2 SNPs), 4B (1 SNP), 5A (2 SNPs), 5B (4 SNPs), 6A (5 SNPs), 6B (2 SNPs), and 7B (1 SNP). Four markers, one on each of chromosomes 1B and 5A, and two on 5B, coincided with previously reported SNPs for tan spot resistance, while the remaining SNPs were either novel markers or closely related to previously reported SNPs. Eight durum wheat accessions were identified as possible novel sources of tan spot resistance that could be introgressed into elite cultivars. Conclusion: The results highlighted the significance of chromosomes 2B, 5B, and 6A as genomic regions associated with tan spot resistance.
- Research Article
9
- 10.1002/leg3.184
- Jan 27, 2023
- Legume Science
Stored grains of common bean (Phaseolus vulgaris L.) develop the hard‐to‐cook trait (HTC), which is manifested in a prolonged cooking time, thereby imposing time and energy constraints. The objective of this study was to determine variation in cooking time among common bean genotypes and to identify single nucleotide polymorphism (SNP) markers associated with cooking time. Seeds of 222 common bean accessions sourced from Kenyan institutions were multiplied in the Jomo Kenyatta University of Agriculture and Technology (JKUAT) field in 2019. The freshly harvested seeds and those stored at 35°C and 50% relative humidity (RH) for 4 months for accelerated aging were soaked in distilled water for 16 h and evaluated for cooking time using the finger‐pressing method. The accessions were also genotyped to determine variation in SNP markers using Diversity Arrays Technology Sequencing (DArTseq). Genome‐wide association study (GWAS) analysis was conducted to identify SNPs significantly associated with cooking time. The study revealed significant differences (p ≤ 0.05) within and between fresh and aged bean accessions. Fresh seeds had a lower cooking time with a mean of 40.8 min and ranged from 28.1 to 72.2 min, whereas aged seeds had a higher average cooking time of 54.1 min and ranged from 32.1 to 96.3 min. GWAS identified a region in Chromosome 10 to be significantly (p ≤ 0.05) associated with the cooking time of aged seeds. Consequently, two potential candidate genes Phvul.010G038000 and Phvul.010G038100 were revealed. The characterized common bean accessions and the identified SNP markers can be utilized in breeding programs to improve the cooking quality of the common bean.
- Research Article
10
- 10.1371/journal.pone.0251745
- May 19, 2021
- PLOS ONE
Brazil is the largest consumer of dry edible beans (Phaseolus vulgaris L.) in the world, and 70% of consumption is of the carioca variety. Although the variety has high yield, it is susceptible to several diseases; among them, anthracnose (ANT) can lead to losses of up to 100% of production. The most effective strategy to overcome ANT, a disease caused by the fungus Colletotrichum lindemuthianum, is the development of resistant cultivars. For that reason, the selection of carioca genotypes resistant to multiple ANT races and the identification of loci/markers associated with genetic resistance are extremely important for the genetic breeding process. Using a carioca diversity panel (CDP) of 125 genotypes genotyped with the BARCBean6K_3 BeadChip and a carioca segregating population AM (AND-277 × IAC-Milênio) genotyped by sequencing (GBS), multiple interval mapping (MIM) and genome-wide association studies (GWAS) were used as mapping tools for the resistance genes to the major ANT physiological races present in the country. In general, 14 single nucleotide polymorphisms (SNPs) showed high significance for resistance by GWAS, and loci associated with multiple races were also identified, such as the Co-3 locus. The SNPs ss715642306 and ss715649427 in linkage disequilibrium (LD) at the beginning of chromosome Pv04 were associated with all the races used, and 16 genes known to be related to plant immunity were identified in this region. Using the resistant cultivars and the markers associated with significant quantitative resistance loci (QRL), discriminant analysis of principal components (DAPC) was performed considering the allelic contribution to resistance. Through the DAPC clustering, cultivar sources with high potential for durable anthracnose resistance were recommended. The MIM confirmed the presence of the Co-14 locus in the AND-277 cultivar, which revealed that it was the only one associated with resistance to ANT race 81. Three other loci were associated with race 81 on chromosomes Pv03, Pv10, and Pv11. This is the first study to identify new resistance loci in the AND-277 cultivar. Finally, the same Co-14 locus was also significant for the CDP at the end of Pv01. The new SNPs identified, especially those associated with more than one race, present great potential for use in marker-assisted and early selection of inbred lines.