Abstract

Nut weight is one of the most important traits affecting a chestnut grower’s returns. Because chestnut trees have a long juvenile phase, selecting desired characteristics at early developmental stages is a major challenge for chestnut breeding. In this study, we used a genome-wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) in transcriptomic regions that were significantly associated with nut weight in chestnuts (Castanea crenata). RNA-sequencing (RNA-seq) data were generated from large and small nut-bearing trees using an Illumina HiSeq 2000 system, and 3,271,142 SNPs were identified. Based on further analyses, a total of 21 putative SNPs were significantly associated with chestnut weight (false discovery rate [FDR] < 10⁻⁵). We also applied five machine learning (ML) algorithms, namely support vector machine (SVM), C5.0, k-nearest neighbour (k-NN), partial least squares (PLS), and random forest (RF), using the 21 SNPs to predict the nut weights of a second population. The average accuracy of the ML algorithms for the prediction of chestnut weights was greater than 68%. Taken together, we suggest that these SNPs have the potential to be used in marker-assisted selection to facilitate the breeding of large chestnut-bearing varieties.
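The prediction step described above can be illustrated with a minimal sketch of one of the five algorithms, k-NN, written in plain Python. Everything about the data here is an assumption made for illustration: genotypes are coded as 0/1/2 allele dosages, the two classes are labelled "large" and "small", and the populations are simulated with hypothetical allele frequencies (0.7 and 0.3). None of this is the study's data, distance metric, or pipeline.

```python
import random
from collections import Counter

N_SNPS = 21  # number of associated SNPs reported in the abstract


def genotype_distance(a, b):
    """Sum of absolute dosage differences between two genotype vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, sample, k=3):
    """Majority vote among the k training samples closest to `sample`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: genotype_distance(pair[0], sample),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def simulate_tree(label, rng):
    """Hypothetical tree: dosage (0/1/2) at each SNP from two allele draws."""
    freq = 0.7 if label == "large" else 0.3  # assumed allele frequencies
    return [int(rng.random() < freq) + int(rng.random() < freq)
            for _ in range(N_SNPS)]


def simulate_population(n_per_class, rng):
    """Simulated population with equal numbers of large and small trees."""
    X, y = [], []
    for label in ("large", "small"):
        for _ in range(n_per_class):
            X.append(simulate_tree(label, rng))
            y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_X, train_y = simulate_population(20, rng)  # stands in for the first population
test_X, test_y = simulate_population(10, rng)    # stands in for the second population

predictions = [knn_predict(train_X, train_y, s) for s in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"accuracy on simulated second population: {accuracy:.2f}")
```

Training on one population and scoring on a separate one mirrors the design in the abstract; the accuracy printed here reflects only the simulated data and says nothing about the reported 68% figure.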

Highlights

  • The chestnut is widely cultivated as a food crop in Asian and European countries because of its high nutrient content and low fat content[1,2].

  • According to a report published by the Food and Agriculture Organization (FAO) of the United Nations, chestnut production increased steadily in Asia through 2014, and chestnuts grown in Asia accounted for 89.6% (1.8 million tons) of world chestnut production that year[5].

  • Through our association study and machine learning (ML) approaches, we identified 21 single nucleotide polymorphisms (SNPs) associated with nut weight that clearly discriminated between large and small nut-bearing populations.

Summary

Introduction

The chestnut is widely cultivated as a food crop in Asian and European countries because of its high nutrient content and low fat content[1,2]. The rapid growth of next-generation sequencing (NGS) technologies, combined with various statistical methods, has facilitated the use of genome-wide association studies (GWASs)[13]. Transcriptome-based analysis reduces the multiple-testing burden of traditional GWASs in practice. The gene is the functional unit of the genome, is highly consistent across populations, and is the major target of most subsequent bioinformatics analyses[16]. ML is the most effective method for predicting phenotypes based on genotypes and has been widely applied in various population studies[20,21]. Our study aimed to predict the transcriptome-wide SNPs that are closely associated with nut weight. This study represents the first attempt to identify highly significant SNPs associated with nut weight in Korean chestnut trees.
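The multiple-testing point can be made concrete with a small sketch of FDR adjustment. The summary does not say which FDR procedure the authors used, so the Benjamini-Hochberg method below is an assumption, and the p-values are hypothetical. The sketch shows that the same raw p-value receives a smaller adjusted value when it is one of fewer tests, which is the sense in which restricting the analysis to transcribed regions eases the burden.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: one strong signal among unremarkable tests.
few_tests = [1e-6] + [0.5] * 9       # 10 tests in total
many_tests = [1e-6] + [0.5] * 999    # 1,000 tests in total

adj_few = benjamini_hochberg(few_tests)[0]
adj_many = benjamini_hochberg(many_tests)[0]
print(f"adjusted p with 10 tests:   {adj_few:.1e}")
print(f"adjusted p with 1000 tests: {adj_many:.1e}")
```

With the smallest p-value ranked first, its adjusted value is the raw value multiplied by the number of tests, so testing fewer candidate SNPs leaves more signals below a fixed FDR threshold.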

