Abstract

Background: Multi-omics data (e.g., genomics, transcriptomics, proteomics, and metabolomics) are now available to improve genomic predictors. Omics data not only offer new data layers for genomic prediction but also provide a bridge between organismal phenotypes and genome variation that cannot be readily captured at the genome sequence level. Therefore, using multi-omics data to select feature markers is a feasible strategy to improve the accuracy of genomic prediction. In this study, whole-genome sequencing (WGS) and gene expression level data were used together to investigate four strategies for single-nucleotide polymorphism (SNP) preselection for genomic prediction in the Drosophila Genetic Reference Panel.

Results: Using genomic best linear unbiased prediction (GBLUP) with complete WGS data, the prediction accuracies were 0.208 ± 0.020 (0.181 ± 0.022) for the startle response and 0.272 ± 0.017 (0.307 ± 0.015) for starvation resistance in the female (male) lines. Compared with GBLUP using complete WGS data, neither GBLUP nor the genomic feature BLUP (GFBLUP) improved the prediction accuracy when using SNPs preselected from complete WGS data based on the results of genome-wide association studies (GWASs) or transcriptome-wide association studies (TWASs). Furthermore, when using SNPs preselected from the WGS data based on the results of expression quantitative trait locus (eQTL) mapping of all genes, accuracy exceeded that of GBLUP with complete WGS data only for the startle response; the best accuracy values in the female and male lines were 0.243 ± 0.020 and 0.220 ± 0.022, respectively. Importantly, when using SNPs preselected based on the results of eQTL mapping of the significant genes from TWAS, both GBLUP and GFBLUP achieved high accuracy and low bias of genomic prediction. Compared with GBLUP using complete WGS data, the best accuracy values represented increases of 60.66% and 39.09% for starvation resistance and 27.40% and 35.36% for the startle response in the female and male lines, respectively.

Conclusions: Overall, multi-omics data can assist genomic feature preselection and improve the performance of genomic prediction. The new knowledge gained from this study will enrich the use of multi-omics in genomic prediction.
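
The relative gains quoted in the abstract can be converted back into approximate absolute accuracies with simple arithmetic. The sketch below assumes the percentages are computed against the mean complete-WGS GBLUP accuracies given in the Results; this is an assumption, so the outputs are rounded back-calculations, not values reported by the study. The names `baseline`, `gain_pct`, `implied_best`, and `relative_gain_pct` are illustrative only.

```python
# Baseline GBLUP accuracies with complete WGS data, as quoted in the abstract.
baseline = {
    ("startle response", "female"): 0.208,
    ("startle response", "male"): 0.181,
    ("starvation resistance", "female"): 0.272,
    ("starvation resistance", "male"): 0.307,
}

# Relative increases (%) quoted for SNPs preselected via eQTL mapping of
# TWAS-significant genes.
gain_pct = {
    ("startle response", "female"): 27.40,
    ("startle response", "male"): 35.36,
    ("starvation resistance", "female"): 60.66,
    ("starvation resistance", "male"): 39.09,
}


def implied_best(base, pct):
    """Accuracy implied by a baseline and a relative increase in percent."""
    return base * (1.0 + pct / 100.0)


def relative_gain_pct(base, new):
    """Relative increase in percent of `new` over `base`."""
    return 100.0 * (new - base) / base


# ASSUMPTION: gains are relative to the baseline means above.
implied = {k: round(implied_best(baseline[k], gain_pct[k]), 3) for k in baseline}
for (trait, sex), acc in implied.items():
    print(f"{trait:22s} {sex:6s} implied best accuracy ~ {acc:.3f}")

# Best startle-response accuracies quoted for eQTL mapping of all genes,
# expressed as relative gains over the same baseline.
eqtl_all = {
    ("startle response", "female"): 0.243,
    ("startle response", "male"): 0.220,
}
eqtl_gain = {k: round(relative_gain_pct(baseline[k], v), 2) for k, v in eqtl_all.items()}
for (trait, sex), pct in eqtl_gain.items():
    print(f"{trait:22s} {sex:6s} gain with all-gene eQTL SNPs ~ {pct:.2f}%")
```

Because the published accuracies are rounded to three decimals, these back-calculated values may differ from the exact figures in the full text in the last digit.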

Highlights

  • Genomic prediction, known as genomic selection (GS), was initially proposed in 2001 [1]; it is a statistical method for predicting yet-to-be-observed phenotypes or unobserved genetic values of complex traits based on genomic data

  • Higher accuracy of genomic prediction was not achieved for Drosophila using real whole-genome sequencing (WGS) data [5], and similar results were found for livestock using imputed WGS data [6,7,8]

  • Our results provide useful knowledge about preselected genomic features based on multi-omics data and improve the predictive ability of genomic predictions for complex traits


Summary

Introduction

Genomic prediction, known as genomic selection (GS), was initially proposed in 2001 [1]; it is a statistical method for predicting yet-to-be-observed phenotypes or unobserved genetic values of complex traits based on genomic data. Many variant preselection strategies have been used to improve the power of genomic prediction, based on the following methods: genome-wide association study (GWAS) [8, 10,11,12], Bayesian procedures [13], genome-wide signatures of selection [14], QTL regions in Animal QTLdb [12], gene annotation [15, 16], and gene ontology categories [17, 18]. These methods mainly depend on the direct link between phenotype and DNA variants or on prior genome annotation information. In this study, whole-genome sequencing (WGS) and gene expression level data were used together to investigate four strategies for single-nucleotide polymorphism (SNP) preselection for genomic prediction in the Drosophila Genetic Reference Panel.
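
To make the idea of SNP preselection before GBLUP concrete, here is a minimal sketch on simulated genotypes. It is not the study's pipeline: the heritability is fixed at an assumed value instead of being estimated by REML, the mean is taken as the training-set average, the preselection step is a simple marginal association screen standing in for GWAS/TWAS/eQTL-based selection, and GFBLUP is not shown. All function names (`vanraden_grm`, `gblup_predict`, `marginal_scores`) and all simulation settings are hypothetical.

```python
import numpy as np


def vanraden_grm(M):
    """Genomic relationship matrix (VanRaden method 1) from an
    n_lines x n_snps matrix of allele counts coded 0/1/2."""
    p = M.mean(axis=0) / 2.0            # allele frequency per SNP
    Z = M - 2.0 * p                     # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(G, y, train, test, h2):
    """GBLUP prediction for test lines with an ASSUMED heritability h2."""
    lam = (1.0 - h2) / h2               # residual-to-genetic variance ratio
    mu = y[train].mean()                # simplification: mean from training data
    A = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(A, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha


def marginal_scores(M, y):
    """Absolute marginal SNP-phenotype association (a crude GWAS stand-in)."""
    Mc = M - M.mean(axis=0)
    sd = Mc.std(axis=0)
    sd[sd == 0] = np.inf                # monomorphic SNPs receive a score of 0
    return np.abs(Mc.T @ (y - y.mean())) / sd


# Simulated data (all settings are illustrative assumptions).
rng = np.random.default_rng(0)
n_lines, n_snps, n_causal = 200, 1000, 50
freq = rng.uniform(0.1, 0.9, size=n_snps)
M = rng.binomial(2, freq, size=(n_lines, n_snps)).astype(float)
causal = rng.choice(n_snps, size=n_causal, replace=False)
g = M[:, causal] @ rng.normal(size=n_causal)
y = g + rng.normal(scale=g.std(), size=n_lines)   # roughly h2 = 0.5

idx = rng.permutation(n_lines)
train, test = idx[:160], idx[160:]

# GBLUP with all SNPs.
G_full = vanraden_grm(M)
pred_full = gblup_predict(G_full, y, train, test, h2=0.5)
acc_full = np.corrcoef(pred_full, y[test])[0, 1]

# GBLUP with SNPs preselected using training lines only.
scores = marginal_scores(M[train], y[train])
top = np.argsort(scores)[-100:]
G_sel = vanraden_grm(M[:, top])
pred_sel = gblup_predict(G_sel, y, train, test, h2=0.5)
acc_sel = np.corrcoef(pred_sel, y[test])[0, 1]

print(f"accuracy, all SNPs:         {acc_full:.3f}")
print(f"accuracy, preselected SNPs: {acc_sel:.3f}")
```

The preselection scores are computed from the training lines only, so that phenotypes of the test lines do not leak into the choice of SNPs. Whether the preselected set predicts better than the full set depends on the data; the simulation makes no claim in either direction.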

