An adaptive threshold determination method of feature screening for genomic selection

Guifang Fu, Xiaotian Dai, Gang Wang

doi:10.1186/s12859-017-1617-9

Abstract

Background: Although the dimension of the entire genome can be extremely large, only a parsimonious set of influential SNPs is correlated with a particular complex trait and is important to the prediction of the trait. Efficiently and accurately selecting these influential SNPs from millions of candidates is in high demand but remains challenging. We propose a backward elimination iterative distance correlation (BE-IDC) procedure to select the smallest subset of SNPs that guarantees sufficient prediction accuracy, while also resolving the unclear-threshold issue of traditional feature screening approaches.

Results: Verified through six simulations, the adaptive threshold estimated by BE-IDC performed uniformly better than the fixed threshold methods used in the current literature. We also applied BE-IDC to a genome-wide Arabidopsis thaliana data set. Out of 216,130 SNPs, BE-IDC selected four influential SNPs and confirmed the same FRIGIDA gene that was reported by two other traditional methods.

Conclusions: BE-IDC accommodates both the prediction accuracy and the computational speed that genomic selection demands.

Highlights

Although the dimension of the entire genome can be extremely large, only a parsimonious set of influential single nucleotide polymorphisms (SNPs) is correlated with a particular complex trait and is important to the prediction of the trait
We demonstrate that the backward elimination iterative distance correlation (BE-IDC) approach selects a very small set of SNPs for Arabidopsis thaliana data
Unless a very small number of SNPs is preferred to reduce experimental cost in breeding or disease diagnosis applications, we suggest choosing the threshold that minimizes the mean square prediction error (MSPE)

Summary

Introduction

Although the dimension of the entire genome can be extremely large, only a parsimonious set of influential single nucleotide polymorphisms (SNPs) is correlated with a particular complex trait and is important to the prediction of the trait. Genomic selection is improved by identifying a small subset of influential SNPs from high-dimensional genetic information to efficiently predict an individual’s phenotype [1,2,3,4,5]. Li et al. developed a distance correlation based sure independence feature screening (DC-SIS) strategy that defines an association strength measure for each feature based on its distance correlation with the phenotype [16]. DC-SIS theoretically satisfies the sure screening property: it ranks the features from the most important to the least important by decreasing distance correlation and filters out the majority of noise features, which have low values of the association strength measure.
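
The ranking step described above can be illustrated with a minimal sketch. It is not the authors' implementation and not the BE-IDC procedure. It only computes the standard sample distance correlation between each SNP column and the phenotype and sorts the SNPs by decreasing value. The function names, the genotype matrix `X` (rows are individuals, columns are SNPs), the phenotype vector `y`, and the cutoff `d` are all assumptions made for illustration.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distances of v, double-centered (rows, columns, grand mean)."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(v.shape[0], -1)  # treat a 1-D vector as n observations of one variable
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=2))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x and y; returns 0.0 if either is constant."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                            # squared sample distance covariance
    dvar = np.sqrt((a * a).mean() * (b * b).mean())   # product of distance standard deviations
    if dvar == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / dvar))


def rank_by_distance_correlation(X, y):
    """Return (order, scores): SNP indices sorted by decreasing distance correlation with y."""
    X = np.asarray(X, dtype=float)
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)
    return order, scores


if __name__ == "__main__":
    # Hypothetical toy data: 200 individuals, 50 SNPs coded 0/1/2; only SNP 3 affects y.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 50)).astype(float)
    y = 2.0 * X[:, 3] + rng.normal(size=200)
    order, scores = rank_by_distance_correlation(X, y)
    d = 5  # hypothetical fixed cutoff; choosing it well is the threshold problem
    print("top-ranked SNP indices:", order[:d])
```

Keeping the top `d` SNPs from `order` corresponds to a fixed-threshold screen. How large `d` should be is exactly the choice that the adaptive threshold is meant to replace.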

Objectives

Methods

Results

Journal: BMC Bioinformatics	Publication Date: Apr 12, 2017
Citations: 4	License type: open-access
