Abstract
Background: Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Performing association studies on pooled DNA samples can provide greater power for a given cost. For quantitative traits, the effect of an SNP is measured in the units of the trait. Here we propose and demonstrate a method to estimate SNP effects on quantitative traits from pooled DNA data.

Methods: To obtain estimates of SNP effects from pooled DNA samples, we used logistic regression of estimated allele frequencies in pools on phenotype. The method was tested on a simulated dataset and on a beef cattle dataset, using a model that included principal components from a genomic correlation matrix derived from the allele frequencies estimated from the pooled samples. The performance of the obtained estimates was evaluated by comparison with estimates obtained using regression of phenotype on genotype from individual samples of DNA.

Results: For the simulated data, the estimates of SNP effects from pooled DNA are similar to, but asymptotically different from, those from individual DNA data. Error in estimating allele frequencies had a large effect on the accuracy of estimated SNP effects. For the beef cattle dataset, the principal components of the genomic correlation matrix from pooled DNA were consistent with known breed groups, and could be used to account for population stratification. Correctly modeling the contemporary group structure was essential to achieve estimates similar to those from individual DNA data, and pooling DNA from individuals within groups was superior to pooling DNA across groups. For a fixed number of assays, pooled DNA samples produced results that were more correlated with results from individual genotyping data than were results from one random individual assayed from each pool.

Conclusions: Use of logistic regression of allele frequency on phenotype makes it possible to estimate SNP effects on quantitative traits from pooled DNA samples. With pooled DNA samples, genotyping costs are reduced, and in cases where trait records are abundant this approach is a promising way to obtain SNP associations for marker-assisted selection.
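The use of principal components from a matrix built on pooled allele frequencies can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how the genomic correlation matrix was constructed, so the version below simply assumes a pool-by-pool correlation of SNP-centred allele frequencies. The function name `pool_principal_components` and the two-breed simulated data are hypothetical.

```python
import numpy as np

def pool_principal_components(freqs, n_components=2):
    """Leading principal components of a pool-by-pool correlation matrix.

    freqs: array of shape (n_pools, n_snps) holding estimated allele
    frequencies. The matrix definition used here is an assumption.
    """
    freqs = np.asarray(freqs, dtype=float)
    # Centre each SNP on its mean frequency across pools.
    centred = freqs - freqs.mean(axis=0)
    # Correlation between pools, computed across SNPs (rows are pools).
    corr = np.corrcoef(centred)
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]

# Hypothetical data: 10 pools from each of two breeds, 500 SNPs.
rng = np.random.default_rng(1)
n_snps, pools_per_breed = 500, 10
breed_freqs = rng.uniform(0.1, 0.9, size=(2, n_snps))
true_freqs = np.repeat(breed_freqs, pools_per_breed, axis=0)
# Add a small error to mimic imperfect allele frequency estimation.
noise = rng.normal(0.0, 0.02, size=true_freqs.shape)
freqs = np.clip(true_freqs + noise, 0.0, 1.0)

eigvals, pcs = pool_principal_components(freqs)
breed = np.repeat([0, 1], pools_per_breed)
for b, pc1 in zip(breed, pcs[:, 0]):
    print(f"breed {b}: PC1 = {pc1:+.3f}")
```

In this toy setting the first component takes opposite signs for the two simulated breeds, which is the kind of pattern that would let the components serve as covariates for population stratification.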
Highlights
Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped
For pooled data, we evaluated the logistic regression (LR) of genotype on phenotype using the significance of estimated single nucleotide polymorphism (SNP) effects (expressed as minus log(p-value), abbreviated here by MLP) and the magnitude of the estimated SNP effect a, estimated using equation (1)
Where the data are referred to as “pooled” it is implicit that the analysis method is LR, even when the number of pools is equal to the number of individuals
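The logistic regression of allele frequency on phenotype, and the MLP statistic, can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: pools are formed here by sorting simulated individuals on phenotype, each pool's allele count is treated as binomial, the slope is tested with a Wald test, and MLP is taken as minus log10 of the p-value (the base of the logarithm is an assumption). The function name `logistic_regression_pools` and all simulation parameters are hypothetical. Converting the slope into the SNP effect a in trait units relies on equation (1), which is not shown here and is therefore not attempted.

```python
import math
import numpy as np

def logistic_regression_pools(freq, phenotype, n_alleles, n_iter=50):
    """Fit logit(E[freq]) = b0 + b1 * phenotype by Newton-Raphson.

    freq: allele frequency per pool; phenotype: mean phenotype per pool;
    n_alleles: alleles per pool (2 * pool size), used as binomial weight.
    Returns the slope b1, its standard error, and MLP = -log10(p).
    """
    freq = np.asarray(freq, dtype=float)
    x = np.asarray(phenotype, dtype=float)
    n = np.asarray(n_alleles, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)

    def gradient_and_hessian(b):
        mu = 1.0 / (1.0 + np.exp(-(X @ b)))
        grad = X.T @ (n * (freq - mu))
        hess = X.T @ (X * (n * mu * (1.0 - mu))[:, None])
        return grad, hess

    for _ in range(n_iter):
        grad, hess = gradient_and_hessian(beta)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break

    # Wald test on the slope, using the information matrix at the estimate.
    _, hess = gradient_and_hessian(beta)
    se = math.sqrt(np.linalg.inv(hess)[1, 1])
    z = beta[1] / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    mlp = -math.log10(max(p_value, np.finfo(float).tiny))
    return beta[1], se, mlp

# Hypothetical simulation: one SNP with an additive effect on the trait.
rng = np.random.default_rng(0)
n_ind, pool_size, allele_freq, effect = 2000, 50, 0.3, 0.5
genotype = rng.binomial(2, allele_freq, size=n_ind)
phenotype = effect * genotype + rng.normal(size=n_ind)

# Assumed pooling scheme: sort on phenotype, then cut into equal pools.
order = np.argsort(phenotype)
pool_freq = genotype[order].reshape(-1, pool_size).mean(axis=1) / 2.0
pool_mean = phenotype[order].reshape(-1, pool_size).mean(axis=1)

slope, se, mlp = logistic_regression_pools(pool_freq, pool_mean, 2 * pool_size)
print(f"pooled: slope={slope:.3f}, se={se:.3f}, MLP={mlp:.1f}")

# Pools of size one: the same analysis applied to individual genotypes.
slope1, se1, mlp1 = logistic_regression_pools(genotype / 2.0, phenotype, 2)
print(f"individual: slope={slope1:.3f}, se={se1:.3f}, MLP={mlp1:.1f}")
```

The second call shows the case where the number of pools equals the number of individuals: each "pool" then has an allele frequency of 0, 0.5 or 1 and carries two alleles, and the same function applies unchanged.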
Summary
Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Recent studies on quantitative traits in humans and livestock suggest that, for many traits, genetic variation is mainly due to a large number of regions in the genome, each having a small effect [1,2]. This has led to the conclusion that much larger numbers. Marker effects on the quantitative trait can be estimated from the allele frequencies in the selected DNA pools [9]