Abstract

Background

In the analysis of large-scale genomic datasets, an important consideration is the power of analytical methods to identify accurate predictive models of disease. When assessing the sensitivity of such methods, a confounding factor up to this point has been the presence of linkage disequilibrium (LD). In this study, we examined the effect of LD on the sensitivity of the Multifactor Dimensionality Reduction (MDR) software package.

Results

Four relative amounts of LD were simulated in multiple one- and two-locus scenarios in which the position of the functional SNP(s) within LD blocks varied. The simulated data were analyzed with MDR to determine the sensitivity of the method in different contexts, where sensitivity was gauged as the number of times out of 100 that the method identified the correct one- or two-locus model as the best overall model. As the amount of LD increases, the sensitivity of MDR to detect the correct functional SNP drops, but the sensitivity to detect the disease signal and find an indirect association increases.

Conclusions

Higher levels of LD begin to confound the MDR algorithm and lead to a drop in sensitivity with respect to the identification of a direct association; they do not, however, affect the ability to detect an indirect association. Careful examination of the solution models generated by MDR reveals that MDR can identify loci in the correct LD block, though not always the functional SNP itself. As such, the results of MDR analysis in datasets with LD should be examined carefully in light of the underlying LD structure of the dataset.

Highlights

  • Linkage disequilibrium (LD) is defined as the nonrandom association of alleles at two or more loci [1]

  • The inaccuracies that detracted from the sensitivity scores in Multifactor Dimensionality Reduction (MDR) were due to two-locus models being chosen in place of a one-locus model; such a result was not counted towards detection sensitivity even if the functional locus was in the chosen model

  • In drawing conclusions from the research presented in this paper, we wish to make recommendations about the future use of MDR in performing gene-gene interaction analysis in data with significant amounts of LD among the single nucleotide polymorphisms (SNPs)
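The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `score_runs`, the SNP labels, and the way an "indirect" hit is counted (any SNP in the chosen model lying in the functional SNP's LD block) are assumptions made for the example. A direct hit requires the best model to match the functional model exactly, so a two-locus model that contains the functional SNP does not count as a direct hit.

```python
def score_runs(best_models, functional_model, ld_block):
    """Count direct and indirect detections over a set of simulated datasets.

    best_models      -- one tuple of SNP names per dataset (the best overall model)
    functional_model -- tuple of SNP names that truly carry the effect
    ld_block         -- SNP names in the functional SNP's LD block (assumed known)
    """
    target = frozenset(functional_model)
    block = set(ld_block)

    # Direct: the chosen model is exactly the functional model.
    direct = sum(1 for model in best_models if frozenset(model) == target)

    # Indirect (assumed definition): at least one chosen SNP lies in the LD block.
    indirect = sum(1 for model in best_models if any(snp in block for snp in model))

    return direct, indirect


# Hypothetical results from four simulated datasets.
runs = [("snp5",), ("snp4",), ("snp5", "snp9"), ("snp12",)]
direct_hits, indirect_hits = score_runs(
    runs,
    functional_model=("snp5",),
    ld_block=["snp3", "snp4", "snp5", "snp6"],
)
```

In this toy input only the first run is a direct hit, while the first three runs all point to the correct LD block. With 100 simulated datasets per scenario, as in the study, the direct count would correspond to the reported sensitivity.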


Summary

Introduction

Linkage disequilibrium (LD) is defined as the nonrandom association of alleles at two or more loci [1]. The problem that LD poses in genomic data, and its ability to confound analysis, is illustrated by the human leukocyte antigen (HLA) locus, which was at the heart of several early spurious associations with susceptibility to immunological and infectious diseases as a result of 3 cM of high LD around the locus [2]. This example shows how long-range LD can confound analysis methods in their attempt to precisely identify loci associated with risk for disease. We examined the effect of LD on the sensitivity of the Multifactor Dimensionality Reduction (MDR) software package.
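To make "nonrandom association" concrete, the sketch below computes three standard pairwise LD measures for two biallelic loci: the coefficient D = p_AB - p_A * p_B, its normalized form D', and the squared correlation r^2. These are textbook quantities; the source does not state which LD measure was used in the simulations, so the function name `ld_stats` and the example frequencies are assumptions chosen for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    # D is zero when alleles at the two loci associate at random.
    d = p_ab - p_a * p_b

    # D' rescales |D| by the largest magnitude possible for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the allele indicators at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0

    return d, d_prime, r2


# Random association: haplotype frequency equals the product of allele frequencies.
no_ld = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)

# Complete association: allele A always travels with allele B.
full_ld = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
```

When r^2 is close to 1, a non-functional SNP carries nearly the same information as the functional one, which is why an analysis method may pick a neighbour in the same LD block instead of the functional SNP.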

