Abstract

Methods of estimating the local false discovery rate (LFDR) have been applied to different types of datasets such as high-throughput biological data, diffusion tensor imaging (DTI), and genome-wide association (GWA) studies. We present a model for LFDR estimation that incorporates a covariate into each test. Incorporating covariates may improve the performance of testing procedures, because each covariate contains additional information based on the biological context of the corresponding test. This method provides different estimates depending on a tuning parameter. We estimate the optimal value of that parameter by choosing the one that minimizes the estimated LFDR resulting from the bias and variance in a bootstrap approach. This estimation method is called the adaptive reference class (ARC) method. In this study, we consider the performance of the ARC method under certain assumptions on the prior probability of each hypothesis test as a function of the covariate. We prove that, under these assumptions and assuming a large covariate effect, the ARC method has a mean squared error asymptotically no greater than that of the alternative method, which uses the entire set of hypotheses. In addition, we conduct a simulation study to evaluate the performance of the estimator associated with the ARC method for a finite number of hypotheses. Finally, we apply the proposed method to coronary artery disease (CAD) data taken from a GWA study and to DTI data.
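To make the role of the covariate concrete, the following minimal sketch computes the LFDR under the standard two-group model, LFDR(z) = π0 f0(z) / (π0 f0(z) + (1 − π0) f1(z)), with a prior null probability that depends on the covariate. It is an illustration only, not the ARC estimator described in the abstract: the densities are treated as known rather than estimated, and the function names, the N(3, 1) alternative, the cutoff 0.5, and the prior values 0.95 and 0.60 are all assumptions chosen for the example.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def step_prior(x, cutoff=0.5, pi0_low=0.95, pi0_high=0.60):
    """Hypothetical step-function prior null probability pi0(x).

    The cutoff and the two levels are illustrative assumptions,
    not values taken from the paper.
    """
    return pi0_low if x <= cutoff else pi0_high


def lfdr(z, pi0, alt_mean=3.0):
    """Two-group LFDR: posterior probability of the null given statistic z.

    Assumes a N(0, 1) null density and a N(alt_mean, 1) alternative density.
    """
    f0 = normal_pdf(z)
    f1 = normal_pdf(z, mean=alt_mean)
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)


z = 2.0

# Covariate-specific reference classes: the prior depends on the covariate.
lfdr_low = lfdr(z, step_prior(0.2))    # covariate below the cutoff
lfdr_high = lfdr(z, step_prior(0.8))   # covariate above the cutoff

# Single combined reference class: one prior for all tests, here the average
# of the two levels (assumes the covariate is evenly split around the cutoff).
lfdr_combined = lfdr(z, 0.5 * (0.95 + 0.60))

print(f"below cutoff: {lfdr_low:.3f}")
print(f"above cutoff: {lfdr_high:.3f}")
print(f"combined:     {lfdr_combined:.3f}")
```

For the same test statistic, the combined value falls between the two covariate-specific values, which is why ignoring an informative covariate can misstate the LFDR for both groups of tests.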

Highlights

  • Methods of estimating the local false discovery rate (LFDR) [1], which do not suffer from the bias inherent in estimating other false discovery rates [2], have been applied to various datasets such as high-throughput biological data, diffusion tensor imaging (DTI), and genome-wide association (GWA) studies [3,4,5]

  • We present an application of the adaptive reference class (ARC) method on both coronary artery disease (CAD) data and DTI data in Section 3.2 in order to demonstrate the practical importance of deciding between the ARC and combined reference class (CRC) methods

  • In the case where the prior probability π0(Xi) is the step function given in Eq (16), Theorem 1 states that the ARC method has a mean squared error (MSE) asymptotically no greater than that of the CRC method


Summary

Introduction

Methods of estimating the local false discovery rate (LFDR) [1], which do not suffer from the bias inherent in estimating other false discovery rates [2], have been applied to various datasets such as high-throughput biological data (e.g., gene expression, proteomics, and metabolomics), diffusion tensor imaging (DTI), and genome-wide association (GWA) studies [3,4,5]. In a GWA study, methods of estimating the LFDR are used to estimate the probability that a single nucleotide polymorphism (SNP) is associated with a disease.

The local false discovery rate estimated via a bootstrap solution to the reference class problem
