Bayes multiple decision functions.

Wensong Wu, Edsel A Peña

doi:10.1214/13-ejs813

Abstract

This paper deals with the problem of simultaneously making many (M) binary decisions based on one realization of a random data matrix X. M is typically large, and X will usually have M rows, one for each of the M decisions to be made, although the data in each row may be low dimensional. Such problems arise in many practical areas: in the biological and medical sciences, where the available dataset comes from microarrays or other high-throughput technology and the goal is to decide which among many genes are relevant with respect to some phenotype of interest; in the engineering and reliability sciences; in astronomy; in education; and in business. A Bayesian decision-theoretic approach to this problem is implemented, with the overall loss function being a cost-weighted linear combination of Type I and Type II loss functions. The class of loss functions considered allows the use of the false discovery rate (FDR), false nondiscovery rate (FNR), and missed discovery rate (MDR) in assessing the quality of the decisions. Through this Bayesian paradigm, the Bayes multiple decision function (BMDF) is derived and an efficient algorithm to obtain the optimal Bayes action is described. In contrast to many works in the literature, where the rows of the matrix X are assumed to be stochastically independent, we allow a dependent data structure, with the associations obtained through a class of frailty-induced Archimedean copulas. In particular, non-Gaussian dependent data structures, which are typical of failure-time data, can be entertained. The numerical determination of the Bayes optimal action is facilitated through sequential Monte Carlo techniques. The theory developed could also be extended to the problems of multiple hypothesis testing, multiple classification and prediction, and high-dimensional variable selection.
The proposed procedure is illustrated through simulation studies for the simple versus simple hypotheses setting and for the composite hypotheses setting. The procedure is also applied to a subset of a microarray data set from a colon cancer study.
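
The frailty-induced Archimedean copula dependence mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a gamma-distributed shared frailty, which yields the Clayton copula (one member of the Archimedean family), and uses the standard Marshall-Olkin construction. The function name `sample_clayton_frailty` and the parameter `theta` are hypothetical names chosen for this illustration.

```python
import numpy as np


def sample_clayton_frailty(n_rows, n_draws, theta, rng):
    """Draw dependent uniforms whose copula is Clayton with parameter theta > 0.

    Assumption for illustration: a gamma shared frailty. Each draw has one
    frailty V shared by all n_rows coordinates, which induces the dependence.
    """
    # Shared frailty V ~ Gamma(shape=1/theta, scale=1), one per draw.
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n_draws, 1))
    # Independent unit exponentials, one per coordinate.
    e = rng.exponential(scale=1.0, size=(n_draws, n_rows))
    # Apply the Laplace transform of V, psi(s) = (1 + s)^(-1/theta), to E / V.
    return (1.0 + e / v) ** (-1.0 / theta)


rng = np.random.default_rng(0)
u = sample_clayton_frailty(n_rows=5, n_draws=10_000, theta=2.0, rng=rng)

# Map the uniforms to failure times with (assumed) unit-rate exponential margins.
failure_times = -np.log(u)
```

Each column of `u` is marginally uniform on (0, 1), while columns are positively associated through the shared frailty; larger `theta` gives stronger dependence, and the resulting failure times are dependent but non-Gaussian.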

Highlights

The advent of computer-automated high-throughput data-gathering technology, epitomized by the microarray, has led to the generation of so-called “large M, small n” data sets, which are those characterized by a large number, M, of variables, which are observed or measured on a relatively small number, n, of subjects or units.
The Bayes multiple decision function (BMDF) defines a class of multiple decision procedures that achieve optimality in the Bayesian framework associated with a general class of loss functions.
The results in Theorem 1 describe the form of the BMDF and provide an efficient algorithm for finding it in multiple testing settings.

Summary

INTRODUCTION

The advent of computer-automated high-throughput data-gathering technology, epitomized by the microarray, has led to the generation of so-called “large M, small n” data sets, which are those characterized by a large number, M, of variables (hereafter called genes for historical reasons), which are observed or measured on a relatively small number, n, of subjects or units. In addition to this decision-theoretic framework, we implement a Bayesian approach to decision-making by putting a prior probability on the unknown state of reality θ. Another major contribution is an efficient algorithm for computing the Bayes multiple decision action, an algorithm that has computational order of at most O(M^2 log M).
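
To make the idea of a Bayes action under a cost-weighted loss concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes the loss c1·FDP + c2·FNP with the common definitions FDP = FP / max(R, 1) and FNP = FN / max(M − R, 1), where R is the number of discoveries, and it assumes the posterior probabilities P(θ_m = 1 | X) are already available. Under these assumptions, for a fixed R the posterior expected loss is minimized by flagging the R coordinates with the largest posterior probabilities, so a sort followed by a scan over R suffices. All function and parameter names are hypothetical.

```python
import numpy as np


def posterior_expected_loss(action, post, c1=1.0, c2=1.0):
    """Posterior expected c1*FDP + c2*FNP for a 0/1 action vector.

    post[m] is the (assumed known) posterior probability that theta_m = 1.
    """
    action = np.asarray(action, dtype=bool)
    post = np.asarray(post, dtype=float)
    r = int(action.sum())                # number of discoveries R
    m = action.size
    exp_fp = np.sum(1.0 - post[action])  # expected false positives
    exp_fn = np.sum(post[~action])       # expected false negatives
    return c1 * exp_fp / max(r, 1) + c2 * exp_fn / max(m - r, 1)


def bayes_action_fdp_fnp(post, c1=1.0, c2=1.0):
    """Return (action, loss) minimizing the posterior expected loss above."""
    post = np.asarray(post, dtype=float)
    m = post.size
    order = np.argsort(-post)            # indices by decreasing posterior
    # csum[k] = sum of the k largest posterior probabilities (csum[0] = 0).
    csum = np.concatenate(([0.0], np.cumsum(post[order])))
    k = np.arange(m + 1)
    exp_fp = k - csum                    # expected FP when flagging the top k
    exp_fn = csum[-1] - csum             # expected FN among the rest
    losses = c1 * exp_fp / np.maximum(k, 1) + c2 * exp_fn / np.maximum(m - k, 1)
    best_k = int(np.argmin(losses))
    action = np.zeros(m, dtype=bool)
    action[order[:best_k]] = True
    return action, float(losses[best_k])


# Usage with made-up posterior probabilities for M = 6 decisions.
example_post = [0.95, 0.10, 0.80, 0.30, 0.55, 0.02]
example_action, example_loss = bayes_action_fdp_fnp(example_post, c1=1.0, c2=2.0)
```

The sort dominates the cost here, so this special case runs in O(M log M). Losses whose denominator depends on the unknown θ, such as the missed discovery proportion, are not linear in the marginal posterior probabilities and are not covered by this sketch.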

Multiple Decision Problem

Multiple Testing Problems

FP and FN Loss Functions

FDP and FNP Loss Functions

FDP and MDP Loss Functions

DEPENDENT DATA STRUCTURE

Simple Hypotheses

Composite Hypotheses

Journal: Electronic Journal of Statistics
Publication Date: Jan 1, 2013
Citations: 10
License type: cc-by

Similar Papers

False discovery and false nondiscovery rates in single-step multiple testing procedures
Sanat K Sarkar
The Annals of Statistics | VOL. 34
01 Feb 2006

Optimal False Discovery Rate Control with Kernel Density Estimation in a Microarray Experiment
Moonsu Kang
Communications in Statistics - Simulation and Computation | VOL. 45
30 Oct 2015

The false discovery rate for statistical pattern recognition
Clayton Scott ... Rebecca Willett
Electronic Journal of Statistics | VOL. 3
01 Jan 2009

Generalization Error Analysis for FDR Controlled Classification
Clayton Scott ... Rebecca Willett
01 Aug 2007
