Signal Classification for the Integrative Analysis of Multiple Sequences of Large-Scale Multiple Tests

Dongdong Xiang, T. Tony Cai, Sihai Dave Zhao

doi:10.1111/rssb.12323

Abstract

Summary. The integrative analysis of multiple data sets is becoming increasingly important in many fields of research. When the same features are studied in several independent experiments, it can often be useful to analyse jointly the multiple sequences of multiple tests that result. It is frequently necessary to classify each feature into one of several categories, depending on the null and non-null configuration of its corresponding test statistics. The paper studies this signal classification problem, motivated by a range of applications in large-scale genomics. Two new types of misclassification rate are introduced, and two oracle procedures are developed to control each type while also achieving the largest expected number of correct classifications. Corresponding data-driven procedures are also proposed, proved to be asymptotically valid and optimal under certain conditions and shown in numerical experiments to be nearly as powerful as the oracle procedures. In an application to psychiatric genetics, the procedures proposed are used to discover genetic variants that may affect both bipolar disorder and schizophrenia, as well as variants that may help to distinguish between these conditions.

Journal: Journal of the Royal Statistical Society Series B: Statistical Methodology
Publication Date: May 20, 2019
Citations: 7

Similar Papers

Signal Classification in Large-Scale Multi-Sequence Integrative Analysis Under the HMM Dependence
Wendong Li ... Peihua Qiu
Technometrics | VOL. 66
16 Oct 2023

Hidden Likelihood Support in Genomic Data: Can Forty-Five Wrongs Make a Right?
John Gatesy ... Richard H Baker
Systematic Biology | VOL. 54
01 Jun 2005

Bipolar disorder and chromosome 18: an analysis of multiple data sets.
Judith A Badner ... Lynn R Goldin
Genetic epidemiology | VOL. 14
01 Jan 1997

Optimal Ascertainment Strategies to Detect Linkage to Common Disease Alleles
Miron Baron
The American Journal of Human Genetics | VOL. 64
01 Apr 1999
