Bayesian bi-clustering methods with applications in computational biology

Han Yan, Jiexing Wu, Yang Li, Jun S Liu

doi:10.1214/22-aoas1622

Abstract

Bi-clustering is a useful approach for analyzing large biological datasets in which the observations come from heterogeneous groups and have a large number of features. We outline a general Bayesian approach to bi-clustering problems in moderate to high dimensions and propose three Bayesian bi-clustering models for categorical data, which increase in complexity in how they model the distributions of features across bi-clusters. Our proposed methods apply to a wide range of scenarios: from situations where clusters are distinguishable only among a small subset of features and are masked by a large amount of noise, to situations where different groups of data are identified by different sets of features or where the data exhibit hierarchical structures. Through simulation studies, we show that our methods outperform existing (bi-)clustering methods in both identifying clusters and recovering feature distributional patterns across bi-clusters. We further apply the developed approaches to a human genetic dataset, a human single-cell genomic dataset, and a collection of 1774 mouse genomic datasets with a focus on 58 genes from two pathways.

Journal: The Annals of Applied Statistics
Publication Date: Dec 1, 2022
Citations: 1

Similar Papers

Lecture on Progress toward Petascale Applications in Bioinformatics and Computational Biology
C.A Stewart ... D Bader
01 Oct 2007

Outlier detection using flexible categorization and interrogative agendas
Marcel Boersma ... Nachoem Wijnberg
Decision Support Systems | VOL. 180
19 Feb 2024

A Divide and Conquer Feature Reduction and Feature Selection Algorithm in KDD Intrusion Detection Dataset
A Das ... R.B Nayak
01 Jan 2012

Computational biology on parallel computers
Srinivas Aluru
01 Sep 2003
