Association Plots: visualizing cluster-specific associations in high-dimensional correspondence analysis biplots

Elzbieta Gralinska, Martin Vingron

doi:10.1093/jrsssc/qlad039

Abstract

In molecular biology, just as in many other fields of science, data often come in the form of matrices or contingency tables with many observations (rows) for a set of variables (columns). While projection methods like principal component analysis or correspondence analysis (CA) can be applied for obtaining an overview of such data, in cases where the matrix is very large the associated loss of information upon projection into two or three dimensions may be dramatic. However, when the set of variables can be grouped into clusters, this opens up a new angle on the data. We focus on the question of which observations are associated to a cluster and distinguish it from other clusters. CA employs a geometry geared towards answering this question. We exploit this feature in order to introduce Association Plots for visualizing cluster-specific observations in complex data. Regardless of the data matrix dimensionality, Association Plots are two-dimensional and depict the observations associated to a cluster of variables. We demonstrate our method on two small data sets and then use it to study a challenging genomic data set comprising >10,000 samples. We show that Association Plots can clearly highlight those observations which characterise a cluster of variables.
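
The abstract does not spell out how an Association Plot is built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It computes a standard correspondence analysis via the singular value decomposition of the standardized residuals, then assumes that each observation is placed by (x) the length of its projection onto the line through the origin and the centroid of a cluster of variables, and (y) its distance from that line, using all retained dimensions. The function names, the choice of principal coordinates for rows and standard coordinates for columns, and the toy matrix are hypothetical.

```python
import numpy as np

def correspondence_analysis(counts):
    """Standard CA of a non-negative matrix (assumes no all-zero rows/columns)."""
    P = counts / counts.sum()          # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sigma > 1e-12               # drop numerically zero dimensions
    U, sigma, Vt = U[:, keep], sigma[keep], Vt[keep]
    row_principal = (U / np.sqrt(r)[:, None]) * sigma   # rows: principal coords
    col_standard = Vt.T / np.sqrt(c)[:, None]           # columns: standard coords
    return row_principal, col_standard

def association_coords(row_coords, col_coords, cluster):
    """Assumed construction: project rows onto the cluster-centroid direction."""
    centroid = col_coords[cluster].mean(axis=0)
    direction = centroid / np.linalg.norm(centroid)
    x = row_coords @ direction                         # signed projection length
    residual = row_coords - np.outer(x, direction)
    y = np.linalg.norm(residual, axis=1)               # distance from the line
    return x, y

# Toy data (hypothetical): row 0 is enriched in columns 0-1, row 1 in columns 2-3.
counts = np.array([[20, 18, 1, 2],
                   [1, 2, 19, 21],
                   [10, 11, 9, 10]])
rows, cols = correspondence_analysis(counts)
x, y = association_coords(rows, cols, cluster=[0, 1])
print(np.column_stack([x, y]))
```

Under these assumptions, observations with a large x and a small y would be the ones most specifically associated with the chosen cluster, and the plot stays two-dimensional however many CA dimensions are retained.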

Journal: Journal of the Royal Statistical Society: Series C (Applied Statistics)
Publication Date: Jun 8, 2023
Citations: 3
License type: CC BY 4.0

Similar Papers

Fast Principal Component Analysis using Eigenspace Merging
Liang Liu ... Tieniu Tan
01 Jan 2007

A new imputation method for small software project data sets
Qinbao Song ... Martin Shepperd
The Journal of Systems & Software | VOL. 80
16 Jun 2006

Transfer learning-based fault location with small datasets in VSC-HVDC
Boyang Shang ... Jiaxin Hei
International Journal of Electrical Power and Energy Systems | VOL. 151
13 Apr 2023

Abstract LB396: The power of NetraAI: Precision medicine in oncology through sub-insight learning from small data sets
Bessi Qorri ... Joseph Geraci
American Journal of Cancer | VOL. 84
05 Apr 2024
