Clustering and classification based on the L1 data depth

R Jornsten

doi:10.1016/s0047-259x(04)00027-2

Abstract

Clustering and classification are important tasks for the analysis of microarray gene expression data. Classification of tissue samples can be a valuable diagnostic tool for diseases such as cancer. Clustering samples or experiments may lead to the discovery of subclasses of diseases. Clustering genes can help identify groups of genes that respond similarly to a set of experimental conditions. We also need validation tools for clustering and classification. Here, we focus on the identification of outliers: units that may have been misallocated or mislabeled, or that are not representative of the classes or clusters. We present two new methods, DDclust for clustering and DDclass for classification. These non-parametric methods are based on the intuitively simple concept of data depth. We apply the methods to several gene expression and simulated data sets. We also discuss a convenient visualization and validation tool, the relative data depth plot.
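
To make the notion of data depth concrete, the following is a minimal sketch of one common formulation of the L1 depth (the Vardi and Zhang form with equal observation weights): the depth of a point y is 1 minus the length of the average unit vector pointing from y to the observations, with a correction when y coincides with observations. The abstract does not state the formula, so the exact definition, the function name `l1_depth`, and the `tol` parameter are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def l1_depth(y, X, tol=1e-12):
    """L1 depth of point y relative to the rows of X (equal weights assumed).

    Hypothetical helper for illustration; not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    diffs = X - y                            # vectors from y to each observation
    norms = np.linalg.norm(diffs, axis=1)
    coincident = norms <= tol                # observations sitting exactly at y

    # Average of unit vectors over observations distinct from y,
    # each carrying weight 1/n.
    units = diffs[~coincident] / norms[~coincident, None]
    e_bar = units.sum(axis=0) / n

    # Fraction of the sample located at y itself.
    mass_at_y = coincident.sum() / n

    # Depth is near 1 for central points and near 0 for remote points.
    return 1.0 - max(0.0, np.linalg.norm(e_bar) - mass_at_y)


if __name__ == "__main__":
    # Four points placed symmetrically around the origin.
    X = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    print(l1_depth([0.0, 0.0], X))    # central point: unit vectors cancel
    print(l1_depth([100.0, 0.0], X))  # remote point: unit vectors align
```

Under this definition a point surrounded evenly by the data has depth close to 1, while a point far outside the data cloud has depth close to 0, which is what makes depth usable for flagging outlying or poorly represented units.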

Journal: Journal of Multivariate Analysis
Publication Date: Jul 1, 2004
Citations: 1

Similar Papers

Clustering and classification based on the L1 data depth
Rebecka Jörnsten
Journal of Multivariate Analysis | VOL. 90
16 Apr 2004

Comparative Analysis of Different Label-Free Mass Spectrometry Based Protein Abundance Estimates and Their Correlation with RNA-Seq Gene Expression Data
Kang Ning ... Damian Fermin
Journal of Proteome Research | VOL. 11
29 Feb 2012

EXP-PAC: Providing comparative analysis and storage of next generation gene expression data
Philip C Church ... Christophe Lefèvre
Genomics | VOL. 100
15 May 2012

An Integrative Analysis of Network Motifs and Gene Expression Data to Discover Experimentally Testable Transcription Factor-miRNA-Gene Regulatory Loops In Multiple Myeloma
Samir B Amin ... Cheng Li
Blood | VOL. 116
19 Nov 2010
