PACK: Profile Analysis using Clustering and Kurtosis to find molecular classifiers in cancer

Andrew E Teschendorff, Carlos Caldas, Ali Naderi, Nuno L Barbosa-Morais

doi:10.1093/bioinformatics/btl174

Open Access

https://doi.org/10.1093/bioinformatics/btl174

Journal: Bioinformatics	Publication Date: May 8, 2006
Citations: 86	License type: public-domain

Affiliation: University of Cambridge, University of Lisbon

Abstract

Elucidating the molecular taxonomy of cancers and finding biological and clinical markers from microarray experiments is problematic because of the large number of variables being measured. Feature selection methods that can identify relevant classifiers, or that can remove likely false positives prior to supervised analysis, are therefore desirable. We present a novel feature selection procedure based on a mixture model and a non-Gaussianity measure of a gene's expression profile. The method can be used to find genes that define either small outlier subgroups or major subdivisions, depending on the sign of the kurtosis. It can also be used as a filtering step prior to supervised analysis in order to reduce the false discovery rate. We validate our methodology on six independent datasets by rediscovering major classifiers in ER-negative and ER-positive breast cancer and in prostate cancer. Furthermore, our method finds two novel subtypes within the basal subgroup of ER-negative breast tumours, associated with apoptotic and immune response functions respectively, and with statistically different clinical outcome.

Availability: An R function, pack, that implements the methods used here has been added to vabayelMix, available from CRAN (www.cran.r-project.org).

Contact: aet21@cam.ac.uk

Supplementary information is available at Bioinformatics online.
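The role of the kurtosis sign can be illustrated with a small simulation. The sketch below is not the pack function from vabayelMix and omits the mixture-model step entirely; it only computes the sample excess kurtosis of two simulated expression profiles. All names (excess_kurtosis, simulate_profile, the profile labels) and all parameter values are assumptions made for illustration. It relies on a general statistical property: a profile split into two groups of similar size tends to have negative excess kurtosis, while a profile with a small shifted subgroup tends to have positive excess kurtosis.

```python
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: fourth standardized moment minus 3 (0 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


def simulate_profile(n_samples, subgroup_fraction, shift, rng):
    """Hypothetical gene profile: a fraction of samples has its mean shifted by `shift`."""
    n_sub = round(n_samples * subgroup_fraction)
    shifted = [rng.gauss(shift, 1.0) for _ in range(n_sub)]
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n_samples - n_sub)]
    return shifted + baseline


rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Illustrative settings only; they are not taken from the paper.
profiles = {
    "balanced_split": simulate_profile(500, 0.5, 4.0, rng),
    "small_outlier_group": simulate_profile(500, 0.05, 5.0, rng),
}

kurtosis_by_gene = {name: excess_kurtosis(p) for name, p in profiles.items()}

for name, k in kurtosis_by_gene.items():
    kind = "major subdivision" if k < 0 else "outlier subgroup"
    print(f"{name}: excess kurtosis {k:+.2f} -> candidate {kind}")
```

Ranking genes by the magnitude of such a score, separately for each sign, is one plausible way to build a filter ahead of supervised analysis. The published procedure additionally uses a mixture model, which this sketch does not attempt to reproduce.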
