FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data.

Limin Fu, Enzo Medico

doi:10.1186/1471-2105-8-3

Open Access

Journal: BMC Bioinformatics
Publication Date: Jan 4, 2007
Citations: 484
License type: CC BY 2.0

Affiliation: University of Turin, Candiolo Cancer Institute

Abstract

Background: Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. To this aim, existing clustering approaches, mainly developed in computer science, have been adapted to microarray data analysis. However, previous studies revealed that microarray datasets have very diverse structures, some of which may not be correctly captured by current clustering methods. We therefore approached the problem from a new starting point, and developed a clustering algorithm designed to capture dataset-specific structures at the beginning of the process.

Results: The clustering algorithm is named Fuzzy clustering by Local Approximation of MEmbership (FLAME). Distinctive elements of FLAME are: (i) definition of the neighborhood of each object (gene or sample) and identification of objects with "archetypal" features named Cluster Supporting Objects, around which to construct the clusters; (ii) assignment to each object of a fuzzy membership vector approximated from the memberships of its neighboring objects, by an iterative converging process in which membership spreads from the Cluster Supporting Objects through their neighbors. Comparative analysis with K-means, hierarchical, fuzzy C-means and fuzzy self-organizing maps (SOM) showed that data partitions generated by FLAME are not superimposable to those of other methods and, although different types of datasets are better partitioned by different algorithms, FLAME displays the best overall performance. FLAME is implemented, together with all the above-mentioned algorithms, in a C++ software with graphical interface for Linux and Windows, capable of handling very large datasets, named Gene Expression Data Analysis Studio (GEDAS), freely available under GNU General Public License.

Conclusion: The FLAME algorithm has intrinsic advantages, such as the ability to capture non-linear relationships and non-globular clusters, the automated definition of the number of clusters, and the identification of cluster outliers, i.e. genes that are not assigned to any cluster. As a result, clusters are more internally homogeneous and more diverse from each other, and provide better partitioning of biological functions. The clustering algorithm can be easily extended to applications different from gene expression analysis.
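Element (i) of the abstract, neighborhood definition and identification of Cluster Supporting Objects, can be pictured with the minimal Python sketch below. The abstract does not give the exact formulas, so several choices here are assumptions made only for illustration: Euclidean distance, a density defined as the inverse of the mean distance to the k nearest neighbors, a quantile-based outlier threshold, and the hypothetical names `knn`, `find_csos_and_outliers` and `outlier_quantile`. It is not the GEDAS implementation.

```python
import numpy as np

def knn(X, k):
    """Return indices and distances of the k nearest neighbors of each row."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # assumption: Euclidean distance
    np.fill_diagonal(dist, np.inf)             # an object is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)

def find_csos_and_outliers(X, k=3, outlier_quantile=0.05):
    """Flag 'archetypal' objects (local density maxima) and low-density outliers."""
    idx, knn_dist = knn(X, k)
    # Assumption: density is the inverse of the mean distance to the k neighbors.
    density = 1.0 / (knn_dist.mean(axis=1) + 1e-12)
    neighbor_density = density[idx]
    # A CSO is denser than every object in its own neighborhood.
    is_cso = density > neighbor_density.max(axis=1)
    # Assumption: an outlier is below a global threshold and sparser than all neighbors.
    threshold = np.quantile(density, outlier_quantile)
    is_outlier = (density < threshold) & (density < neighbor_density.min(axis=1))
    return np.flatnonzero(is_cso), np.flatnonzero(is_outlier)

# Toy data: two small cross-shaped groups and one isolated point.
cross = np.array([[0, 0], [0.1, 0], [-0.1, 0], [0, 0.1], [0, -0.1]])
X = np.vstack([cross, cross + 5.0, [[20.0, 20.0]]])
csos, outliers = find_csos_and_outliers(X, k=3)
print("CSO indices:", csos, "outlier indices:", outliers)
```

In this toy dataset the center of each cross is the densest object in its neighborhood, so the number of clusters follows from the data rather than from a user-supplied parameter, which is the property the abstract calls automated definition of the number of clusters.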

Highlights

Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays.
The first step of the algorithm is the extraction of local structure information and the identification of Cluster Supporting Objects (CSOs).
Each object is assigned equal membership to all clusters, with the exception of CSOs and outlier objects: each CSO is assigned full membership to itself as a cluster, and all outlier objects are assigned full membership to the outlier group.
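The membership initialization described in the last highlight, followed by the iterative spreading of membership through neighbors mentioned in the abstract, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the function name `propagate_memberships`, the precomputed `neighbors` and `weights` arrays (each row of `weights` summing to 1), the synchronous update and the fixed iteration count are all hypothetical choices.

```python
import numpy as np

def propagate_memberships(neighbors, weights, csos, outliers, n_iter=200):
    """Initialize fuzzy memberships and spread them through neighbor lists.

    neighbors: (n, k) integer array of neighbor indices for each object.
    weights:   (n, k) array of neighbor weights; each row sums to 1 (assumption).
    """
    n = neighbors.shape[0]
    n_groups = len(csos) + 1                    # one cluster per CSO + outlier group
    U = np.full((n, n_groups), 1.0 / n_groups)  # equal membership to all groups
    fixed = np.zeros(n, dtype=bool)
    for c, i in enumerate(csos):                # each CSO fully belongs to its own cluster
        U[i] = 0.0
        U[i, c] = 1.0
        fixed[i] = True
    for i in outliers:                          # outliers fully belong to the outlier group
        U[i] = 0.0
        U[i, -1] = 1.0
        fixed[i] = True
    for _ in range(n_iter):
        # Each membership vector becomes the weighted average of its neighbors' vectors.
        new_U = np.einsum("ik,ikc->ic", weights, U[neighbors])
        new_U[fixed] = U[fixed]                 # CSOs and outliers keep their memberships
        U = new_U
    return U

# Toy example: five objects on a line, with the two end objects acting as CSOs.
neighbors = np.array([[1, 2], [0, 2], [1, 3], [2, 4], [3, 2]])
weights = np.full((5, 2), 0.5)
U = propagate_memberships(neighbors, weights, csos=[0, 4], outliers=[])
print(np.round(U, 3))
```

In this toy chain the middle object ends up with membership split evenly between the two clusters, while objects closer to one CSO lean toward that cluster, which illustrates how fuzzy memberships grade smoothly between cluster cores.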

Summary

Introduction

Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. Since the work of Eisen and colleagues [1], clustering methods have become a key step in microarray data analysis, used to identify groups of genes or samples displaying a similar expression profile. None of the existing clustering algorithms performs significantly better than the others when tested across multiple datasets [4,5,6]. Commonly used algorithms, such as k-means, hierarchical clustering and Self-Organizing Maps (SOM) [7], typically construct clusters on the basis of pairwise distance between genes. As a consequence, they may fail to reveal nonlinear relationships between gene expression profiles, and thereby fail to correctly represent a dataset with nonlinear structure [8]. Although hierarchical clustering remains the most widely used clustering algorithm, it has been described to suffer from a number of limitations, mostly deriving from its local decision-making scheme, which joins the two closest genes or clusters without considering the data as a whole [12].
