Abstract

Data mining has become an important topic in the effective analysis of gene expression data because of its wide application in the biomedical industry. A gene cluster is a set of two or more genes that encode the same or similar products. Gene clustering is the process of grouping related genes into the same cluster, and it lies at the foundation of genomic studies that aim to analyse the function of genes. Several advanced techniques have been proposed for data clustering, and many of them have been applied to gene expression data with partial success. The goal of gene clustering is to identify important genes and to perform cluster discovery on samples. This paper reviews three of the most representative off-line clustering techniques: fuzzy C-means clustering, hierarchical clustering, and mixed C-means clustering. These techniques are implemented and tested on a brain tumour gene expression dataset. Their performance is compared using 'goodness of clustering' evaluation measures, and mixed C-means performs better than the other two techniques on the brain tumour gene expression data.
