Abstract

Class discovery from gene expression data is an important task in cancer diagnosis. In this paper, we present a new framework for class discovery that integrates the perturbation technique, the cluster ensemble approach, and a cluster validity index. Specifically, the framework first generates a set of perturbed datasets from the original microarray data. Then Neural Gas, which serves as the basic clustering algorithm, is applied to obtain partitions of the original dataset and of the perturbed datasets. Finally, a new cluster validity index called the disagreement/agreement (DA) index (DAI) is designed to identify the number of classes in the dataset by considering the difference between the partition obtained from the original dataset and the partitions obtained from the perturbed datasets. Experiments on three synthetic datasets and four cancer datasets show that 1) DAI successfully discovers the underlying structure in all the synthetic datasets and most of the cancer datasets, and 2) DAI outperforms most state-of-the-art cluster validity indexes when applied to gene expression data.
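
The three steps described above (perturb, cluster, compare) can be illustrated with a minimal sketch. It is not the DAI itself, whose definition is not given in this abstract. Several pieces are assumptions made only for illustration: perturbation is taken to be additive Gaussian noise, a small k-means with farthest-first initialization stands in for Neural Gas, and disagreement is measured as one minus the adjusted Rand index. The names `cluster`, `adjusted_rand`, and `disagreement_by_k` are hypothetical, and the data are toy 2-D blobs rather than microarray data.

```python
import numpy as np

def cluster(X, k, n_iter=100):
    """k-means with farthest-first init (a stand-in for Neural Gas)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def adjusted_rand(a, b):
    """Adjusted Rand index between two label vectors (1.0 = identical partitions)."""
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1)          # contingency table
    comb2 = lambda x: x * (x - 1) / 2.0    # number of pairs
    sum_ij = comb2(table).sum()
    sum_a = comb2(table.sum(axis=1)).sum()
    sum_b = comb2(table.sum(axis=0)).sum()
    expected = sum_a * sum_b / comb2(len(a))
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:              # degenerate case, e.g. one cluster
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def disagreement_by_k(X, ks, n_perturb=10, noise=0.1, seed=0):
    """Mean disagreement between the original and perturbed partitions per k."""
    rng = np.random.default_rng(seed)
    # Assumed perturbation scheme: additive Gaussian noise.
    perturbed = [X + rng.normal(0.0, noise, X.shape) for _ in range(n_perturb)]
    scores = {}
    for k in ks:
        base = cluster(X, k)
        scores[k] = float(np.mean(
            [1.0 - adjusted_rand(base, cluster(P, k)) for P in perturbed]))
    return scores

# Toy data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(0.0, 0.5, (20, 2)) for c in blob_centers])
scores = disagreement_by_k(X, ks=range(2, 6))
```

A low score for a given k means the partition of the original data is reproduced on the perturbed copies. This simple stand-in can produce ties across several values of k, so it should be read only as an illustration of the compare-against-perturbations idea, not as a replacement for the index proposed in the paper.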

