Abstract

Gene selection and sample classification based on gene expression data are important research areas in bioinformatics. Selecting the genes most closely related to classification is challenging because microarray data are high-dimensional and sample sizes are small. Neighborhood rough sets, an extension of rough set theory, have been applied successfully to gene selection because they can select attributes without redundancy and handle numerical attributes directly. However, computing approximations in rough sets is extremely time-consuming. To accelerate gene selection, this paper proposes a parallel method for computing the approximations of the intersection neighborhood rough set. Furthermore, a novel dynamic ensemble pruning approach, based on Affinity Propagation clustering and a dynamic pruning framework, is proposed to reduce memory usage and computational cost. Experimental results on three Arabidopsis thaliana biotic and abiotic stress response datasets demonstrate that the proposed method achieves better classification performance than an ensemble method with gene pre-selection.
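
To make the cost of computing approximations concrete, the following is a minimal sketch and not the method of the paper. It assumes that the intersection neighborhood of a sample is the set of samples lying within a threshold `delta` on every selected gene, and that the lower approximation of a class contains the samples whose whole neighborhood carries that class label. The function names, the toy data, the threshold values, and the use of a thread pool are all illustrative assumptions. The thread pool only shows that the per-sample checks are independent of one another; it says nothing about the parallel scheme the authors actually use.

```python
from concurrent.futures import ThreadPoolExecutor


def intersection_neighborhood(data, i, genes, delta):
    """Indices of samples within `delta` of sample i on every gene in `genes`.

    Assumed definition: the intersection of the per-gene neighborhoods.
    """
    xi = data[i]
    return {
        j for j, xj in enumerate(data)
        if all(abs(xi[g] - xj[g]) <= delta for g in genes)
    }


def lower_approximation(data, labels, genes, delta, target, workers=4):
    """Samples whose entire neighborhood is labelled `target`."""
    def is_consistent(i):
        hood = intersection_neighborhood(data, i, genes, delta)
        return all(labels[j] == target for j in hood)

    # Each sample is checked independently, so the checks can be mapped
    # over a pool of workers (hypothetical parallelisation strategy).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(is_consistent, range(len(data))))
    return {i for i, ok in enumerate(flags) if ok}


def dependency(data, labels, genes, delta):
    """Fraction of samples in the positive region (union of lower approximations)."""
    positive = set()
    for cls in set(labels):
        positive |= lower_approximation(data, labels, genes, delta, cls)
    return len(positive) / len(data)


# Toy expression matrix: 5 samples x 2 genes (made-up values).
data = [
    [0.10, 0.90],
    [0.15, 0.85],
    [0.80, 0.20],
    [0.85, 0.10],
    [0.50, 0.50],
]
labels = ["stress", "stress", "control", "control", "stress"]

both_genes = dependency(data, labels, genes=[0, 1], delta=0.2)
first_gene_only = dependency(data, labels, genes=[0], delta=0.45)
print(both_genes, first_gene_only)
```

Each neighborhood query scans every sample, so the approximations for one candidate gene subset cost on the order of n² distance checks per gene. This has to be repeated for every subset considered during selection, which is why the abstract singles out this step for acceleration.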
