Abstract

Gene expression profiling plays an important role in a broad range of areas in biology. Raw gene expression data may contain missing values. Accurately estimating missing values in microarray data is an important preprocessing step, because many expression profile analyses require complete datasets. Numerous methods have been developed to deal with missing values. In this paper, a new and robust method based on fuzzy clustering and gene ontology is proposed to estimate missing values in microarray data. In the proposed method, missing values are imputed with values generated from cluster centers. To determine which genes are similar during clustering, we use biological knowledge obtained from gene ontology in addition to gene expression values. We applied the proposed method to yeast cell cycle data and yeast environmental stress data with different percentages of missing entries, and compared its estimation accuracy with that of other methods. The experimental results indicate that the proposed method outperforms the other methods in terms of accuracy.
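The general idea of imputing from fuzzy cluster centers can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: it uses plain fuzzy c-means on expression values only, measures distances over the observed entries of each gene (a partial-distance strategy), and fills each missing entry with the membership-weighted average of the cluster centers. The gene ontology similarity term is omitted because the abstract does not say how it is combined with expression values. The function name `fcm_impute`, its parameters, the initialisation scheme, and the toy data are all hypothetical.

```python
def _observed_mean(row):
    # Mean of the non-missing entries of one gene (row).
    vals = [x for x in row if x is not None]
    return sum(vals) / len(vals)


def fcm_impute(data, n_clusters=2, m=2.0, n_iter=50):
    """Fill None entries via fuzzy c-means (hypothetical sketch).

    Assumes every row and every column has at least one observed value,
    and that the fuzzifier m is greater than 1.
    """
    n_cols = len(data[0])
    col_means = []
    for j in range(n_cols):
        vals = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(vals) / len(vals))

    # Deterministic initialisation (an assumption): take rows spread evenly
    # across the ordering by observed mean, patching gaps with column means.
    order = sorted(data, key=_observed_mean)
    step = max(n_clusters - 1, 1)
    picks = [order[(k * (len(order) - 1)) // step] for k in range(n_clusters)]
    centers = [
        [x if x is not None else col_means[j] for j, x in enumerate(row)]
        for row in picks
    ]

    def memberships(row):
        # Squared distance over observed columns only, rescaled to full width.
        obs = [j for j in range(n_cols) if row[j] is not None]
        scale = n_cols / len(obs)
        dist = [
            scale * sum((row[j] - c[j]) ** 2 for j in obs) + 1e-12
            for c in centers
        ]
        inv = [d ** (-1.0 / (m - 1.0)) for d in dist]
        total = sum(inv)
        return [v / total for v in inv]

    for _ in range(n_iter):
        u_all = [memberships(row) for row in data]
        # Update each center coordinate from the rows where it is observed.
        for k in range(n_clusters):
            for j in range(n_cols):
                num = den = 0.0
                for u, row in zip(u_all, data):
                    if row[j] is not None:
                        w = u[k] ** m
                        num += w * row[j]
                        den += w
                if den > 0:
                    centers[k][j] = num / den

    # Impute: membership-weighted average of the cluster centers.
    filled = []
    for row in data:
        u = memberships(row)
        filled.append([
            row[j] if row[j] is not None
            else sum(u[k] * centers[k][j] for k in range(n_clusters))
            for j in range(n_cols)
        ])
    return filled


# Toy expression matrix (made-up numbers): rows are genes, columns are samples.
data = [
    [1.0, 1.1, 0.9],
    [1.0, None, 1.0],
    [5.0, 5.1, 4.9],
    [5.1, 5.0, None],
]
filled = fcm_impute(data)
```

In this toy matrix the second gene resembles the first, so its missing entry is drawn toward the low-expression center, while the fourth gene's missing entry is drawn toward the high-expression center. A method that also uses gene ontology would adjust how gene similarity is computed, which this sketch does not attempt.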
