This research analyzes relationships among genes according to their annotations. We present a similar genes discovery system (SGDS), based on a semantic similarity measure over Gene Ontology (GO) and Entrez Gene annotations, to identify groups of similar genes. To validate the proposed measure, we analyze the relationship between semantic similarity and expression correlation for pairs of genes. We explore a number of semantic similarity measures and compute the Pearson correlation coefficient. Highly correlated genes exhibit strong similarity in the ontology taxonomies. The results show that our proposed semantic similarity measure outperforms the others and appears better suited for use with GO. We use the MAPK homogenous gene group and the MAP kinase pathway as benchmarks to tune the system's parameters for higher accuracy. We applied the SGDS to the RON and Lutheran pathways; the results show that it is able to identify a group of similar genes and to predict novel pathways from a group of candidate genes.