Abstract

Background: The Gene Ontology (GO) is a well-structured knowledge base of biological terms that describes the roles of genes and their products in a standardized, controlled vocabulary. Over the last decade, many measures have been developed that exploit GO to determine semantic similarities between biological entities. GO-based semantic similarity measures must contend with several constraints of the ontology. For instance, (1) edges in a GO graph do not represent uniform distances and the graph has regions of different densities, and (2) ignoring term levels in the ontology causes the “shallow annotation” problem, i.e., two terms at a given distance from each other near the root of the GO graph receive the same semantic similarity as two terms at the same distance but far from the root.

Methods: Here, we present wAIC, a two-stage hybrid semantic similarity measure that uses a weighted aggregation of information contents. In wAIC, the impact of each common ancestor on the semantic similarity value is determined by the location of that ancestor in the ontology graph. wAIC also filters out, from the annotating term set, terms in the upper levels of the ontology graph to reduce the effect of shallow annotation.

Results: Experimental results confirm that the proposed measure addresses these constraints better than existing measures: wAIC semantic similarity values correlate more strongly with both sequence similarity values and gene-expression-based similarity values than those of state-of-the-art semantic similarity measures.

Conclusions: wAIC shows that using a weighted aggregation of common ancestors is completely consistent with human perception and can improve the accuracy of gene similarity measurement.
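To make the idea of a weighted aggregation of information contents more concrete, the following is a minimal sketch, not the wAIC formula itself, which the abstract does not state. Everything in it is an assumption made for illustration: the toy ontology `PARENTS`, the annotation probabilities `PROB`, the use of term depth as the ancestor weight, the `min_depth` threshold for filtering shallow terms, and the best-match average used at the gene level.

```python
import math

# Toy ontology (hypothetical): child term -> list of parent terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
}

# Hypothetical probability that a gene is annotated with a term or a descendant.
PROB = {"root": 1.0, "A": 0.5, "B": 0.5, "A1": 0.25, "A2": 0.25, "B1": 0.5}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(term):
    """Length of the longest path from the root to the term (root = 0)."""
    parents = PARENTS[term]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def ic(term):
    """Information content: rarer terms carry more information."""
    return -math.log(PROB[term])


def weighted_ic(terms):
    """Sum of IC values, each weighted by term depth (an assumed weighting)."""
    return sum(depth(t) * ic(t) for t in terms)


def term_similarity(t1, t2):
    """Weighted IC of common ancestors over weighted IC of all ancestors."""
    a1, a2 = ancestors(t1), ancestors(t2)
    total = weighted_ic(a1 | a2)
    return weighted_ic(a1 & a2) / total if total > 0 else 0.0


def filter_shallow(terms, min_depth=1):
    """Drop annotation terms shallower than an assumed depth threshold."""
    return {t for t in terms if depth(t) >= min_depth}


def gene_similarity(terms1, terms2, min_depth=1):
    """Best-match average over filtered term sets (an assumed aggregation)."""
    s1 = filter_shallow(terms1, min_depth)
    s2 = filter_shallow(terms2, min_depth)
    if not s1 or not s2:
        return 0.0
    best1 = [max(term_similarity(a, b) for b in s2) for a in s1]
    best2 = [max(term_similarity(a, b) for a in s1) for b in s2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))
```

In this sketch, deeper common ancestors contribute more to the score than ancestors near the root, and annotations above the depth threshold are discarded before genes are compared. Both choices mirror the two ideas named in the Methods paragraph, but the specific weights and threshold here are placeholders.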
