Abstract

Background: In recent years, many measures of gene functional similarity have been proposed and widely used in many kinds of essential research. These methods fall into two main categories: pairwise approaches and group-wise approaches. A common problem with all of them is their high computational cost, especially when measuring the functional similarity of a large number of gene pairs. The efficiency problem is even more prominent for pairwise approaches because they depend on combining semantic similarities between terms. The efficient measurement of gene functional similarity therefore remains a challenging problem.

Results: To speed up existing gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. With the hash table in place, no method needs to traverse the GO graph repeatedly. A time complexity analysis shows that the computational efficiency of these methods is significantly improved. We also implement a novel tool, Speeding Gene Functional Similarity Calculation (SGFSC), which bundles seven typical measures using the proposed strategy. Further experiments show a clear advantage for SGFSC when measuring gene functional similarity on the whole-genome scale.

Conclusions: The proposed strategy succeeds in speeding up existing gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC. The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ.
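The two-step strategy described in the Results can be illustrated with a minimal sketch. This is not SGFSC's implementation: the toy graph, the term IDs, the function names, and the choice of a Jaccard overlap of ancestor sets as the similarity measure are all assumptions made for illustration. The point is only that the graph is traversed once to fill a hash table (a Python dict here), and every later similarity query uses lookups alone.

```python
# Toy DAG: child term -> parent terms. Hypothetical IDs, not real GO terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
    "E": ["C", "D"],
}


def build_ancestor_table(parents):
    """Step 1: traverse the graph once and store each term's ancestor set."""
    table = {}

    def ancestors(term):
        # Memoised recursion: each term's set is computed exactly once.
        if term not in table:
            found = {term}  # assumption: a term counts as its own ancestor
            for parent in parents[term]:
                found |= ancestors(parent)
            table[term] = frozenset(found)
        return table[term]

    for term in parents:
        ancestors(term)
    return table


def term_similarity(table, t1, t2):
    """Step 2: Jaccard overlap of ancestor sets, using table lookups only."""
    a, b = table[t1], table[t2]
    return len(a & b) / len(a | b)


ANCESTORS = build_ancestor_table(PARENTS)
print(term_similarity(ANCESTORS, "C", "D"))
```

After `build_ancestor_table` has run, comparing any number of term pairs never touches `PARENTS` again, which is the source of the speed-up when many gene pairs are scored.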

Highlights

  • Many gene functional similarity measures [15, 17–30] have been proposed and widely used in biology research. They fall into two main categories, pairwise approaches and group-wise approaches, both of which rely on Gene Ontology (GO) graphs [31].

  • We focus on the semantic similarity between t1 and t2


Summary

Introduction

Many gene functional similarity measures [15, 17–30] have been proposed and widely used in biology research. They fall into two main categories, pairwise approaches and group-wise approaches, both of which rely on GO graphs [31]. There are three types of approaches for measuring the functional similarities of genes: set, graph and vector [31].

