Global Sequence Homology Detection Using Word Conservation Probability

Jae-Seong Yang, Dae-Kyum Kim, Sanguk Kim, Jinho Kim

doi:10.4051/ibc.2011.3.4.0014

Abstract

Protein homology detection is an important problem in comparative genomics. Because sequence databases are growing exponentially, fast and efficient homology detection tools are urgently needed. Currently, sequence comparison methods based on local alignment, such as BLAST, are generally used for homology detection because they give a reasonable measure of sequence similarity. However, these methods are limited in capturing overall sequence similarity, especially for eukaryotic genomes, whose sequences often contain many insertions and duplications. They also provide no explicit model of speciation, so it is difficult to translate their similarity measures into homology assignments. Here, we present a novel method based on the Word Conservation Score (WCS) to address these limitations. Instead of counting individual amino acids, we adopt the concept of a `Word` to compare sequences. WCS measures overall sequence similarity by comparing word contents, which is much faster than BLAST comparison. Furthermore, WCS can be used to measure the evolutionary distance between homologous sequences. We therefore expect sequence comparison with WCS to be useful for multiple-species comparisons of large genomes. In performance comparisons on protein structural classifications, our method showed a considerable improvement over BLAST. It also found larger micro-syntenic blocks, which consist of orthologs with conserved gene order. Tests on various datasets showed that WCS gives a faster and better measure of overall similarity than BLAST.
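
The abstract does not define how WCS is computed, so the following is only a minimal sketch of the general idea of comparing two sequences by their word contents rather than by alignment. It is not the authors' WCS: the word length `k = 3`, the function names `word_counts` and `shared_word_fraction`, and the use of a multiset Jaccard ratio are all assumptions made for illustration.

```python
from collections import Counter


def word_counts(seq: str, k: int = 3) -> Counter:
    """Count the overlapping length-k words in a sequence (k is an assumed parameter)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shared_word_fraction(a: str, b: str, k: int = 3) -> float:
    """Multiset Jaccard ratio of word contents: an illustrative stand-in, not WCS itself."""
    wa, wb = word_counts(a, k), word_counts(b, k)
    shared = sum((wa & wb).values())  # per-word minimum counts
    total = sum((wa | wb).values())   # per-word maximum counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the paper.
    print(shared_word_fraction("MKVLAAGIVGL", "MKVLAAGLVGL"))
```

Because each sequence is reduced to word counts in a single pass, a comparison of this kind needs no alignment step, which is consistent with the speed advantage the abstract claims for word-based comparison.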
