A Study on Clustering and Functional Search of Gene Sequences by Integrating the Suffix Tree Clustering Method with BLAST

Sangil Han, Ju-Yeong Lee, Sung Koo Lee, Kyung Hwan Kim, Young Ho Kim, Kyu Hong Hwang

doi:10.5302/j.icros.2005.11.10.851

Abstract

DNA and protein data from diverse species are discovered daily and deposited in public archives in each archive's established format. The database systems of these archives provide the public not only with easy-to-use, flexible interfaces but also with in silico tools for analyzing unidentified sequence data. Among these tools, multiple sequence alignment [1] methods, which rely on pairwise alignment and the Smith-Waterman algorithm [2], make it possible to identify unknown DNA or protein sequences and the phylogenetic relationships among several species. With existing multiple alignment methods, however, the runtime increases exponentially as the number of sequences grows. To remedy this problem, we adopted a parallel-processing suffix tree algorithm that can search for common subsequences all at once, without pairwise alignment. Among the common subsequences found, cross-matching subsequences that trigger inexact matching may be produced, so this paper proposes a cross-matching masking process. To identify the function of the clusters generated by suffix tree clustering, BLAST was combined with the clustering tool. Our clustering and annotating tool works in the following steps: (1) construction of the suffix tree; (2) masking of cross-matching pairs; (3) clustering of gene sequences; and (4) annotation of gene clusters by BLAST search. The system was successfully evaluated with 22 gene sequences in the pyruvate pathway of bacteria, yielding 7 clusters and finding representative common subsequences for each cluster.
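
The idea of grouping sequences by shared subsequences, without pairwise alignment, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it replaces the suffix tree with a fixed-length substring index (a much simpler stand-in that finds only exact shared substrings of one length), groups sequences with a union-find structure, and omits the cross-matching masking and BLAST annotation steps. The function names `shared_substrings` and `cluster`, the `min_len` parameter, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def shared_substrings(seqs, min_len):
    """Return {substring: set of sequence ids} for every substring of
    length min_len that occurs in at least two different sequences.

    Assumption: a fixed-length index stands in for the suffix tree."""
    index = defaultdict(set)
    for sid, s in seqs.items():
        for i in range(len(s) - min_len + 1):
            index[s[i:i + min_len]].add(sid)
    return {sub: ids for sub, ids in index.items() if len(ids) >= 2}


def cluster(seqs, min_len):
    """Group sequences that are linked by at least one shared substring.

    Assumption: single-linkage grouping via union-find; the source does
    not specify the clustering rule."""
    parent = {sid: sid for sid in seqs}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for ids in shared_substrings(seqs, min_len).values():
        ordered = sorted(ids)
        for other in ordered[1:]:
            parent[find(other)] = find(ordered[0])

    groups = defaultdict(list)
    for sid in seqs:
        groups[find(sid)].append(sid)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {"a": "ATGCGTAC", "b": "TTGCGTAA", "c": "CCCCCCCC"}
    print(cluster(toy, 5))  # [['a', 'b'], ['c']]
```

A real suffix tree would report shared substrings of every length in one traversal, which is what makes it attractive for large sequence sets; the fixed-length index here only conveys the grouping idea.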

More From: Journal of Control, Automation and Systems Engineering

Similar Papers

A gene clustering method with masking cross-matching fragments using modified suffix tree clustering method
Sang Il Han ... Sung Gun Lee
Korean Journal of Chemical Engineering | VOL. 22
01 May 2005

Gene Sequence Clustering for Functional Domain Prediction
...
Journal of Control, Automation and Systems Engineering | VOL. 12
01 Oct 2006

CLAGen: A tool for clustering and annotating gene sequences using a suffix tree algorithm
Sang Il Han ... Kyu Suk Hwang
BioSystems | VOL. 84
27 Dec 2005

Gene Sequences Clustering and Identifying Functional Domain Using a Suffix Tree Algorithm
Sang Han ... Sung Lee
01 Jan 2006
