Frequent Patterns Algorithm of Biological Sequences based on Pattern Prefix-tree

Fei Xie, Linyan Xue, Peng Lin, Shuang Liu, Xiaoke Zhang

doi:10.15837/ijccc.2019.4.3607

Open Access

https://doi.org/10.15837/ijccc.2019.4.3607

Abstract

In bioinformatics applications, existing algorithms cannot mine sequence patterns directly and efficiently. This paper proposes two fast and efficient biological sequence pattern mining algorithms, one for a single biological sequence and one for multiple sequences. The concept of the basic pattern is introduced: frequent basic patterns are mined first, and frequent patterns are then mined by constructing prefix trees for the frequent basic patterns. The proposed algorithms thus mine frequent patterns of biological sequences rapidly, based on pattern prefix trees. In the experiments, family sequence data from the Pfam protein database are used to verify the performance of the proposed algorithms. The results confirm that the proposed algorithms not only obtain mining results with effective biological significance, but also improve the running time of biological sequence pattern mining.
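
The two-stage idea in the abstract (mine frequent basic patterns, then grow frequent patterns in a prefix tree) can be illustrated with a minimal sketch for the single-sequence case. The abstract does not define a basic pattern or the support measure, so this sketch assumes that a basic pattern is a contiguous substring of fixed length k and that support is the number of (possibly overlapping) occurrences in the sequence. The function names and parameters are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def frequent_basic_patterns(seq, k, min_support):
    """Assumed definition: a basic pattern is a length-k substring.

    Returns {pattern: [start positions]} for patterns that occur
    at least min_support times (overlapping occurrences count).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {p: pos for p, pos in positions.items() if len(pos) >= min_support}


def grow_prefix_tree(seq, basic, min_support):
    """Extend each frequent basic pattern one symbol at a time.

    Returns (tree, frequent): tree is a nested-dict prefix tree of all
    frequent patterns found, frequent maps each pattern to its support.
    """
    tree = {}
    frequent = {}
    stack = list(basic.items())
    while stack:
        pattern, pos = stack.pop()
        frequent[pattern] = len(pos)

        # Insert the pattern into the prefix tree, one symbol per level.
        node = tree
        for ch in pattern:
            node = node.setdefault(ch, {})

        # Group the occurrences by the symbol that follows them.
        extensions = defaultdict(list)
        for i in pos:
            nxt = i + len(pattern)
            if nxt < len(seq):
                extensions[seq[nxt]].append(i)

        # Only extensions that stay frequent are explored further.
        for ch, ext_pos in extensions.items():
            if len(ext_pos) >= min_support:
                stack.append((pattern + ch, ext_pos))
    return tree, frequent


if __name__ == "__main__":
    sequence = "ACGTACGTAC"
    basic = frequent_basic_patterns(sequence, k=2, min_support=2)
    tree, frequent = grow_prefix_tree(sequence, basic, min_support=2)
    for pattern in sorted(frequent, key=len):
        print(pattern, frequent[pattern])
```

Because every prefix of a frequent contiguous pattern is itself frequent, pruning an infrequent extension also discards all longer patterns that would start with it. That pruning is why a prefix tree suits this kind of search.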

Full Text