A novel match point mapping algorithm based on dominant point for large-scale MLCS problem

Shiwei Wei, Quanyou Zhao, Xunzhang Li, Na Zhao

doi:10.1109/cis58238.2022.00019


Abstract

Finding the Multiple Longest Common Subsequences (MLCS) is a fundamental problem in many fields, such as bioinformatics, computational genomics, and pattern recognition. Existing MLCS algorithms are not suitable for long, large-scale sequences because of their high time and space consumption. To overcome this problem, this paper proposes a new DAG (directed acyclic graph) model and a Novel Match Point Mapping algorithm (NMPM) based on dominant points. The DAG contains no duplicate match points, and each match point is mapped to a unique integer identifier. The DAG can be built efficiently by repeatedly calculating the successors of each match point. Moreover, high-dimensional match points can be removed as soon as they are no longer needed during DAG construction, which saves a great deal of memory. Experimental results show that the new algorithm outperforms other leading algorithms, especially on large-scale MLCS problems.
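
The abstract does not give implementation details, so the following Python sketch is not the NMPM algorithm itself. It only illustrates the general ideas mentioned above, under stated assumptions: a match point is a tuple of 1-based positions (one per sequence), successors are found through precomputed successor tables, each distinct match point receives a unique integer identifier, and the position tuples are dropped once the DAG of integer identifiers exists. The function names `successor_tables`, `build_dag`, and `one_mlcs` are hypothetical, and the sketch returns just one longest common subsequence rather than all of them.

```python
from collections import deque


def successor_tables(seqs, alphabet):
    """tables[i][c][p] = smallest 1-based position q > p in seqs[i]
    with seqs[i][q - 1] == c, or 0 if no such position exists."""
    tables = []
    for s in seqs:
        n = len(s)
        table = {c: [0] * (n + 1) for c in alphabet}
        for c in alphabet:
            nxt = 0
            for p in range(n, 0, -1):
                if s[p - 1] == c:
                    nxt = p
                table[c][p - 1] = nxt
        tables.append(table)
    return tables


def build_dag(seqs):
    """Build a DAG over integer ids; id 0 is the virtual source (0, ..., 0)."""
    alphabet = sorted(set.intersection(*(set(s) for s in seqs)))
    tables = successor_tables(seqs, alphabet)
    source = (0,) * len(seqs)
    ids = {source: 0}   # match point -> unique integer id (no duplicates)
    edges = [[]]        # edges[u] = list of (successor id, character)
    queue = deque([source])
    while queue:
        point = queue.popleft()
        u = ids[point]
        for c in alphabet:
            succ = tuple(tables[i][c][p] for i, p in enumerate(point))
            if 0 in succ:
                continue  # character c does not occur later in every sequence
            v = ids.get(succ)
            if v is None:
                v = len(edges)
                ids[succ] = v
                edges.append([])
                queue.append(succ)
            edges[u].append((v, c))
    # Assumption of this sketch: tuples are discarded only here, after
    # construction; the returned graph holds integer ids exclusively.
    return edges


def one_mlcs(seqs):
    """Return one longest common subsequence via a longest path in the DAG."""
    edges = build_dag(seqs)
    n = len(edges)
    indeg = [0] * n
    for out in edges:
        for v, _ in out:
            indeg[v] += 1
    best = [0] * n       # longest path length from the source to each node
    prev = [None] * n    # (predecessor id, character) on that longest path
    queue = deque([0])   # Kahn-style topological traversal from the source
    while queue:
        u = queue.popleft()
        for v, c in edges[u]:
            if best[u] + 1 > best[v]:
                best[v] = best[u] + 1
                prev[v] = (u, c)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    end = max(range(n), key=best.__getitem__)
    chars = []
    while prev[end] is not None:
        end, c = prev[end]
        chars.append(c)
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(one_mlcs(["ACTAGCTA", "TCAGGTAT", "ACGTACGT"]))
```

Every edge strictly increases all coordinates of a match point, so the graph is acyclic and a longest path from the source spells a longest common subsequence. This sketch keeps all position tuples until construction finishes; releasing them earlier, as the abstract describes, would need a rule for when a point can no longer be reached, which the abstract does not specify.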
