Abstract

In recent years, the order-preserving submatrix (OPSM) model has been widely used in gene expression data analysis. Because it focuses on the relative order of the elements rather than their actual values, it is more robust and yields more statistically significant results than other models. Most current OPSM algorithms are heuristic, so they cannot mine all OPSMs and, in particular, miss the deep OPSMs that are of biological significance in gene expression data. In this paper, we propose an exact algorithm that finds OPSMs using frequent sequential pattern mining. First, we find all common subsequences (ACS) between every pair of rows through dynamic programming. Then, we store them in a suffix tree. Finally, we extract all OPSMs, including deep OPSMs, from this suffix tree. Experiments on real gene expression data and on artificially synthesised data show that our algorithm is efficient and that its results are meaningful.
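
The pipeline described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`column_order`, `all_common_subsequences`, `mine_opsms`) and the thresholds `min_rows` and `min_cols` are hypothetical, each row is assumed to be represented by its column indices sorted by value, and a plain dictionary stands in for the suffix tree. The enumeration is exponential in the worst case, so the sketch is only suitable for tiny matrices.

```python
from itertools import combinations


def column_order(row):
    """Column indices sorted by ascending value (ties broken by index)."""
    return tuple(sorted(range(len(row)), key=lambda j: (row[j], j)))


def all_common_subsequences(a, b):
    """All distinct non-empty common subsequences of a and b, by dynamic programming."""
    n, m = len(a), len(b)
    # dp[i][j] holds every common subsequence of a[i:] and b[j:], including the empty one.
    dp = [[{()} for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            cell = dp[i + 1][j] | dp[i][j + 1]
            if a[i] == b[j]:
                # Extend every common subsequence of the two suffixes with the match.
                cell = cell | {(a[i],) + s for s in dp[i + 1][j + 1]}
            dp[i][j] = cell
    return dp[0][0] - {()}


def mine_opsms(matrix, min_rows=2, min_cols=2):
    """Map each column pattern to the set of rows that preserve its order.

    Assumption: a dict replaces the suffix tree used in the paper.
    """
    orders = [column_order(row) for row in matrix]
    support = {}
    for p, q in combinations(range(len(matrix)), 2):
        for pattern in all_common_subsequences(orders[p], orders[q]):
            if len(pattern) >= min_cols:
                support.setdefault(pattern, set()).update((p, q))
    return {pat: rows for pat, rows in support.items() if len(rows) >= min_rows}


if __name__ == "__main__":
    matrix = [
        [1, 2, 3, 4],
        [10, 20, 40, 30],
        [5, 1, 7, 9],
    ]
    for pattern, rows in sorted(mine_opsms(matrix).items()):
        print(pattern, sorted(rows))
```

Each key of the returned dictionary is a sequence of column indices, and its value is the set of row indices whose values increase along those columns. Together, a key and its value describe one candidate OPSM.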
