Mining distinguishing subsequence patterns with nonoverlapping condition

Youxi Wu, Jingyu Liu, Yan Li, Jing Liu, Yuehua Wang, Ming Yu

doi:10.1007/s10586-017-1671-0

Abstract

Distinguishing subsequence pattern mining aims to discover the differences between categories of sequence databases and to express the characteristics of each class. It plays an important role in biomedicine, feature selection, time-series classification, and other areas. Existing methods for mining distinguishing subsequence patterns consider only whether a pattern appears in a sequence. They ignore both the number of occurrences of the pattern in the sequence and the proportion of the pattern in the entire sequence database, which hampers the discovery of distinguishing patterns when there are many irrelevant occurrences. We therefore propose an algorithm for mining distinguishing subsequence patterns under the nonoverlapping condition. We focus on the number of nonoverlapping occurrences, which effectively reduces the number of irrelevant or redundant occurrences and so gives a more faithful measure of how often a pattern occurs. We also use a specially designed data structure, the Nettree, to avoid backtracking. In addition, we use the distinguishing patterns as classification features and carry out classification experiments on DNA sequences and time-series data with two classes. Extensive experimental results and comparisons demonstrate the efficiency of the proposed algorithm and the correctness of the feature extraction.
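To make the contrast between presence-based and occurrence-based support concrete, here is a minimal Python sketch. It is not the paper's algorithm and does not use a Nettree. It assumes that "nonoverlapping" means no sequence position is reused at the same pattern index, it ignores any gap constraints, and it counts occurrences with a simple leftmost greedy scan. The function names, the density measure, and the toy sequences are hypothetical and chosen only for illustration.

```python
def count_nonoverlapping(seq, pattern):
    """Greedy leftmost count of subsequence occurrences of `pattern` in `seq`.

    Assumption: two occurrences are nonoverlapping when they never use the
    same sequence position at the same pattern index.
    """
    if not pattern:
        return 0
    # used[j] holds the sequence positions already matched to pattern[j]
    used = [set() for _ in pattern]
    count = 0
    while True:
        prev = -1
        chosen = []
        for j, ch in enumerate(pattern):
            # leftmost position after `prev` that matches and is still free at index j
            pos = next(
                (i for i in range(prev + 1, len(seq))
                 if seq[i] == ch and i not in used[j]),
                None,
            )
            if pos is None:
                return count  # no further occurrence can be completed
            chosen.append(pos)
            prev = pos
        for j, pos in enumerate(chosen):
            used[j].add(pos)
        count += 1


def presence_support(db, pattern):
    """Fraction of sequences in `db` that contain the pattern at least once."""
    return sum(count_nonoverlapping(s, pattern) > 0 for s in db) / len(db)


def occurrence_density(db, pattern):
    """Total nonoverlapping occurrences divided by total sequence length (hypothetical measure)."""
    total = sum(count_nonoverlapping(s, pattern) for s in db)
    return total / sum(len(s) for s in db)


# Toy two-class data, made up for illustration only.
positive = ["ATATAT", "ATATGG"]
negative = ["ATGGCC", "CCATGG"]

for name, db in (("positive", positive), ("negative", negative)):
    print(name, presence_support(db, "AT"), occurrence_density(db, "AT"))
```

In this toy data the pattern "AT" appears in every sequence of both classes, so a presence-only measure cannot separate them, while the occurrence-based measure is higher for the positive class.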


Journal: Cluster Computing
Publication Date: Jan 9, 2018
Citations: 11

Similar Papers

Extraction of Features for Time Series Classification Using Noise Injection
Gyu Il Kim ... Kyungyong Chung
Sensors | VOL. 24
02 Oct 2024

NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints.
Youxi Wu ... Xindong Wu
IEEE Transactions on Cybernetics | VOL. 48
28 Sep 2017

Mining distinctive DNA patterns from the upstream of human coding&non-coding genes via class frequency distribution
Jing-Doo Wang ... Wen-Ling Chan
01 Oct 2016

[33] Analysis of compositionally biased regions in sequence databases
John C Wootton ... Scott Federhen
Methods in Enzymology | VOL. 266
01 Jan 1996
