Abstract
Wireless sensor networks (WSNs) are an important type of network for sensing the environment and collecting information. They can be deployed in almost every type of real-world environment, providing a reliable and low-cost solution for management. WSNs produce huge amounts of data continuously, and it is important to process and analyze these data effectively to support intelligent decision-making and management. However, the new characteristics of sensor data, such as rapid growth and frequent updates, bring new challenges to mining algorithms, especially given the time constraints of intelligent decision-making. In this work, an efficient incremental mining algorithm for discovering sequential patterns (novel incremental algorithm, NIA) is proposed to enhance the efficiency of the whole mining process. First, a reasoned proof is given to demonstrate how to update the frequent sequences incrementally, and the mining space is greatly narrowed based on the proof. Second, an improvement is made to PrefixSpan, a classic sequential pattern mining algorithm with a high-complexity recursive process. The improved algorithm, named PrefixSpan+, uses a mapping structure to extend the prefixes to sequential patterns, making the mining step more efficient. Third, a fast support-counting algorithm is presented to select the frequent sequences from the potential frequent sequences. A reticular tree is constructed to store all the potential frequent sequences according to the subordinate relations between them, so that the support degree can be calculated efficiently without repeatedly scanning the original database. NIA is compared with various mining algorithms via intensive experiments on real monitoring datasets, benchmark datasets, and synthetic datasets, in terms of time cost, sensitivity to factors, and space cost. The results show that NIA performs better than the existing methods.
Highlights
Wireless sensor networks (WSNs) are made up of a large number of sensor nodes deployed in the monitored area. By communicating wirelessly, the nodes form a multi-hop, self-organized network system that continuously perceives, collects, and processes information about objects and forwards the results to the central node
If the sum of the support numbers, in DD and new, of the sequences produced by the combination OD and old is no less than min_sup × |new|, they might become frequent sequences, which means they should be included in the set of potential frequent sequences
In Figure 7, NIA had a lower time cost than Incspan, PFT, and STISPM at min_sup = 2%, 1%, and 0.75%; on average, it reduced the time cost by 49.4% compared to Incspan, 45.6% compared to PFT, and 27.1% compared to STISPM
Summary
WSNs are made up of a large number of sensor nodes deployed in the monitored area. By communicating wirelessly, the nodes form a multi-hop, self-organized network system that continuously perceives, collects, and processes information about objects and forwards the results to the central node. In market databases, each item has its own price or profit, and multiple copies of it can be sold in one transaction, so the data are not binary. To address this issue, some recent studies incorporated the concept of utility into classic sequential pattern mining (SPM), leading to the emergence of high utility sequential pattern (HUSP) mining. Motivated by the real applications and the problems mentioned above, an efficient incremental mining algorithm for discovering sequential patterns, named NIA (Novel Incremental Algorithm), is proposed. The main contributions of this paper are as follows: (1) the mining space is greatly narrowed based on an analysis of the sequence-related properties in the updating process; (2) PrefixSpan+, an improvement of the original PrefixSpan, is proposed to improve the mining efficiency; and (3) a novel structure called a reticular sequence tree is designed to count the support number quickly.
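For readers unfamiliar with the baseline that PrefixSpan+ improves on, the following is a minimal sketch of the classic prefix-projection idea behind PrefixSpan. It is not the paper's PrefixSpan+ and does not include the mapping structure, the incremental update, or the reticular sequence tree. It assumes, for simplicity, that every sequence element is a single item rather than an itemset and that the threshold is an absolute count; the names `prefixspan`, `mine`, and `min_count` are illustrative and do not come from the paper.

```python
from collections import defaultdict


def prefixspan(sequences, min_count):
    """Return {pattern (tuple): support count} for all frequent patterns.

    Simplifying assumptions (not from the paper): each sequence is a list
    of single items, and min_count is an absolute support threshold.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs that
        # describe the suffix of each sequence following the prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count

            # Build the projected database for the extended prefix.
            new_projected = []
            for seq_idx, start in projected:
                try:
                    pos = sequences[seq_idx].index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy data standing in for sensor event sequences.
toy_sequences = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(toy_sequences, min_count=2)
```

On the toy data, every single item has support 3 and the two-item patterns `("a", "b")`, `("a", "c")`, and `("b", "c")` each have support 2. The recursion over projected suffixes is the high-complexity step that the abstract says PrefixSpan+ targets.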