Abstract
Extracting frequent patterns (FPs) from continuous transactional data streams is a challenging and critical task in applications such as web mining, data analysis, retail markets, prediction, network monitoring, and the analysis of stock market exchange data. Many algorithms have been developed previously for mining FPs from a data stream. Nevertheless, new solutions and approaches are still required for the precise handling of data streams. Such techniques must address unbounded, ordered, and continuous sequences of data that are generated at a rapid speed. Hence, extracting FPs from fresh or recent data involves high-level analysis of data streams. We propose an efficient technique for the sliding window model that extracts new and fresh FPs from high-speed data streams. In this study, a CPILT (Compacted Tree Compact Pattern Tree) is developed to capture the latest contents of the stream and to efficiently remove outdated contents. The main concept introduced with the CPILT is dynamic tree restructuring, which helps to produce a compact tree with a frequency-descending structure at runtime. Using the FP-growth mining technique, a complete list of new and fresh FPs is obtained from the CPILT for the current window. The memory usage and time complexity of extracting the latest FPs from high-speed data streams are examined through experimentation and analysis.
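The sliding window idea in the abstract can be illustrated with a minimal sketch. It is not the CPILT itself: the class name `SlidingWindowMiner`, its methods, and the fixed transaction-count window are assumptions made for illustration, and brute-force support counting stands in for FP-growth so that the example stays short.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Hypothetical helper: keeps the most recent transactions and mines them on demand."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()

    def add(self, transaction):
        # Append the newest transaction; drop the oldest once the window is full.
        self.window.append(frozenset(transaction))
        if len(self.window) > self.window_size:
            self.window.popleft()

    def frequent_patterns(self, min_support):
        # Brute-force support counting over the current window only
        # (a stand-in for FP-growth, exponential in transaction length).
        counts = Counter()
        for transaction in self.window:
            items = sorted(transaction)
            for size in range(1, len(items) + 1):
                for pattern in combinations(items, size):
                    counts[pattern] += 1
        return {p: c for p, c in counts.items() if c >= min_support}


miner = SlidingWindowMiner(window_size=3)
for t in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]):
    miner.add(t)
patterns = miner.frequent_patterns(min_support=2)
```

After the fourth transaction arrives, the first one has left the window, so `patterns` reflects only the three most recent transactions.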
Highlights
Data streams are real-time, continuous, possibly infinite, fast, changing, and ordered, with a huge number of sequences of items [1,2]
Stream mining must ensure that new stream data are available immediately whenever a request is made for them [4]
When a number of transactions from the data stream have been inserted and the order of the F-list has changed considerably compared with the existing frequency-descending item order, the CPILT is dynamically restructured to follow the current frequency-descending item order, and the item order in the F-list is updated accordingly
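The restructuring trigger described in the last highlight can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the drift measure, the `threshold` parameter, and all function names are hypothetical, the tree is a plain nested-dict prefix tree without counts, and the tree is rebuilt from the window's transactions rather than restructured in place.

```python
from collections import Counter


def f_list(transactions):
    """Items in frequency-descending order (ties broken alphabetically)."""
    counts = Counter(item for t in transactions for item in set(t))
    return sorted(counts, key=lambda item: (-counts[item], item))


def build_tree(transactions, order):
    """Insert each transaction into a nested-dict prefix tree, sorted by `order`.

    `order` must contain every item that occurs in `transactions`.
    """
    rank = {item: r for r, item in enumerate(order)}
    root = {}
    for t in transactions:
        node = root
        for item in sorted(set(t), key=rank.__getitem__):
            node = node.setdefault(item, {})
    return root


def node_count(tree):
    """Number of nodes below the root."""
    return sum(1 + node_count(child) for child in tree.values())


def order_drift(old_order, new_order):
    """Fraction of items whose position differs between the two orders (assumed measure)."""
    old_rank = {item: r for r, item in enumerate(old_order)}
    moved = sum(1 for r, item in enumerate(new_order) if old_rank.get(item) != r)
    return moved / max(len(new_order), 1)


def maybe_restructure(tree, tree_order, transactions, threshold=0.5):
    """Rebuild the tree in the current F-list order if the drift exceeds `threshold`."""
    current = f_list(transactions)
    if order_drift(tree_order, current) > threshold:
        return build_tree(transactions, current), current
    return tree, tree_order


window = [["c", "a"], ["c", "b"], ["c", "d"], ["c", "a", "b"]]
stale_order = ["a", "b", "c", "d"]
stale_tree = build_tree(window, stale_order)
new_tree, new_order = maybe_restructure(stale_tree, stale_order, window)
```

In this small window, item `c` occurs in every transaction, so placing it first lets all transactions share one prefix and the rebuilt tree has fewer nodes than the stale one.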
Summary
Data streams are continuous, possibly infinite, fast, changing, and ordered, with a huge number of sequences of items [1,2]. FPM provides no guarantee of a highly compact tree structure, which is quite important in managing data streams to avoid the overhead of massive storage, reduce the search space, and speed up FP-growth-based frequent pattern mining operations. FPM [1] uses lists of frequency counts to store per-batch information at each node; with such lists, the tree size increases considerably. Another issue in FPM is storage overhead: for every node, FPM keeps an extra batch pointer that indicates the last batch in which that node was visited. Techniques designed for time and space efficiency should also deliver excellent accuracy in their results
More From: Mehran University Research Journal of Engineering and Technology