Stream Classification Algorithm Based on Decision Tree

Jinlin Guo, Xinwei Li, Haoran Wang, Li Zhang

doi:10.1155/2021/3103053

Open Access

https://doi.org/10.1155/2021/3103053

Abstract

With the rise of fields such as e-commerce platforms, a large volume of stream data has emerged. Incomplete labeling and concept drift in these data pose a major challenge to existing stream data classification methods. To address this, a dynamic classification algorithm for stream data is proposed. For the incomplete labeling problem, the method introduces randomization and an iterative strategy on top of the very fast decision tree (VFDT) algorithm to design an iterative integration algorithm: the classification result of the previous model is used as input to the next model, and a voting mechanism classifies new data. At the same time, a window mechanism stores the data and computes the data distribution characteristics within the window; the size of the sliding window is then adjusted by combining this result with the predicted amount of data. Experiments show the superiority of the algorithm in classification accuracy. The aim of the study is to compare different algorithms to evaluate whether the classification model adapts to the current data environment.
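The voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes simple majority voting over already-trained models; the abstract does not specify the exact voting rule, and the names `majority_vote` and `models`, as well as the toy threshold rules standing in for VFDT trees, are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# Assumption: each model is any callable mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def majority_vote(models: Sequence[Classifier], x: Sequence[float]) -> Hashable:
    """Return the label predicted by the largest number of models.

    Ties are resolved in favour of the label that was voted for first.
    """
    if not models:
        raise ValueError("need at least one model")
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees: simple threshold rules.
models = [
    lambda x: "buy" if x[0] > 0.5 else "skip",
    lambda x: "buy" if x[1] > 0.5 else "skip",
    lambda x: "buy" if x[0] + x[1] > 1.2 else "skip",
]

print(majority_vote(models, [0.9, 0.2]))
print(majority_vote(models, [0.9, 0.9]))
```

Because the vote only needs each model's predicted label, the same function works unchanged if the toy rules are replaced by real incremental trees.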

Highlights

With the development of the Internet, sensors, and the Internet of Things, massive streaming data has emerged.
Streaming data is data that is continuously generated by different sources. Such data should be processed incrementally using stream processing techniques without having access to all of the data. It includes a variety of data formats, such as log files generated by web applications, online shopping data, traffic monitoring, social networking site information, and geospatial and meteorological satellite data. These stream data carry a large amount of information that is instructive for real-world decision-making. Therefore, many scholars analyze the data to obtain useful information that guides people to make scientific decisions, in applications such as personalized real-time recommendation on e-commerce platforms, stock market monitoring, network intrusion and fraud detection, smart device applications, traffic monitoring, and real-time motion analysis.
Aiming at the above problems, this paper proposes a real-time dynamic classification algorithm for streaming data that adopts a decision tree as the base classification model, combines the idea of an integrated classification algorithm to move away from a single classification model, and updates the classification model periodically; to detect concept drift, the degree of drift is measured by calculating the difference between the data in the front and rear parts of the sliding window.

Summary

Introduction

With the development of the Internet, sensors, and the Internet of Things, massive streaming data has emerged. As an important branch of data mining, the classification problem has important practical significance in the fields of financial credit rating, prevention of telecom fraud, and detection of network intrusion [1]. Some existing data mining schemes and algorithms fail to fully consider the characteristics of stream data and practical application scenarios, such as concept drift, incomplete labeling, and uneven data flow rate. Aiming at the above problems, this paper proposes a real-time dynamic classification algorithm for streaming data that adopts a decision tree as the base classification model, combines the idea of an integrated classification algorithm to move away from a single classification model, and updates the classification model periodically; to detect concept drift, the degree of drift is measured by calculating the difference between the data in the front and rear parts of the sliding window.
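The idea of comparing the front and rear parts of the sliding window can be sketched as follows. This is only an illustration under stated assumptions: it uses the absolute difference of the means of the two halves of a one-dimensional window as the drift measure, which is not necessarily the statistic used in the paper, and the names `drift_degree` and `THRESHOLD` are hypothetical.

```python
from collections import deque
from statistics import fmean
from typing import Iterable


def drift_degree(window: Iterable[float]) -> float:
    """Absolute difference between the mean of the older (front) half
    and the mean of the newer (rear) half of the window.

    Assumption: a plain difference of means stands in for the drift measure.
    For an odd-length window the middle element is ignored.
    """
    values = list(window)
    half = len(values) // 2
    if half == 0:
        return 0.0  # too few points to compare two halves
    front, rear = values[:half], values[-half:]
    return abs(fmean(front) - fmean(rear))


# A fixed-size sliding window: appending beyond maxlen drops the oldest value.
window = deque(maxlen=6)
for value in [0.1, 0.2, 0.1, 0.9, 1.0, 0.8]:
    window.append(value)

THRESHOLD = 0.5  # hypothetical cut-off for flagging drift
print(drift_degree(window), drift_degree(window) > THRESHOLD)
```

A larger value suggests that recent data differ more from older data in the same window; how that value is turned into a window-size adjustment is left open here.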

Related Work and Theoretical Basis

Concept Drift Detection Algorithm

Sliding Window Dynamic Adjustment Algorithm

Findings

Conclusion

Journal: Mobile Information Systems
Publication Date: Dec 21, 2021
Citations: 1
License type: CC BY 4.0
