Abstract
Data stream mining has emerged as a new field of research over the past few years. It has attracted considerable attention because of the challenging characteristics of stream data: its dynamic and temporal nature, huge size, and continuous flow. Processing and classifying such data raises issues of both storage and analysis. Traditional classification algorithms do not fit data streams well, because they assume that the data is stored in memory once and for all. Yet mining a data stream can yield crucial information about the non-stationary system that generates it. Storing data streams is also not feasible, since storage cost grows with data size. An algorithm designed for data streams should therefore support an incremental and multi-pass approach, so that it can handle new data and analyze existing data at the same time. Data stream classification aims at labeling data, which is nearly impossible in real-life settings because of these characteristics. A traditional data mining algorithm fits a model to a limited number of instances, and such a model does not work with a data stream. In this paper, we discuss data stream classification algorithms and simulate them on real and synthetic datasets to understand their performance parameters.

Keywords: Data stream, Mining, Classification, Method, Challenges