Abstract

“Data streams” are defined as a class of data generated in continuous form over text, audio and video channels. The streams are of infinite length and may comprise structured or unstructured data. Because of these features, it is difficult to store and process data streams with simple, static strategies. Processing data streams poses four main challenges to researchers: infinite length, concept-evolution, concept-drift and feature-evolution. Infinite length arises because the amount of data has no bound. Concept-drift is due to slow changes in the concept of the stream. Concept-evolution occurs due to the presence of unknown classes in the data. Feature-evolution is due to the progression of new features and the regression of old features. To perform any analytics on data streams, converting them into a form from which knowledge can be extracted is essential. Researchers have proposed various strategies in the past, but most of that research focuses on the problems of infinite length and concept-drift. The research work presented in this paper describes an efficient string-based methodology to process “data streams” and control the challenges of infinite length, concept-evolution and concept-drift.

Highlights

  • With the advancement of technology and the use of the Internet of Things (IoT), the amount of data generated over device communication channels is increasing exponentially

  • An intersection operation is performed on set Sk and set Sj of different classes Cj; if the result is more than 50% [3], it is declared an outlier due to “concept-drift”

  • In this combined approach, OLINNDA works as a novel-class detector and FAE is used for classification
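
The intersection test in the second highlight can be illustrated with a minimal sketch. It is not the paper's implementation: the highlight does not say how the 50% figure is measured, so the code assumes it is the share of Sk's elements that also appear in Sj. The function names, the threshold parameter and the sample string sets are hypothetical.

```python
from typing import Set


def overlap_ratio(s_k: Set[str], s_j: Set[str]) -> float:
    """Fraction of s_k's elements that also occur in s_j.

    Assumption: the overlap is measured relative to s_k.
    Returns 0.0 for an empty s_k to avoid division by zero.
    """
    if not s_k:
        return 0.0
    return len(s_k & s_j) / len(s_k)


def is_drift_outlier(s_k: Set[str], s_j: Set[str], threshold: float = 0.5) -> bool:
    """Flag s_k when its overlap with a set from another class exceeds the threshold."""
    return overlap_ratio(s_k, s_j) > threshold


# Hypothetical string sets taken from two different classes.
s_k = {"sensor", "temp", "alert", "node"}
s_j = {"sensor", "temp", "alert", "video"}

print(overlap_ratio(s_k, s_j))     # 3 shared elements out of 4
print(is_drift_outlier(s_k, s_j))  # overlap exceeds the 50% threshold
```

Measuring the overlap relative to one set makes the test asymmetric; a symmetric measure such as the Jaccard index would be an equally plausible reading of the highlight.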


Summary

Introduction

With the advancement of technology and the use of the Internet of Things (IoT), the amount of data generated over device communication channels is increasing exponentially. Data mining is one of the streams of “database technologies” that deals with processing large volumes of structured and unstructured data. It used to be difficult to store and process the data generated over a communication channel, but researchers have since developed methodologies to overcome this restriction. Data that is generated in text, audio or video format and flows from one network node to another in an uninterrupted fashion is denoted a “data stream”. The main characteristics of streaming data are continuity, a dynamic nature and the lack of a defined format. Its features keep changing regularly, which makes it difficult to process. The four main challenges in processing streaming data are infinite length, concept-evolution, concept-drift and feature-evolution.

