Abstract

The problem of adaptive learning from evolving and possibly non-stationary data streams has attracted a lot of interest in machine learning in the recent past, and has also stimulated research in related fields, such as computational intelligence and fuzzy systems. In particular, several rule-based methods for the incremental induction of regression models have been proposed. In this paper, we develop a method, called TSK-Streams, that combines the strengths of two existing approaches rooted in different learning paradigms. More concretely, our method adopts basic principles of the state-of-the-art learning algorithm AMRules and enriches them by the representational advantages of fuzzy rules. In a comprehensive experimental study, TSK-Streams is shown to be highly competitive in terms of performance.

Highlights

  • In many practical applications of machine learning and predictive modeling, data is produced incrementally in the course of time and observed in the form of a continuous, potentially unbounded stream of observations

  • The data sets starting with the prefix BNG- are obtained from the online machine learning platform OpenML (Bischl et al. 2017); these large data streams are drawn from Bayesian networks as generative models, after constructing each network from a relatively small data set (we refer to van Rijn et al. (2014) for more details)

  • We introduced a new fuzzy rule learner for adaptive regression on data streams, called TSK-Streams


Summary

Introduction

In many practical applications of machine learning and predictive modeling, data is produced incrementally in the course of time and observed in the form of a continuous, potentially unbounded stream of observations. We give a concise overview of regression learning on data streams as well as a systematic comparison of existing methods with regard to properties such as the discretization of features, splitting criteria for rules, etc. This overview helps to better understand the specificities and characteristics of approaches originating from different research fields, as well as to position our own approach. Compared to the three-layered discretization architecture used by Shaker et al (2017), the use of E-BST (extended binary search trees) for constructing candidate fuzzy sets has a number of advantages in the context of online learning. Most notably, it comes with a reduction of complexity from linear to logarithmic (in the number of candidate extensions).
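To illustrate the E-BST data structure mentioned above, the following is a minimal sketch in the style of Ikonomovska's FIMT-DD (the class and variable names are illustrative, not taken from the paper): each node keys a candidate threshold and accumulates target statistics for the examples with attribute value at most that key, so a single insertion costs time logarithmic in the tree size on average, and one in-order traversal scores every candidate split by its variance reduction.

```python
# Illustrative E-BST sketch (names are hypothetical, not the paper's code).
class EBSTNode:
    def __init__(self, key):
        self.key = key            # candidate threshold (an observed value)
        self.left = None
        self.right = None
        # target statistics of examples with value <= key seen at this node
        self.n = 0
        self.sum_y = 0.0
        self.sum_y2 = 0.0

class EBST:
    def __init__(self):
        self.root = None
        self.n, self.sum_y, self.sum_y2 = 0, 0.0, 0.0  # global totals

    def insert(self, x, y):
        """Insert one (attribute value, target) pair in O(depth)."""
        self.n += 1
        self.sum_y += y
        self.sum_y2 += y * y
        if self.root is None:
            self.root = EBSTNode(x)
        node = self.root
        while True:
            if x <= node.key:
                node.n += 1
                node.sum_y += y
                node.sum_y2 += y * y
                if x == node.key:
                    return
                if node.left is None:
                    node.left = EBSTNode(x)
                node = node.left
            else:
                if node.right is None:
                    node.right = EBSTNode(x)
                node = node.right

    def best_split(self):
        """Return (threshold, variance reduction) of the best binary split."""
        best = [None, -1.0]

        def var(n, s, s2):
            return 0.0 if n == 0 else s2 / n - (s / n) ** 2

        def walk(node, acc_n, acc_s, acc_s2):
            if node is None:
                return
            # accumulated stats cover all examples strictly left of this subtree
            walk(node.left, acc_n, acc_s, acc_s2)
            ln, ls, ls2 = acc_n + node.n, acc_s + node.sum_y, acc_s2 + node.sum_y2
            rn, rs, rs2 = self.n - ln, self.sum_y - ls, self.sum_y2 - ls2
            if ln > 0 and rn > 0:
                red = var(self.n, self.sum_y, self.sum_y2) - (
                    ln * var(ln, ls, ls2) + rn * var(rn, rs, rs2)) / self.n
                if red > best[1]:
                    best[0], best[1] = node.key, red
            walk(node.right, ln, ls, ls2)

        walk(self.root, 0, 0.0, 0.0)
        return tuple(best)
```

For example, after inserting the pairs (10, 10), (2, 0), (11, 10), (1, 0), (12, 10), (3, 0), `best_split()` identifies the threshold 3, which separates the zero-valued from the ten-valued targets and removes all of the target variance.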

Learning regression models on data streams
An overview of existing methods
Trees versus rules
Binary versus gradual membership
Discretization
Splitting criteria
Statistical tests versus engineered parameters
Basic concepts from fuzzy logic
TSK fuzzy systems
Online rule induction
Variance reduction
Error reduction
Single extension
All extensions
Rule consequents
Model structure
Change detection
Empirical evaluation
Results
Conclusion
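The outline above refers to TSK fuzzy systems, in which each rule has a fuzzy antecedent and a linear consequent, and the model output is the membership-weighted average of the rule consequents. The following generic sketch illustrates this inference scheme; the Gaussian fuzzy sets and the product t-norm are illustrative choices, not necessarily those of the paper.

```python
# Generic TSK inference sketch (illustrative, not the paper's exact model).
import math

class TSKRule:
    """IF x is (fuzzy region) THEN y = intercept + coeffs . x"""
    def __init__(self, centers, widths, coeffs, intercept):
        self.centers = centers        # one Gaussian fuzzy set per input
        self.widths = widths
        self.coeffs = coeffs          # linear consequent weights
        self.intercept = intercept

    def membership(self, x):
        # product t-norm over per-dimension Gaussian memberships
        m = 1.0
        for xi, c, s in zip(x, self.centers, self.widths):
            m *= math.exp(-0.5 * ((xi - c) / s) ** 2)
        return m

    def consequent(self, x):
        return self.intercept + sum(w * xi for w, xi in zip(self.coeffs, x))

def tsk_predict(rules, x):
    """Membership-weighted average of rule consequents."""
    weights = [r.membership(x) for r in rules]
    z = sum(weights)
    if z == 0.0:
        return 0.0
    return sum(w * r.consequent(x) for w, r in zip(weights, rules)) / z
```

For instance, with two constant-consequent rules centered at 0 and 2 (outputs 0 and 10, equal widths), a query at x = 1 activates both rules equally, so the prediction is the average, 5.0.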
