Online sequential ensembling of predictive fuzzy systems

Edwin Lughofer, Mahardhika Pratama

doi:10.1007/s12530-021-09398-x


Open Access





Abstract


Evolving fuzzy systems (EFS) have attracted wide attention in the community for learning from data streams in an incremental, single-pass and transparent manner. So far, the main focus has been on developing approaches for single EFS models, used mainly for prediction purposes. Forgetting mechanisms have been used to increase their flexibility, especially to adapt quickly to changing situations such as drifting data distributions. These require forgetting factors that steer the degree to which older learned concepts are out-weighed over time; setting them adequately, either in advance or adaptively, is neither easy nor fully resolved. In this paper, we propose a new concept for learning fuzzy systems from data streams, which we call online sequential ensembling of fuzzy systems (OS-FS). It models the recent dependencies in streams on a chunk-wise basis: for each new incoming chunk, a new fuzzy model is trained from scratch and added to the ensemble of fuzzy systems trained before. This induces (i) maximal flexibility, as variable chunk sizes can be applied according to the actual system delay in receiving target values, and (ii) fast reaction to arising drifts. The latter is realized with specific prediction techniques on new data chunks, based on the sequential ensemble members trained so far. We propose four different prediction variants, including various weighting concepts that put higher weights on members with higher inference certainty when the predictions of single members are amalgamated into a final prediction. In this way, older members, which retain knowledge about past states, may be dynamically reactivated in the case of cyclic drifts, i.e., dynamic changes in the process behavior that re-occur from time to time.
Furthermore, we integrate a concept for properly resolving possible contradictions among members with similar inference certainties. The reaction to drifts is thus handled autonomously, on demand and on the fly, during the prediction stage (and not during the model adaptation/evolution stage, as is conventionally done in single EFS models), which yields enormous flexibility. Finally, in order to cope with large-scale and (theoretically) infinite data streams within a reasonable amount of prediction time, we demonstrate two concepts for pruning past ensemble members, one based on atypically high error trends of single members and one based on the non-diversity of ensemble members. The results on two data streams showed significantly improved performance compared to single EFS models in terms of better convergence of the accumulated chunk-wise ahead prediction error trends, especially in the case of regular and cyclic drifts. Moreover, the more advanced prediction schemes significantly outperformed standard averaging over all members’ outputs, and resolving contradictory outputs among members further improved the performance of the sequential ensemble. Results on a wider range of data streams from different application scenarios showed (i) improved error trend lines over single EFS models, as well as over the related AI methods OS-ELM and MLP neural networks retrained on data chunks, and (ii) slightly worse trend lines than on-line bagged EFS (as specific EFS ensembles), but with around 100 times faster processing times (well below the millisecond range for single-sample updates).
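The chunk-wise scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: each ensemble member is a simple one-dimensional least-squares line standing in for a fuzzy system, the "inference certainty" is assumed to be a Gaussian coverage of the member's training chunk, and pruning simply drops the older member with the highest mean error on later chunks. All class and function names (ChunkModel, SequentialEnsemble, train_from_scratch) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ChunkModel:
    """Stand-in for one fuzzy system (assumption): a 1-D least-squares line
    plus the input region (center, spread) of the chunk it was trained on."""
    slope: float
    intercept: float
    center: float
    spread: float
    err_sum: float = 0.0  # accumulated absolute error on later chunks
    err_n: int = 0        # number of samples behind err_sum

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept

    def certainty(self, x: float) -> float:
        # Assumed certainty measure: Gaussian coverage, close to 1 near the
        # training chunk and close to 0 far away from it.
        return math.exp(-0.5 * ((x - self.center) / self.spread) ** 2)

    def mean_error(self) -> float:
        return self.err_sum / self.err_n if self.err_n else 0.0


def train_from_scratch(xs, ys) -> ChunkModel:
    """Fit a fresh member on one chunk only (ordinary least squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    spread = math.sqrt(sxx / n) or 1.0  # avoid a zero spread
    return ChunkModel(slope, my - slope * mx, mx, spread)


class SequentialEnsemble:
    def __init__(self, max_members: int = 10):
        self.members = []
        self.max_members = max_members

    def predict(self, x: float) -> float:
        """Certainty-weighted average of all members' outputs."""
        if not self.members:
            raise ValueError("ensemble has no members yet")
        weights = [m.certainty(x) for m in self.members]
        total = sum(weights)
        if total < 1e-12:
            # No member covers x: fall back to plain averaging.
            return sum(m.predict(x) for m in self.members) / len(self.members)
        return sum(w * m.predict(x) for w, m in zip(weights, self.members)) / total

    def update(self, xs, ys) -> None:
        """Score existing members on the new chunk, then add a new member."""
        for m in self.members:
            m.err_sum += sum(abs(m.predict(x) - y) for x, y in zip(xs, ys))
            m.err_n += len(xs)
        self.members.append(train_from_scratch(xs, ys))
        if len(self.members) > self.max_members:
            # Simplified pruning (assumption): drop the older member with the
            # highest mean error; the newest member is never a candidate.
            worst = max(range(len(self.members) - 1),
                        key=lambda i: self.members[i].mean_error())
            del self.members[worst]


# Two chunks from different regimes: y = 2x near x = 0.5, y = 10 - x near x = 5.5.
ens = SequentialEnsemble(max_members=2)
xs_a = [0.0, 0.25, 0.5, 0.75, 1.0]
xs_b = [5.0, 5.25, 5.5, 5.75, 6.0]
ens.update(xs_a, [2 * x for x in xs_a])
ens.update(xs_b, [10 - x for x in xs_b])
print(ens.predict(0.5), ens.predict(5.5))
```

In this toy setting, a query near an earlier chunk is answered almost entirely by the member trained on that chunk, which mirrors how older members can be reactivated when a past state re-occurs. The four prediction variants, the contradiction handling and the two pruning concepts of the paper are not reproduced here.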

Highlights

Due to the increasing complexity, dynamics and non-stationarity of current industrial installations (Lughofer and Sayed-Mouchaweh 2019; Angelov et al 2010, 2012), the static design of fuzzy systems, e.g. realized through coding of expert knowledge (Siler and Buckley 2005) or through conventional batch learning techniques from pre-collected off-line data (Nelles 2001; Babuska 1998), has become a bottleneck in terms of permanent adaptability and flexibility to react properly to changing situations.
We show the error trend lines over time as achieved by single evolving fuzzy system (EFS) models, by related state-of-the-art (SoA) methods and by our new OS-FS scheme in the same plots. This allows a direct comparison of whether our new approach brings some improvement, i.e. whether it shows a better decreasing error behavior or lower error trends in major parts of the streams.
It is easy to see that the single EFS model and the prediction schemes based on native averaging of ensemble members’ outputs and on weighted averaging using coverage degree-based weights perform very similarly over the whole stream: they show a significant sudden increase in the case of the abrupt drift and a more moderate one when the gradual cyclic drift starts (while showing robust and converging behavior in between and towards the end of the stream).

Summary

Introduction

Due to the increasing complexity, dynamics and non-stationarity of current industrial installations (Lughofer and Sayed-Mouchaweh 2019; Angelov et al 2010, 2012), the static design of fuzzy systems, e.g. realized through coding of expert knowledge (Siler and Buckley 2005) or through conventional batch learning techniques from pre-collected off-line data (Nelles 2001; Babuska 1998), has become a bottleneck in terms of permanent adaptability and flexibility to react properly to changing situations. This has been achieved within a wider range of real-world applications, e.g., predictive maintenance, fault detection and prognosis, time-series forecasting, (bio-)medical applications and user behavior identification, to name a few; see Lughofer (2016) for a longer list. This is because evolving (neuro-)fuzzy systems (E(N)FS) possess the ability (i) to learn their parameters and structure from stream samples in an incremental, single-pass manner, achieving fast stream processing capabilities, (ii) to adapt quickly to regular system fluctuations and to new operating conditions by expanding their knowledge on the fly, and (iii) to react properly to drifting (changing) system situations (Khamassi et al 2017) and non-stationary environments by means of forgetting mechanisms. This is typically achieved in a fully autonomous manner (Angelov 2011, 2012).
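To make the role of a forgetting factor in incremental, single-pass learning concrete, the following minimal sketch uses recursive least squares with exponential forgetting on a linear model y ≈ w0 + w1·x. This is a generic textbook technique chosen for illustration, not the specific update used in any particular E(N)FS method; the class name ForgettingRLS and the parameters lam (forgetting factor) and delta (initial scale of the inverse covariance-like matrix P) are hypothetical.

```python
class ForgettingRLS:
    """Recursive least squares for y ~ w0 + w1*x with forgetting factor lam.
    lam = 1.0 means no forgetting; smaller values out-weigh older samples faster."""

    def __init__(self, lam: float = 0.95, delta: float = 1000.0):
        self.lam = lam
        self.w = [0.0, 0.0]
        self.P = [[delta, 0.0], [0.0, delta]]

    def predict(self, x: float) -> float:
        return self.w[0] + self.w[1] * x

    def update(self, x: float, y: float) -> None:
        """Single-pass update from one stream sample (x, y)."""
        r = [1.0, x]  # regressor vector
        Pr = [self.P[0][0] * r[0] + self.P[0][1] * r[1],
              self.P[1][0] * r[0] + self.P[1][1] * r[1]]
        denom = self.lam + r[0] * Pr[0] + r[1] * Pr[1]
        gain = [Pr[0] / denom, Pr[1] / denom]
        err = y - self.predict(x)
        self.w = [self.w[i] + gain[i] * err for i in range(2)]
        # P <- (P - gain * r^T P) / lam, where r^T P equals Pr^T for symmetric P.
        self.P = [[(self.P[i][j] - gain[i] * Pr[j]) / self.lam for j in range(2)]
                  for i in range(2)]


# Synthetic stream with an abrupt drift: y = 2x + 1 for 50 samples, then y = -x + 4.
fast = ForgettingRLS(lam=0.9)   # forgets old samples quickly
slow = ForgettingRLS(lam=1.0)   # never forgets
for t in range(250):
    x = (t % 10) / 10.0
    y = 2 * x + 1 if t < 50 else -x + 4
    fast.update(x, y)
    slow.update(x, y)
print(fast.w, slow.w)
```

In this noise-free example, the model with forgetting tracks the post-drift relation, while the model without forgetting settles on a compromise between the old and the new relation. How small the forgetting factor should be depends on the drift speed and the noise level, which is the setting problem noted in the abstract.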


Conclusion