Evolving data streams, especially those with concept drift, are generally regarded as challenging for regression tasks, because models trained on old data often fail to adapt to new data, which leads to dramatic performance degradation. Moreover, the behavior of a data stream may change in different ways and thereby introduce various types of concept drift, e.g., abrupt, incremental, gradual, recurring, or even more complex drifts. Although some algorithms can adapt to stationary data streams or to a specific type of concept drift in non-stationary data streams, a wide range of practical applications call for regression models that can handle multiple types of data streams. In this work, we propose an online learning strategy called adaptive long and short-term memories online Random Forests regression (ALSM-RFR), in which an adaptive memory activation mechanism lets the model switch adaptively between long-term and hybrid (long-term plus short-term) memory modes when facing stationary data streams or non-stationary data streams with different types of concept drift. In particular, leaf and tree weights in random forests are used to learn information at different timescales, namely, long-term and short-term memories. The adaptive memory activation mechanism formulates the decision to switch memory modes as a classification problem. Numerical experiments on several real and synthetic datasets show remarkable improvements of the proposed method over state-of-the-art online approaches, both in adaptability to different stream types and in predictive accuracy. In addition, the convergence of the method and the influence of its parameters are evaluated.
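To make the idea of switching between a long-term and a hybrid memory mode more concrete, the following is a minimal illustrative sketch and not the ALSM-RFR algorithm itself. It assumes a fixed set of base predictors (standing in for trees), keeps one slowly updated and one quickly updated error estimate per predictor, and replaces the classification-based activation mechanism with a simple threshold rule on recent errors. The class name `TwoTimescaleBlender` and the parameters `slow_rate`, `fast_rate`, `window`, and `tol` are hypothetical and chosen only for illustration.

```python
from collections import deque


class TwoTimescaleBlender:
    """Blend base predictions with slow (long-term) and fast (short-term) weights.

    Illustrative only: the weighting scheme and the switching rule are
    assumptions, not the method described in the abstract.
    """

    def __init__(self, n_models, slow_rate=0.01, fast_rate=0.5, window=20, tol=1e-3):
        self.slow_err = [0.0] * n_models  # slowly updated squared error per model
        self.fast_err = [0.0] * n_models  # quickly updated squared error per model
        self.slow_rate = slow_rate
        self.fast_rate = fast_rate
        self.tol = tol
        self.recent_long = deque(maxlen=window)    # recent errors of long-term mode
        self.recent_hybrid = deque(maxlen=window)  # recent errors of hybrid mode

    @staticmethod
    def _weights(errors):
        # Inverse-error weights, normalised to sum to one.
        inv = [1.0 / (e + 1e-9) for e in errors]
        total = sum(inv)
        return [v / total for v in inv]

    def _mode_predictions(self, preds):
        w_long = self._weights(self.slow_err)
        w_short = self._weights(self.fast_err)
        # Hybrid mode: plain average of long-term and short-term weights (assumption).
        w_hybrid = [(a + b) / 2.0 for a, b in zip(w_long, w_short)]
        long_pred = sum(w * p for w, p in zip(w_long, preds))
        hybrid_pred = sum(w * p for w, p in zip(w_hybrid, preds))
        return long_pred, hybrid_pred

    def mode(self):
        # Threshold rule standing in for the paper's classification-based switch.
        if not self.recent_long:
            return "long"
        mean_long = sum(self.recent_long) / len(self.recent_long)
        mean_hybrid = sum(self.recent_hybrid) / len(self.recent_hybrid)
        return "hybrid" if mean_long - mean_hybrid > self.tol else "long"

    def predict(self, preds):
        long_pred, hybrid_pred = self._mode_predictions(preds)
        return hybrid_pred if self.mode() == "hybrid" else long_pred

    def update(self, preds, y):
        # Score both modes on the new target, then refresh both error estimates.
        long_pred, hybrid_pred = self._mode_predictions(preds)
        self.recent_long.append((long_pred - y) ** 2)
        self.recent_hybrid.append((hybrid_pred - y) ** 2)
        for i, p in enumerate(preds):
            err = (p - y) ** 2
            self.slow_err[i] += self.slow_rate * (err - self.slow_err[i])
            self.fast_err[i] += self.fast_rate * (err - self.fast_err[i])
```

In this sketch, calling `update(preds, y)` once per arriving sample keeps both memories current, and `mode()` reports which memory mode the threshold rule would currently activate. On a stationary stream the two modes perform almost identically, so the rule stays in long-term mode; after an abrupt change the fast weights react first, which is what moves the rule toward hybrid mode.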