Abstract

Processing data streams imposes demands that do not exist in static environments. In online learning, the probability distribution of the data can change over time (concept drift). The prequential assessment methodology is commonly used to evaluate the performance of classifiers on data streams with stationary and non‐stationary distributions. It is based on the premise that the purpose of statistical inference is to make sequential probability forecasts for future observations, rather than to express information about the past accuracy achieved. This article empirically evaluates the prequential methodology considering its three common strategies for updating the prediction model, namely Basic Window, Sliding Window, and Fading Factors. Specifically, it aims to identify which of these variations most accurately reflects past results in scenarios where concept drifts occur, with particular interest in the accuracy observed over the entire data stream. The prequential accuracy of the three variations and the real accuracy obtained in the learning process of each dataset are the basis for this evaluation. The experimental results suggest that Prequential with the Sliding Window variation is the best alternative.
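To make the three strategies concrete, below is a minimal sketch of prequential (test-then-train) accuracy. It assumes a hypothetical model object with `predict` and `learn` methods, and interprets Basic Window as a landmark window over all examples seen so far; the Fading Factors accumulators follow the common recursive form S(i) = hit + α·S(i−1), N(i) = 1 + α·N(i−1). This is an illustrative sketch, not the authors' implementation.

```python
from collections import deque

def prequential_accuracy(stream, model, strategy="sliding", window=500, alpha=0.999):
    """Prequential (test-then-train) accuracy trace for a data stream.

    strategy: 'basic'   -> landmark window: accuracy over all examples so far
              'sliding' -> accuracy over the last `window` examples
              'fading'  -> exponentially weighted accuracy with factor `alpha`
    `model` is assumed to expose predict(x) and learn(x, y) (hypothetical API).
    """
    hits = deque(maxlen=window)   # recent outcomes for the sliding window
    s = n = 0.0                   # fading-factor accumulators
    correct = total = 0           # landmark (basic window) counters
    history = []
    for x, y in stream:
        hit = int(model.predict(x) == y)  # test first...
        model.learn(x, y)                 # ...then train on the same example
        if strategy == "sliding":
            hits.append(hit)
            acc = sum(hits) / len(hits)
        elif strategy == "fading":
            s = hit + alpha * s
            n = 1.0 + alpha * n
            acc = s / n
        else:  # 'basic': landmark window from the start of the stream
            correct += hit
            total += 1
            acc = correct / total
        history.append(acc)
    return history
```

The sliding window and fading factors both discount old examples, which is why they track the current concept more closely after a drift than the landmark (basic) estimate.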

