Abstract

A common approach to classifying data is to combine a machine learning algorithm with an ensemble method such as bagging or boosting. In earlier studies, both methods have been shown to be highly accurate. The aim of this research is to compare the performance of bagging and boosting in classifying rainfall data recorded at the Sultan Syarif Kasim II Meteorological Station in Pekanbaru from 1 January 2018 to 31 July 2021. The rainfall data are classified into two categories: rainy and non-rainy. The predictor variables are average temperature, average humidity, sunshine duration, wind direction at maximum speed, and average wind speed. For the comparison, this study builds Stochastic Gradient Boosting with Gradient Boosting Modelling and C5.0 as the boosting algorithms, and Bagged Classification and Regression Tree (CART) and Random Forest as the bagging algorithms. To obtain reliable conclusions, each algorithm is run 30 times with repeated cross-validation. The results show that Stochastic Gradient Boosting with Gradient Boosting Modelling is the best algorithm in terms of average accuracy.
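The evaluation protocol described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes scikit-learn, uses a synthetic five-feature binary dataset as a placeholder for the station records, and interprets "30 runs" as 10-fold cross-validation repeated 3 times, which the abstract does not specify. C5.0 is left out because scikit-learn has no implementation of it, and all hyperparameter values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): 5 numeric predictors and a binary label,
# standing in for the weather variables and the rainy / non-rainy classes.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)

# Assumed resampling scheme: 10 folds x 3 repeats = 30 accuracy estimates.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)

models = {
    # subsample < 1.0 makes gradient boosting stochastic.
    "Stochastic Gradient Boosting": GradientBoostingClassifier(
        subsample=0.5, n_estimators=50, random_state=0
    ),
    # Bagging over CART-style decision trees.
    "Bagged CART": BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=25, random_state=0
    ),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

mean_accuracy = {}
for name, model in models.items():
    # One accuracy score per resample; the mean is the comparison metric.
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (sd {scores.std():.3f})")
```

Because the data here are synthetic, the ranking this sketch prints says nothing about the rainfall results reported in the study; it only shows how average accuracy over repeated resamples can be used to compare ensemble methods.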
