Abstract

The generalized linear model (GLM) is the traditional method for predicting losses in automobile insurance. Ensemble learning algorithms have recently shown promising results in this area and offer a new option for loss prediction. In the age of big data, predicting automobile insurance losses more accurately is a pressing problem. Stacking is an active topic in ensemble learning and has been used effectively in many fields, but few researchers have applied it to automobile insurance. In this research, stacking was introduced into automobile insurance loss prediction to address this problem. Three automobile insurance datasets were used. After the Synthetic Minority Oversampling Technique (SMOTE) was applied to balance the classes, a claim occurrence model was built with four methods: logistic regression (a GLM) and three ensemble learning methods, namely bagging, boosting, and stacking. The area under the receiver operating characteristic curve (AUC) and the F1-score of the four methods were then compared to assess classification performance. Ensemble algorithms were used to rank the importance of features in the FRE dataset. Finally, we combined the predicted claim probabilities with the bonus-malus system to formulate a fairer transfer strategy. The results showed that the proposed stacking approach performed better than the other methods on all datasets, with significantly enhanced prediction accuracy.
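The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (generated with scikit-learn's `make_classification`), the helper `smote_like_oversample` is a hypothetical, simplified stand-in for SMOTE, and the choice of base learners, meta-learner, and hyperparameters is an assumption made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote_like_oversample(X, y, k=5, seed=0):
    """Hypothetical simplified SMOTE: create synthetic minority (y == 1) rows
    by interpolating between a minority point and one of its k nearest
    minority neighbours until both classes have the same size."""
    rng = np.random.default_rng(seed)
    X_min = X[y == 1]
    n_new = int((y == 0).sum() - (y == 1).sum())
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is returned as its own neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    synthetic = X_min[base] + gap * (X_min[chosen] - X_min[base])
    y_new = np.ones(n_new, dtype=y.dtype)
    return np.vstack([X, synthetic]), np.concatenate([y, y_new])


# Synthetic, imbalanced stand-in for a claim occurrence dataset (assumption).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Balance only the training split; the test split keeps its original imbalance.
X_bal, y_bal = smote_like_oversample(X_train, y_train)

models = {
    "logistic regression (GLM)": LogisticRegression(max_iter=1000),
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
    # Stacking: base learners' cross-validated predictions feed a meta-learner.
    "stacking": StackingClassifier(
        estimators=[
            ("bag", BaggingClassifier(n_estimators=50, random_state=0)),
            ("gb", GradientBoostingClassifier(random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_bal, y_bal)
    proba = model.predict_proba(X_test)[:, 1]  # predicted claim probability
    results[name] = {
        "auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, model.predict(X_test)),
    }
    print(f"{name}: AUC={results[name]['auc']:.3f}, F1={results[name]['f1']:.3f}")
```

On synthetic data the ranking of the four methods carries no meaning; the sketch only shows how the pieces fit together. In practice a maintained SMOTE implementation would likely be preferable to the simplified helper used here.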

