RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach

Jessica Pesantez-Narvaez, Montserrat Guillen, Manuela Alcañiz

doi:10.3390/math9050579

Abstract

A boosting-based machine learning algorithm is presented to model a binary response with large imbalance, i.e., a rare event. The new method (i) reduces the prediction error of the rare class, and (ii) approximates an econometric model that allows interpretability. RiskLogitboost regression includes a weighting mechanism that oversamples or undersamples observations according to their misclassification likelihood and a generalized least squares bias correction strategy to reduce the prediction error. An illustration using a real French third-party liability motor insurance data set is presented. The results show that RiskLogitboost regression improves the rate of detection of rare events compared to some boosting-based and tree-based algorithms and some existing methods designed to treat imbalanced responses.
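
As background for the boosting and weighting ideas mentioned in the abstract, the following is a minimal sketch of the classical two-class LogitBoost iteration (Friedman, Hastie and Tibshirani), not of RiskLogitboost itself, whose weighting and bias-correction steps are not spelled out on this page. The single-feature linear weak learner, the toy data, and all function names are assumptions made for illustration only.

```python
import math

def weighted_linear_fit(x, z, w):
    """Weighted least squares of z on one feature x; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    slope = sxz / sxx if sxx > 0 else 0.0
    return zbar - slope * xbar, slope

def logitboost(x, y, n_rounds=20):
    """Classical LogitBoost with a linear weak learner (illustrative only)."""
    intercept, slope = 0.0, 0.0
    p = [0.5] * len(x)  # initial probabilities, corresponding to F(x) = 0
    for _ in range(n_rounds):
        # Weights are largest where the current model is least certain.
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]
        # Working response of the Newton step.
        z = [(yi - pi) / wi for yi, pi, wi in zip(y, p, w)]
        a, b = weighted_linear_fit(x, z, w)
        # F(x) <- F(x) + 0.5 * f(x); F stays linear because f is linear.
        intercept += 0.5 * a
        slope += 0.5 * b
        p = [1.0 / (1.0 + math.exp(-2.0 * (intercept + slope * xi))) for xi in x]
    return intercept, slope

def predict_proba(model, xi):
    """Probability of the event under the fitted model, p = 1 / (1 + exp(-2F))."""
    a, b = model
    return 1.0 / (1.0 + math.exp(-2.0 * (a + b * xi)))

# Hypothetical toy data: one risk factor and a binary response (not separable).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
model = logitboost(x, y)
```

In this sketch each round reweights the observations by p(1 - p) and refits the weak learner to the working response, which is the general mechanism that boosting-based methods for binary responses build on.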

Highlights

Research on rare events is steadily increasing in real-world applications of risk management
The illustrative data are taken from the publicly available data sets in the CASdatasets library in R. The data set contains 413,169 observations, recorded mostly in one year, on risk factors for third-party liability motor policies. It includes the following information about vehicle characteristics: the power of the car, ordered by category (Power); the car brand, divided into seven categories (Brand); and the fuel type, either diesel or regular (Gas)
The results provided by the RiskLogitboost regression suggest that the likelihood of a policy holder having an accident increased if they had a vehicle of Power type e, k, l, m, n, or o; in particular, drivers with o-type Power were the most likely to have an accident among all types of Power

Summary

Introduction

Research on rare events is steadily increasing in real-world applications of risk management. Very few papers in this field study rare events in a binary response, such as [25,26,27], and even fewer go beyond econometric methods, such as [9], which employs advanced machine learning methods. Many machine learning methods are regarded as black boxes in terms of interpretation. They are frequently evaluated using single metrics such as classification accuracy as the only description of complex tasks [32], and they cannot provide robust explanations for high-risk environments.
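
The point about single metrics can be illustrated with a small sketch. It assumes a hypothetical sample of 1,000 policies with 20 events (the counts and function names are invented for illustration and are not taken from the paper) and shows that a rule that never predicts the rare class still scores a high classification accuracy while detecting no events at all.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """Share of observations classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(y_true, y_pred):
    """Share of actual events (class 1) that are detected."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Hypothetical imbalanced sample: 20 events among 1,000 observations.
y_true = [1] * 20 + [0] * 980
# A trivial rule that always predicts the majority class.
always_zero = [0] * len(y_true)

acc = accuracy(y_true, always_zero)      # 0.98
sens = sensitivity(y_true, always_zero)  # 0.0
```

Reporting the detection rate of the rare class alongside accuracy, as in this sketch, is one simple way to avoid mistaking a majority-class rule for a good model.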

Background

Boosting Methods

Penalized Regression Methods

Interpretable Machine Learning

The Rare Event Problem with RiskLogitboost Regression

RiskLogitboost Regression Weighting Mechanism to Improve Rare-Class Learning

Bias Correction with Weights

RiskLogitboost Regression

Illustrative Data

Discussion of Results

Predictive Performance of Extremes

Interpretable RiskLogitboost Regression

Findings

Conclusions

Methods

Journal: Mathematics	Publication Date: Mar 9, 2021
Citations: 2	License type: CC BY 4.0
