Identifying potential significant factors impacting zero-inflated proportion data.

Mélina Ribaud, Samuel Soubeyrand, Joseph Hughes, Edith Gabriel

doi:10.1002/sim.9814

Abstract

Classical supervised methods such as linear regression and decision trees are not well suited to identifying the factors that affect a response variable consisting of zero-inflated proportion data (ZIPD) that are dependent, continuous, and bounded. In this article, we propose a within-block permutation-based methodology to identify factors (discrete or continuous) that are significantly correlated with ZIPD. We also propose a performance indicator that quantifies the percentage of correlation explained by the subset of significant factors, and we show how to predict the ranks of the response variables conditionally on the observed values of these factors. The methodology is illustrated on simulated data and on two real epidemiological data sets. In the first data set, the ZIPD are probabilities of transmission of influenza between horses. In the second, the ZIPD are probabilities that geographic entities (e.g., states and countries) have the same COVID-19 mortality dynamics.
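To make the idea of a within-block permutation test concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the association is measured with Spearman's rank correlation, that the factor values (rather than the responses) are shuffled, and that each observation carries a block label within which values are treated as exchangeable. The abstract does not specify any of these choices, and the function names (`average_ranks`, `spearman`, `within_block_permutation_pvalue`) are hypothetical.

```python
import random
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def within_block_permutation_pvalue(response, factor, blocks, n_perm=999, seed=0):
    """Two-sided permutation p-value, shuffling the factor only within blocks.

    Assumption (not from the paper): `blocks` labels groups of observations
    that are exchangeable under the null hypothesis of no association.
    """
    rng = random.Random(seed)
    observed = abs(spearman(response, factor))

    # Collect the observation indices belonging to each block.
    members = defaultdict(list)
    for i, label in enumerate(blocks):
        members[label].append(i)

    exceed = 0
    for _ in range(n_perm):
        permuted = list(factor)
        for idx in members.values():
            shuffled = [factor[i] for i in idx]
            rng.shuffle(shuffled)
            for i, value in zip(idx, shuffled):
                permuted[i] = value
        if abs(spearman(response, permuted)) >= observed:
            exceed += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Toy zero-inflated proportions in two blocks (made-up values, for illustration only).
response = [0.0, 0.0, 0.0, 0.1, 0.4, 0.0, 0.0, 0.2, 0.5, 0.9]
factor = [1.0, 2.0, 1.5, 3.0, 4.0, 0.5, 1.0, 2.5, 3.5, 5.0]
blocks = ["a"] * 5 + ["b"] * 5
p_value = within_block_permutation_pvalue(response, factor, blocks)
```

Restricting the shuffles to blocks keeps any between-block structure fixed, so only the within-block association contributes to the null distribution. The rank-based statistic is used here because it tolerates the many tied zeros; another association measure could be substituted.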

Journal: Statistics in Medicine