Imputation for Repeated Bounded Outcome Data: Statistical and Machine-Learning Approaches

Urko Aguirre-Larracoechea, Cruz E Borges

doi:10.3390/math9172081

Open Access

Abstract

Real-life data are often bounded and heavy-tailed. Zero-one-inflated beta (ZOIB) regression is used to model such data. There are no appropriate methods to address the problem of missing data in repeated bounded outcomes. We developed an imputation method using ZOIB (i-ZOIB) and compared its performance with that of naïve and machine-learning methods across the distribution shapes and settings designed in a simulation study. Performance was measured using the mean absolute error (MAE), the root-mean-square error (RMSE) and the unscaled mean bounded relative absolute error (UMBRAE). The results varied depending on the missingness rate and mechanism. The i-ZOIB method and the machine-learning methods, namely artificial neural network (ANN), support vector regression (SVR) and random forest (RF), showed the best performance.
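To make the three error measures concrete, the following is a minimal sketch of how they could be computed for a set of imputed values. It is not the authors' code. The function names are hypothetical, the UMBRAE follows the commonly cited definition (mean bounded relative absolute error against a benchmark, then unscaled), and the choice of benchmark imputation (`y_bench`) as well as the tie convention for two zero errors are assumptions.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between true and imputed values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error between true and imputed values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def umbrae(y_true, y_pred, y_bench):
    """Unscaled mean bounded relative absolute error.

    y_bench holds the values produced by a benchmark method (assumed here
    to be some naive imputation). Values below 1 mean the evaluated method
    beats the benchmark; values above 1 mean it does worse.
    """
    ratios = []
    for t, p, b in zip(y_true, y_pred, y_bench):
        e, e_star = abs(t - p), abs(t - b)
        denom = e + e_star
        # Assumption: when both errors are zero, treat the pair as a tie.
        ratios.append(0.5 if denom == 0 else e / denom)
    mbrae = sum(ratios) / len(ratios)
    # MBRAE equal to 1 means the benchmark is perfect and the method is not.
    return float("inf") if mbrae == 1 else mbrae / (1 - mbrae)
```

Because the UMBRAE is relative to a benchmark, a method that reproduces the benchmark exactly scores 1, whereas MAE and RMSE are expressed on the scale of the outcome itself.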

Highlights

In any research study, one of the most important tasks is data analysis.
Little and Rubin [2] have classified these factors into three groups: (a) missing completely at random (MCAR): the missing information is due to chance; (b) missing at random (MAR): the lack of information is conditioned solely by the observed values; and (c) missing not at random (MNAR): the missing information depends on both missing and non-missing information.
Based on the findings provided by the unscaled mean bounded relative absolute error (UMBRAE) boxplots, all methods had similar performance to i-ZOIB.

Summary

Introduction

One of the most important tasks is data analysis. The results of such analysis can support or refute the hypotheses proposed by the researchers. It is, therefore, important to have high-quality data from which to draw and extrapolate conclusions. Missing data are among the most frequent and often-evaluated problems in all types of surveys, especially in repeated or longitudinal studies. In the latter type of studies, the missingness or dropout rates can be affected by many known and unrelated factors, such as refusal to participate or death of the subject. Little and Rubin [2] have classified these factors into three groups: (a) missing completely at random (MCAR): the missing information is due to chance; (b) missing at random (MAR): the lack of information is conditioned solely by the observed values; and (c) missing not at random (MNAR): the missing information depends on both missing and non-missing information.
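The three mechanisms can be illustrated with a small simulation sketch. It is not the simulation design used in the paper: the function name `missing_mask`, the logistic form of the missingness probability, and the `rate` and `slope` parameters are all assumptions chosen for illustration. Here `x` stands for a fully observed covariate and `y` for the bounded outcome whose values may be deleted.

```python
import math
import random


def _logistic(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def missing_mask(x, y, mechanism, rate=0.3, slope=4.0, seed=0):
    """Return a list of booleans; True marks a y value to be set missing.

    Hypothetical illustration. rate must lie strictly between 0 and 1 and
    is the exact missingness probability only under MCAR; under MAR and
    MNAR it sets the baseline of the logistic model.
    """
    rng = random.Random(seed)
    base = math.log(rate / (1.0 - rate))  # logit of the baseline rate
    mask = []
    for xi, yi in zip(x, y):
        if mechanism == "MCAR":
            # Missingness is pure chance, unrelated to any data value.
            p = rate
        elif mechanism == "MAR":
            # Missingness depends only on the observed covariate x.
            p = _logistic(base + slope * (xi - 0.5))
        elif mechanism == "MNAR":
            # Missingness depends on observed x and on the value y itself.
            p = _logistic(base + slope * (xi - 0.5) + slope * (yi - 0.5))
        else:
            raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
        mask.append(rng.random() < p)
    return mask
```

Under MAR in this sketch, records with a larger `x` are more likely to lose their outcome, yet the deleted values themselves play no role. Under MNAR the deleted values also drive the deletion, which is why that mechanism cannot be verified from the observed data alone.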

Methods

Results

Discussion

Conclusion


Journal: Mathematics
Publication Date: Aug 28, 2021
License type: CC BY 4.0
