Abstract
Real-life data are often bounded and heavy-tailed, and zero-one-inflated beta (ZOIB) regression is used to model such variables. There are no appropriate methods to address the problem of missing data in repeated bounded outcomes. We developed an imputation method based on ZOIB (i-ZOIB) and compared its performance with that of naïve and machine-learning methods across the distribution shapes and settings of a simulation study. Performance was measured using the mean absolute error (MAE), the root-mean-square error (RMSE) and the unscaled mean bounded relative absolute error (UMBRAE). The results varied depending on the missingness rate and mechanism. The i-ZOIB and the machine-learning ANN, SVR and RF methods showed the best performance.
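To make the three error measures concrete, the following is a minimal Python sketch of how MAE, RMSE and UMBRAE could be computed for imputed values. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of benchmark imputation, and the convention of scoring a point as 0.5 when both the method and the benchmark are exactly right are all assumptions made here. UMBRAE is taken in its usual form, where the bounded relative absolute error of each point is |e| / (|e| + |e*|), with e* the error of a benchmark method, and the mean of these values (MBRAE) is unscaled as MBRAE / (1 - MBRAE).

```python
import math


def mae(actual, imputed):
    """Mean absolute error between true and imputed values."""
    errors = [abs(a - p) for a, p in zip(actual, imputed)]
    return sum(errors) / len(errors)


def rmse(actual, imputed):
    """Root-mean-square error between true and imputed values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, imputed)]
    return math.sqrt(sum(squared) / len(squared))


def umbrae(actual, imputed, benchmark):
    """Unscaled mean bounded relative absolute error.

    `benchmark` holds the values produced by a reference method
    (hypothetical here, e.g. a naive imputation). Values below 1 mean
    the evaluated method beats the benchmark; values above 1 mean it
    does worse.
    """
    bounded = []
    for a, p, b in zip(actual, imputed, benchmark):
        e = abs(a - p)        # error of the evaluated method
        e_star = abs(a - b)   # error of the benchmark method
        if e + e_star == 0:
            # Assumed convention: both methods exact, so call it a tie.
            bounded.append(0.5)
        else:
            bounded.append(e / (e + e_star))
    mbrae = sum(bounded) / len(bounded)
    if mbrae == 1:
        # Benchmark is exact everywhere and the method is not.
        return math.inf
    return mbrae / (1 - mbrae)


# Toy bounded outcomes in [0, 1]; the numbers are made up for illustration.
truth = [0.2, 0.5, 0.9]
method = [0.3, 0.5, 0.7]
naive = [0.5, 0.5, 0.5]
print(mae(truth, method), rmse(truth, method), umbrae(truth, method, naive))
```

Because UMBRAE is relative to a benchmark, it is only comparable across methods when the same benchmark is used throughout.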
Highlights
In any research study, one of the most important tasks is data analysis
Little and Rubin [2] have classified these factors into three groups: (a) missing completely at random (MCAR): the missing information is due to chance; (b) missing at random (MAR): the lack of information is conditioned solely by the observed values; and (c) missing not at random (MNAR): the missing information depends on both missing and non-missing information
Based on the findings provided by the unscaled mean bounded relative absolute error (UMBRAE) boxplots, all methods had similar performance to i-ZOIB
Summary
In any research study, one of the most important tasks is data analysis. The results of such analysis can support or refute the hypotheses proposed by the researchers. It is therefore important to have high-quality data from which to draw and extrapolate conclusions. Missing data are among the most frequent and often-evaluated problems in all types of surveys, especially in repeated or longitudinal studies. In the latter type of study, the missingness or dropout rates can be affected by many known and unrelated factors such as refusal to participate, death of the subject, etc. Little and Rubin [2] have classified these factors into three groups: (a) missing completely at random (MCAR): the missing information is due to chance; (b) missing at random (MAR): the lack of information is conditioned solely by the observed values; and (c) missing not at random (MNAR): the missing information depends on both missing and non-missing information.
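The three mechanisms can be illustrated with a small simulation. The sketch below is a hypothetical toy example, not the simulation design of the study: the function names, the linear link between a driver variable and the drop probability, and the assumption that drivers lie in [0, 1] are all choices made here for illustration. For simplicity, the MNAR case lets missingness depend only on the outcome itself.

```python
import random


def mcar_mask(n, rate, rng):
    """MCAR: every value has the same chance of being missing."""
    return [rng.random() < rate for _ in range(n)]


def _mask_from_driver(driver, rate, rng):
    # Assumed link: drop probability grows linearly with the driver,
    # averaging roughly `rate` when the driver is uniform on [0, 1].
    return [rng.random() < min(1.0, 2 * rate * d) for d in driver]


def mar_mask(observed_covariate, rate, rng):
    """MAR: missingness depends only on a fully observed covariate."""
    return _mask_from_driver(observed_covariate, rate, rng)


def mnar_mask(outcome, rate, rng):
    """MNAR: missingness depends on the outcome value that goes missing."""
    return _mask_from_driver(outcome, rate, rng)


rng = random.Random(0)
n = 10_000
covariate = [rng.random() for _ in range(n)]  # always observed
outcome = [rng.random() for _ in range(n)]    # bounded outcome in [0, 1]

masks = {
    "MCAR": mcar_mask(n, 0.3, rng),
    "MAR": mar_mask(covariate, 0.3, rng),
    "MNAR": mnar_mask(outcome, 0.3, rng),
}
for name, mask in masks.items():
    kept = [y for y, missing in zip(outcome, mask) if not missing]
    print(name, "missing fraction:", sum(mask) / n,
          "mean of observed outcomes:", sum(kept) / len(kept))
```

In this toy setup the mean of the observed outcomes stays close to the true mean under MCAR, whereas under MNAR it is pulled downwards because larger outcomes are more likely to be dropped. This is the kind of distortion an imputation method has to cope with.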