Abstract

Missing values in databases are a common and almost inevitable problem; however, most publications rarely report how they were handled. Multiple imputation (MI) is an efficient method for statistically estimating missing values from incomplete data. The objective of this study was to evaluate the efficiency of MI using the MICE (Multivariate Imputation by Chained Equations) algorithm to fill in missing data in a database of soil physico-hydrical properties, and to show that imputation is more feasible than complete case analysis (CCA). A preliminary analysis of the database was performed to check the suitability of the proposed algorithm. The missing data of each variable were imputed using linear regression models, in which the variable with missing data entered as the dependent variable and the other variables correlated with it entered as covariates. The two approaches were compared on their estimates, standard errors, and 95% confidence intervals. MICE performed better than CCA: although the two methods gave similar statistical results, multiple imputation maintains the size of the database and preserves the general distribution of the data. MI is a prominent method for handling missing data. With this study, we aim to help more soil researchers start using MI techniques instead of inferior approaches, in order to improve the accuracy of statistical analyses. Our study confirmed that multiple imputation is applicable to missing data in soil property databases.
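
The chained-equations idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mice_linear` and `pool_mean`, the synthetic three-column data, and the choice of five imputations are all assumptions made for illustration. Each variable with missing values is regressed on the other variables by ordinary least squares, and the missing entries are replaced by the prediction plus Gaussian noise. For brevity the regression coefficients are not redrawn from their posterior, which a full MICE implementation would normally do. The completed datasets are then pooled with Rubin's rules and contrasted with the number of rows that complete case analysis would retain.

```python
import numpy as np

def mice_linear(X, n_iter=10, rng=None):
    """Return one completed copy of X using chained linear regressions (sketch)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # Initial fill: replace each missing entry with its column mean.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Variable j is the dependent variable; the others are covariates.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(len(filled)), others])
            coef, *_ = np.linalg.lstsq(A[~miss_j], filled[~miss_j, j], rcond=None)
            resid = filled[~miss_j, j] - A[~miss_j] @ coef
            dof = max((~miss_j).sum() - A.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Prediction plus residual noise, so imputations vary between copies.
            filled[miss_j, j] = A[miss_j] @ coef + rng.normal(0.0, sigma, miss_j.sum())
    return filled

def pool_mean(completed, col):
    """Pool the mean of one column over m completed datasets (Rubin's rules)."""
    m = len(completed)
    estimates = np.array([d[:, col].mean() for d in completed])
    within = np.mean([d[:, col].var(ddof=1) / len(d) for d in completed])
    between = estimates.var(ddof=1)
    total_var = within + (1 + 1 / m) * between
    return estimates.mean(), np.sqrt(total_var)

# Synthetic, hypothetical data: three correlated variables, about 20% missing in two.
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = 2 * x0 + rng.normal(scale=0.5, size=n)
x2 = x0 - x1 + rng.normal(scale=0.5, size=n)
data = np.column_stack([x0, x1, x2])
data[rng.random(n) < 0.2, 1] = np.nan
data[rng.random(n) < 0.2, 2] = np.nan

completed = [mice_linear(data, rng=seed) for seed in range(5)]
est, se = pool_mean(completed, col=1)
complete_cases = data[~np.isnan(data).any(axis=1)]

print("rows kept by CCA:", len(complete_cases), "of", len(data))
print("pooled mean of column 1:", est)
print("95% CI:", (est - 1.96 * se, est + 1.96 * se))
```

The confidence interval here uses a normal approximation; Rubin's rules also define adjusted degrees of freedom for a t-based interval, which is omitted to keep the sketch short.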
