Missing Mechanism Research Articles

Incomplete data are a prevalent complication in longitudinal studies because individuals drop out before the intended completion time. Methods currently available in commercial software for analyzing incomplete longitudinal data at best rely on the ignorability of the drop-outs. If the underlying missing mechanism is non-ignorable, bias may arise in the statistical inferences. To remove this bias, joint complete-data and drop-out models have been proposed, but they involve computational difficulties and untestable assumptions. Since the critical ignorability assumption is unverifiable based on the observed part of the sample, some local sensitivity indices have been proposed in the literature. Specifically, Eftekhari Mahabadi (Second-order local sensitivity to non-ignorability in Bayesian inferences. Stat Med 2018;59:55-95) proposed a second-order local sensitivity tool for Bayesian analysis of cross-sectional studies and showed that it handles bias better than the first-order ones. In this paper, we aim to extend this index to the Bayesian sensitivity analysis of normal longitudinal studies with drop-outs. The index is derived from a selection model for the drop-out mechanism and a Bayesian linear mixed-effect complete-data model. The presented formulas are calculated using the posterior estimates and draws from the simpler ignorable model. The method is illustrated via simulation studies and a sensitivity analysis of real antidepressant clinical trial data. Overall, the numerical analysis showed that when repeated outcomes are subject to missingness, regression coefficient estimates are well approximated by a linear function in the neighbourhood of the MAR model, but there is a considerable amount of second-order sensitivity for the error term and random effect variances in the Bayesian linear mixed-effect model framework.


The Health and Aging Brain Study–Health Disparities (HABS–HD) project seeks to understand the biological, social, and environmental factors that impact brain aging among diverse communities. A common issue for HABS–HD is missing data. It is impossible to achieve accurate machine learning (ML) if data contain missing values. Therefore, developing a new imputation methodology has become an urgent task for HABS–HD. The three missing data assumptions, (1) missing completely at random (MCAR), (2) missing at random (MAR), and (3) missing not at random (MNAR), necessitate distinct imputation approaches for each mechanism of missingness. Several popular methods for handling missing data, including listwise deletion, min, mean, predictive mean matching (PMM), classification and regression trees (CART), and missForest, may result in biased outcomes and reduced statistical power when applied to downstream analyses such as testing hypotheses related to clinical variables or using machine learning to predict Alzheimer’s disease (AD) or mild cognitive impairment (MCI). Moreover, these commonly used techniques can produce unreliable estimates of missing values if they do not account for the missingness mechanisms or if the imputation method is inconsistent with the missing data mechanism in HABS–HD. Therefore, we proposed a three-step workflow to handle missing data in HABS–HD: (1) missing data evaluation, (2) imputation, and (3) imputation evaluation. First, we explored the missingness in HABS–HD. Then, we developed a machine learning-based multiple imputation method (MLMI) for imputing missing values. We built four ML-based imputation models (support vector machine (SVM), random forest (RF), extreme gradient boosting (XGB), and lasso and elastic-net regularized generalized linear model (GLMNET)) and adapted the four models to multiple imputation using the simple averaging method. Lastly, we evaluated and compared MLMI with other common methods.
Our results showed that the three-step workflow worked well for handling missing values in HABS–HD and that the ML-based multiple imputation method outperformed other common methods in terms of prediction performance and change in distribution and correlation. The choice of missing-data handling methodology has a significant impact on the accompanying statistical analyses of HABS–HD. The conceptual three-step workflow and the ML-based multiple imputation method perform well for our Alzheimer’s disease models. They can also be applied to other disease data analyses.
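
To make the distinction between the three missingness mechanisms named above concrete, the following is a minimal simulation sketch, not the HABS–HD workflow or the MLMI method. All names, coefficients, and the logistic form of the missingness probabilities are illustrative assumptions. It generates a covariate `x` that is always observed and an outcome `y` that may be missing, then compares the mean of the observed `y` values with the full-data mean under each mechanism.

```python
import math
import random


def sigmoid(z):
    """Logistic function, used here as an assumed missingness model."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=20000, seed=0):
    """Draw (x, y) pairs; x is always observed, y may go missing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 0.8 * x + rng.gauss(0.0, 0.6)  # hypothetical outcome model
        rows.append((x, y))
    return rows


def observed_mean(rows, p_missing, seed=1):
    """Mean of y over the rows that survive the missingness mechanism."""
    rng = random.Random(seed)
    kept = [y for x, y in rows if rng.random() >= p_missing(x, y)]
    return sum(kept) / len(kept)


rows = simulate()
true_mean = sum(y for _, y in rows) / len(rows)

# Hypothetical mechanisms: probability that y is missing for a given row.
mechanisms = {
    "MCAR": lambda x, y: 0.3,                      # unrelated to the data
    "MAR": lambda x, y: sigmoid(1.5 * x - 0.85),   # depends on observed x only
    "MNAR": lambda x, y: sigmoid(1.5 * y - 0.85),  # depends on unobserved y
}
means = {name: observed_mean(rows, p) for name, p in mechanisms.items()}

print(f"full-data mean of y: {true_mean:+.3f}")
for name, value in means.items():
    print(f"{name:4s} observed mean: {value:+.3f}")
```

Under these assumptions the observed mean stays close to the full-data mean only under MCAR. Under MAR the shift can in principle be corrected by conditioning on `x`, whereas under MNAR the missingness depends on the unobserved value itself, so the observed data alone cannot identify the correction.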



Articles published on Missing Mechanism

Handling missing data when estimating causal effects with targeted maximum likelihood estimation.

Missing Data Analysis.

Global photosynthetic capacity jointly determined by enzyme kinetics and eco-evo-environmental drivers

Compatibility in Missing Data Handling Across the Prediction Model Pipeline: A Simulation Study.

Quantifying bias due to missing data in quality of life surveys of advanced-stage cancer patients.

A Model-Based Approach to the Disentanglement and Differential Treatment of Engaged and Disengaged Item Omissions

Missing data: Issues, concepts, methods

A Novel Truncated Normal Tensor Completion Method for Multi-Source Fusion Data

Analysis of Incomplete Data Under Different Missingness Mechanism using Imputation Methods for Wheat Genotypes

Smdi: an R package to perform structural missing data investigations on partially observed confounders in real-world evidence studies.

Studying the dynamics of the drug processing of pyrazinamide in Mycobacterium tuberculosis.

Oracle-efficient estimation for the mean function of missing covariate data based on nonparametrically estimated selection probabilities

Dealing with missing observations in the outcome and covariates in randomized controlled trials

Bayesian second-order sensitivity of longitudinal inferences to non-ignorability: an application to antidepressant clinical trial data.

Mallows model averaging based on kernel regression imputation with responses missing at random

A modified Nadaraya–Watson procedure for variable selection and nonparametric prediction with missing data

Imputation of missing values in residential building monitored data: Energy consumption, behavior, and environment information

A Machine Learning-Based Multiple Imputation Method for the Health and Aging Brain Study–Health Disparities

Deep Learning Methods for Omics Data Imputation.

Assessable and interpretable sensitivity analysis in the pattern graph framework for nonignorable missingness mechanisms.

