Model-based Imputation Research Articles

The diagnostic process for Autism Spectrum Disorder (ASD) typically involves time-consuming assessments conducted by specialized physicians. To improve the efficiency of ASD screening, intelligent solutions based on machine learning (ML) have been proposed in the literature. However, many existing ML models do not incorporate medical tests and demographic features, which could improve detection when the affected features are identified through traditional feature selection approaches. This study addresses that limitation using a real dataset containing 45 features and 983 patients. A two-phase methodology is employed. The first phase is data preparation: handling missing data through model-based imputation, normalizing the dataset using the Min-Max method, and selecting relevant features using traditional feature selection approaches based on affected features. In the second phase, seven ML classification techniques recommended by the literature are used to develop ML models: Decision Trees (DT), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), AdaBoost, Gradient Boosting (GB), and Neural Network (NN). These models are trained and tested on the prepared dataset to evaluate their performance in detecting ASD. Performance is assessed using Accuracy, Recall, Precision, F1-score, AUC, Train time, and Test time. These metrics provide insight into the models' overall accuracy, sensitivity, specificity, and the trade-off between true positive and false positive rates. The results highlight the effectiveness of traditional feature selection approaches based on affected features. Specifically, the GB model outperforms the other models with an accuracy of 87%, Recall of 87%, Precision of 86%, F1-score of 86%, AUC of 95%, Train time of 21.890, and Test time of 0.173. Additionally, a benchmarking analysis against five other studies reveals that the proposed methodology achieves a perfect score across three key areas. By considering affected features through traditional feature selection approaches, the developed ML models demonstrate improved performance and have the potential to enhance ASD screening and diagnosis processes.
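The first phase described in this abstract (model-based imputation followed by Min-Max normalization) can be sketched on toy data. This is only an illustration under stated assumptions: the abstract does not say which imputation model the authors used, so a one-predictor least-squares regression stands in for it here, and the names `age`, `score`, `fit_line`, `impute_with_regression`, and `min_max_scale` are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def impute_with_regression(xs, ys):
    """Replace None entries of ys with predictions from a line
    fitted on the rows where ys is observed (assumed imputation model)."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [a + b * x if y is None else y for x, y in zip(xs, ys)]

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Toy data (not from the study): one fully observed predictor,
# one feature with a missing entry.
age = [2.0, 3.0, 4.0, 5.0, 6.0]
score = [10.0, 15.0, None, 25.0, 30.0]

filled = impute_with_regression(age, score)  # missing score predicted from age
scaled = min_max_scale(filled)               # feature now lies in [0, 1]
print(filled)
print(scaled)
```

In a real pipeline the imputation model and the scaling bounds would normally be fitted on the training split only and then applied to the test split, so that no information leaks from the test data.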

Handling missing values in data analysis is a focus of attention in many research fields, and imputation is a commonly used method for addressing the problem. This systematic literature review presents a comprehensive summary of the scientific literature describing the use of imputation methods to handle missing values. The literature search was carried out using various academic databases and reliable sources of information, with relevant keywords used to find articles matching the research question. After selection and evaluation, 40 relevant articles were included in this study. The findings reveal a variety of imputation approaches used across research fields such as social sciences, medicine, and economics. Commonly used methods include single imputation, multivariate imputation, and model-based imputation. In addition, several studies describe combinations of imputation methods for more complex situations. The advantage of imputation is that it allows researchers to maintain sample sizes and minimize bias in data analysis. However, the results also show that imputation must be applied with caution, because inappropriate imputation decisions can lead to biased results and affect the accuracy of research conclusions. To increase the validity and reliability of research results, researchers are expected to report the imputation method used transparently and to describe the considerations behind their imputation decisions. This systematic literature review provides an in-depth view of the use of imputation methods in handling missing values. In the face of missing data, an understanding of the various imputation methods and the contexts in which they are applied will be key to generating meaningful findings in various research fields.
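The caution about inappropriate imputation decisions can be illustrated with a small, hypothetical example that is not drawn from the reviewed articles: filling every gap with the observed mean (a form of single imputation) keeps the sample size and the mean, but it shrinks the variance, which can bias any downstream statistic that depends on spread. The values below are invented for illustration.

```python
from statistics import mean, pvariance

# Hypothetical observed values; assume four further values are missing.
observed = [4.0, 6.0, 8.0, 10.0]
n_missing = 4

# Single (mean) imputation: every gap receives the same observed mean.
fill_value = mean(observed)
completed = observed + [fill_value] * n_missing

var_observed = pvariance(observed)    # spread of the values actually seen
var_completed = pvariance(completed)  # spread after mean imputation

print(mean(observed), mean(completed))
print(var_observed, var_completed)
```

The mean is unchanged, while the variance of the completed data is smaller than that of the observed values. This is one reason reviews like this one ask authors to report which imputation method was used and why.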

Articles published on Model-based Imputation

Two-stage strategy using denoising autoencoders for robust reference-free genotype imputation with missing input genotypes.

Filling data gaps in long-term solar UV monitoring by statistical imputation methods

In-Database Data Imputation

Abstract TMP95: Context Matters: Geographic Decision-Making Could Improve Prehospital Stroke Triage

Unlocking the Potential of Autism Detection: Integrating Traditional Feature Selection and Machine Learning Techniques

Risk Prediction and Machine Learning: A Case-Based Overview.

A Systematic Literature Review On Missing Values: Research Trends, Datasets, Methods and Frameworks

Implementing Multiple Imputation for Missing Data in Longitudinal Studies When Models are Not Feasible: An Example Using the Random Hot Deck Approach.

Data imputation and comparison of custom ensemble models with existing libraries like XGBoost, CATBoost, AdaBoost and Scikit learn for predictive equipment failure

ScMTD: a statistical multidimensional imputation method for single-cell RNA-seq data leveraging transcriptome dynamic information

Multiple imputation of ordinal missing not at random data

An Adaptive Classification Model for Predicting Epileptic Seizures Using Cloud Computing Service Architecture

Combining farm and household surveys with modelling approaches to improve post-harvest loss estimates and reduce data collection costs

Principal Canonical Correlation Analysis with Missing Data in Small Samples

Compatibility in imputation specification.

Robust estimation of traffic density with missing data using an adaptive-R extended Kalman filter

Attribute Sentiment Scoring with Online Text Reviews: Accounting for Language Structure and Missing Attributes

Exploring Item Bank Stability through Live and Simulated Datasets

DataSifter II: Partially synthetic data sharing of sensitive information containing time-varying correlated observations.

Foundations of Machine Learning-Based Clinical Prediction Modeling: Part III-Model Evaluation and Other Points of Significance.
