Autism spectrum disorder (ASD) is a complex neurodevelopmental condition influenced by various genetic and environmental factors. Currently, there is no definitive clinical test, such as a blood analysis or brain scan, for early diagnosis. The objective of this study is to develop a computational model that predicts ASD driver genes from genomic data, with the aim of supporting earlier diagnosis and intervention. A benchmark genomic dataset was processed with feature extraction techniques to identify relevant genetic patterns. Several ensemble classification methods, including Extreme Gradient Boosting, Random Forest, Light Gradient Boosting Machine, ExtraTrees, and a stacked ensemble of classifiers, were applied to assess the predictive power of the genomic features. The Ensemble Model Predictor for Autism Spectrum Disorder (eNSMBL-PASD) was validated using multiple performance metrics: accuracy, sensitivity, specificity, and the Matthews correlation coefficient. The proposed model performed well across all validation techniques. The self-consistency test achieved 100% accuracy, while the independent set and cross-validation tests yielded 91% and 87% accuracy, respectively. These results indicate that the model is robust and reliable in predicting ASD-related genes. The eNSMBL-PASD model is a promising tool for the early detection of ASD through the identification of genetic markers associated with the disorder. In the future, it has the potential to assist healthcare professionals, particularly doctors and psychologists, in diagnosing ASD and formulating treatment plans at its earliest stages.
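For readers unfamiliar with the four metrics named above, the following minimal Python sketch shows how each is computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the study and do not reproduce its results.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and MCC from confusion counts.

    tp/tn/fp/fn are the true-positive, true-negative, false-positive, and
    false-negative counts of a binary classifier (e.g. driver vs. non-driver gene).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Sensitivity (recall): fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.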