Over the last few years, substantial research has gone into developing abnormality detection (AD) techniques that balance efficiency and accuracy while handling high-dimensional data and distributed environments. Researchers increasingly deal with “abnormalities” in clinical patient data to derive relevant clinical knowledge for making informed decisions. However, data collection for clinically relevant research is often guided by patient conditions and administrative or clinical requirements rather than a regular schedule. As a result, clinical data is frequently obtained in an unreliable form, characterized by outliers, inconsistencies, incomplete information, and an unstructured format that varies with patient type and data structure. In this study, an enhanced hybrid AD strategy based on heuristic and stochastic methods is developed to cope with abnormalities in patients’ clinical data. The proposed strategy first employs optimal k-means clustering as the heuristic method, grouping the clinical data by the patients’ routine exercise characteristics. Next, an interquartile range (IQR)-based stochastic approach is employed as the statistical method to detect and eliminate abnormal data points, so that only reliable data is provided to medical practitioners. The main objective of this article is to help healthcare and research practitioners handle massive, high-dimensional, inconsistent, and incomplete clinical data by detecting and discarding anomalous data points. Furthermore, the AutoML paradigm is employed to develop an optimal regression model for analyzing the impact of the proposed hybrid strategy on abnormal pattern detection. In addition, several statistical error measures are used to evaluate the empirical effectiveness of the proposed hybrid strategy with AutoML.
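The two-stage idea described above (cluster first, then apply an interquartile range filter inside each cluster) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the fixed `k`, the conventional 1.5 fence factor, the "inclusive" quartile method, and the toy data are all assumptions, and the abstract does not say how the "optimal" number of clusters is chosen.

```python
import random
import statistics


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Hypothetical helper: k is fixed here, whereas the study selects an
    "optimal" k by a procedure the abstract does not describe.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of each cluster (keep the old centroid if empty).
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def iqr_bounds(values, factor=1.5):
    """Return (lower, upper) fences; factor=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


def keep_indices(features, values, k=2):
    """Cluster on `features`, then drop per-cluster IQR outliers in `values`."""
    labels = kmeans(features, k)
    kept = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if len(idx) < 4:
            kept.extend(idx)  # too few points to estimate quartiles reliably
            continue
        lo, hi = iqr_bounds([values[i] for i in idx])
        kept.extend(i for i in idx if lo <= values[i] <= hi)
    return sorted(kept)


# Toy data (made up): one exercise feature and one measured indicator.
features = [(0.8,), (0.9,), (1.0,), (1.1,), (1.2,), (1.3,),
            (9.8,), (9.9,), (10.0,), (10.1,), (10.2,), (10.3,)]
values = [5.0, 5.1, 4.9, 5.2, 4.8, 50.0,
          20.0, 20.1, 19.9, 20.2, 19.8, 20.3]
kept = keep_indices(features, values, k=2)
```

Filtering within clusters rather than globally matters in this toy case: a value of 20 is normal for the second group but would look extreme if the fences were computed from the first group alone.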
The experimental results show a noteworthy improvement in R² score for predicting healthcare indicators compared with existing state-of-the-art regression models. Our optimal regression model achieved R² scores of 0.9855 and 0.9850 for predicting the Borg RPE and TUG, respectively. It also achieved a low prediction error in terms of mean absolute percentage error (MAPE) for both functional health indicators: 6.57% for Borg RPE and 5.19% for TUG. These results indicate that AutoML outperforms traditional regression models when our proposed hybrid abnormality detection model is applied to the patients’ rehabilitation data to handle anomalous data.
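For readers unfamiliar with the two reported measures, the sketch below computes the R² score and MAPE from their standard definitions. The function names and the sample numbers are illustrative assumptions and are unrelated to the study's data or results.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up observations and predictions, for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [11.0, 19.0, 33.0, 38.0]
r2 = r2_score(y_true, y_pred)   # about 0.97
err = mape(y_true, y_pred)      # about 7.5 (percent)
```

An R² close to 1 means the model explains most of the variance in the target, while MAPE expresses the average error relative to the true value, which makes indicators on different scales easier to compare.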