Stroke remains a leading cause of death and disability worldwide, and African populations bear a disproportionately high burden due to limited healthcare infrastructure. Early prediction and intervention are critical to improving stroke outcomes. This study developed and evaluated a stroke prediction system based on Gated Recurrent Units (GRU), a variant of Recurrent Neural Networks (RNN), using secondary data from the Afrocentric Stroke Investigative Research and Education Network (SIREN) dataset, which comprises 4236 records with 29 phenotypes. Feature selection reduced these to 15 optimal phenotypes based on their significance to stroke occurrence. The GRU model, designed with 128 input neurons and four hidden layers (64, 32, 16, and 8 neurons), was trained for 150 epochs with a batch size of 8 and evaluated on accuracy, AUC, and prediction time. It was compared with traditional machine learning algorithms (Logistic Regression, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN)) and with Long Short-Term Memory (LSTM) networks. The GRU-based system achieved an accuracy of 77.48%, an AUC of 0.84, and a prediction time of 0.43 seconds, outperforming all other models. Logistic Regression achieved an accuracy of 73.58%, while LSTM reached 74.88% but with a longer prediction time of 2.23 seconds. Feature selection significantly improved the model's performance compared with using all 29 phenotypes. The GRU-based system thus demonstrated superior performance in stroke prediction, offering an efficient and scalable tool for healthcare. Future research should focus on integrating unstructured data, validating the model on diverse populations, and exploring hybrid architectures to enhance predictive accuracy.
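To make the architecture described above concrete, the following is a minimal, untrained sketch of a stacked GRU forward pass using only the Python standard library. It is not the authors' implementation. Several points are assumptions made purely for illustration: the "128 input neurons" are read as a first GRU layer of width 128 that feeds the four hidden layers (64, 32, 16, 8); each patient record of 15 selected phenotypes is treated as a sequence of length one; a single sigmoid unit produces the stroke probability; and the gate update uses one common GRU convention. All function and variable names are hypothetical, and the weights are random, so the output carries no clinical meaning.

```python
import math
import random

N_FEATURES = 15                      # phenotypes kept after feature selection
LAYER_SIZES = [128, 64, 32, 16, 8]   # assumed reading of the stated layer widths


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_gru_layer(n_in, n_hidden, rng):
    """One (W, U, b) triple per gate: update 'z', reset 'r', candidate 'n'."""
    return {
        gate: (
            init_matrix(n_hidden, n_in, rng),      # input weights W
            init_matrix(n_hidden, n_hidden, rng),  # recurrent weights U
            [0.0] * n_hidden,                      # bias b
        )
        for gate in ("z", "r", "n")
    }


def gru_step(layer, x, h):
    """Advance one GRU layer by one time step and return the new hidden state."""
    def pre_activation(gate, hidden):
        W, U, b = layer[gate]
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, hidden), b)]

    z = [sigmoid(v) for v in pre_activation("z", h)]    # update gate
    r = [sigmoid(v) for v in pre_activation("r", h)]    # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]         # reset applied to old state
    n = [math.tanh(v) for v in pre_activation("n", reset_h)]  # candidate state
    # Blend old state and candidate (one common convention; others swap z and 1 - z).
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def build_model(seed=0):
    """Create the stacked GRU layers plus a single-unit output head."""
    rng = random.Random(seed)
    sizes = [N_FEATURES] + LAYER_SIZES
    layers = [init_gru_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    head = init_matrix(1, LAYER_SIZES[-1], rng)
    return layers, head


def predict(model, sequence):
    """Run a sequence of feature vectors through every layer; return a probability."""
    layers, head = model
    for layer in layers:
        h = [0.0] * len(layer["z"][2])   # zero initial hidden state
        outputs = []
        for x in sequence:
            h = gru_step(layer, x, h)
            outputs.append(h)
        sequence = outputs               # this layer's states feed the next layer
    return sigmoid(matvec(head, sequence[-1])[0])


if __name__ == "__main__":
    model = build_model(seed=0)
    record = [0.5] * N_FEATURES          # placeholder values, not SIREN data
    print(predict(model, [record]))
```

In practice a model like this would be built and trained with a deep learning framework; the stated settings of 150 epochs and a batch size of 8 apply to that training loop, which the sketch deliberately omits.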