Abstract

Distinguishing migraine from stroke is challenging because the two conditions share many signs and symptoms. Hospitalization is costly, and neurologists and stroke nurses spend considerable time visiting, diagnosing, and assigning appropriate care to patients; new ways to distinguish stroke, migraine, and other types of mimics can therefore save time and cost and improve decision-making. In this study, we used text and data mining methods to extract the most important predictors from clinical reports in order to build a migraine detection model that distinguishes migraine patients from stroke or other mimic (non-stroke) cases. The available data were a heterogeneous mix of free-text fields, such as triage main complaints and specialist final impressions, and numeric patient data, such as age and blood pressure. Carefully combining these sources yielded a highly imbalanced dataset in which migraine cases made up only about 6% of the records. Our main challenge was tackling this imbalance: classifiers built on the dataset in its original form were biased towards the majority class and against the minority (migraine) class. We used a sampling method to address the problem. First, the different data sources were preprocessed and balanced datasets were generated; second, attribute selection algorithms were used to reduce the dimensionality of the data; third, a novel combination of data mining algorithms was employed to distinguish migraine from other cases. We achieved a sensitivity of about 80% and a specificity of about 75%, compared with a sensitivity of 15.7% and a specificity of 97% for classifiers built on the original imbalanced data.
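
The effect of class imbalance on sensitivity and specificity can be illustrated with a minimal sketch. The abstract does not say which sampling method was used, so the random undersampling below is an assumption, and the function names `undersample` and `sensitivity_specificity` and the toy data are hypothetical rather than taken from the study.

```python
import random


def undersample(rows, labels, minority=1, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumption: random undersampling stands in for the unspecified
    sampling method mentioned in the abstract.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    n_keep = min(len(minority_idx), len(majority_idx))
    kept = rng.sample(majority_idx, n_keep)
    idx = sorted(minority_idx + kept)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy data: 6 migraine cases (label 1) among 100 records,
# matching the roughly 6% minority share reported in the abstract.
labels = [1] * 6 + [0] * 94
rows = list(range(100))

# Balanced training set: all minority rows plus an equal-sized random
# subset of the majority rows.
bal_rows, bal_labels = undersample(rows, labels)

# A degenerate classifier that always predicts the majority class shows
# the bias: specificity looks perfect while no migraine case is found.
always_majority = [0] * len(labels)
sens, spec = sensitivity_specificity(labels, always_majority)
```

On data this skewed, a model can reach high specificity simply by favoring the majority class, which is the pattern the reported 15.7% sensitivity and 97% specificity reflect; balancing the training data trades some specificity for a large gain in sensitivity.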
