Abstract

Clinical pathways are chronological series of events that occur during a patient's treatment. They can be extracted from Electronic Health Record data and correlated with healthcare outcomes. This approach applies to a wide variety of diseases and can pinpoint pathways associated with bad outcomes. These pathways can be audited, and patients who start to follow such patterns can be placed under special observation and care. Tuberculosis (TB) is one of the leading causes of death from infectious disease, and its control relies on case finding, accurate and early identification, and treatment. Clinical pathway analysis can support disease control and the early identification of bad outcomes in ongoing treatments. The goals of the current study are therefore to: 1) identify the existing clinical pathways; 2) group these pathways using hierarchical clustering; and 3) create a classification model based on the generated clusters to predict bad outcomes. The dataset consisted of 277,870 TB treatment cases from the state of São Paulo collected through TBWEB, an information system for monitoring and follow-up of TB cases. All cases with ongoing treatment were excluded from the study, and the resulting dataset was split into training and test samples. To reduce bias due to class imbalance, undersampling was applied to the training dataset, resulting in a final sample size of 90,184. The test dataset contained 52,639 cases. Both datasets had 16 attributes describing the patient's diagnosis and the evolution of the drug scheme over the course of treatment. The unique values of every attribute were mapped, and a representation character was assigned to each one. These characters were then concatenated in the chronological order of the events and diagnoses, creating a string representation of the clinical pathway.
The pathways from the training dataset were used to build the clusters, which in turn were used to build a classifier that predicts the treatment outcome from the clinical pathways in the test dataset. The final model achieved an overall accuracy of 0.829. The model showed a significant improvement in accuracy over previous studies and performed similarly to or better than other models in the literature. We believe this model can be integrated into an information system to further improve treatment management and tuberculosis control.
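
The string encoding of pathways described above can be illustrated with a minimal sketch. The event names, the symbol alphabet, and the helper functions below are hypothetical and are not taken from TBWEB or from the study. The abstract also does not state which dissimilarity measure feeds the hierarchical clustering; edit (Levenshtein) distance is used here only as one plausible assumption for comparing pathway strings.

```python
import string


def build_symbol_map(pathways):
    """Assign one character to each distinct event value, in first-seen order."""
    alphabet = string.ascii_uppercase + string.ascii_lowercase + string.digits
    symbol_map = {}
    for events in pathways:
        for event in events:
            if event not in symbol_map:
                if len(symbol_map) >= len(alphabet):
                    raise ValueError("more distinct events than available symbols")
                symbol_map[event] = alphabet[len(symbol_map)]
    return symbol_map


def encode_pathway(events, symbol_map):
    """Concatenate the event symbols in chronological order."""
    return "".join(symbol_map[event] for event in events)


def edit_distance(a, b):
    """Levenshtein distance between two pathway strings (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical pathways: each list is already in chronological order.
pathways = [
    ["pulmonary_tb", "scheme_RHZE", "scheme_RH"],
    ["pulmonary_tb", "scheme_RHZE", "scheme_change", "scheme_RH"],
    ["extrapulmonary_tb", "scheme_RHZE"],
]

symbol_map = build_symbol_map(pathways)
encoded = [encode_pathway(p, symbol_map) for p in pathways]

# Pairwise dissimilarities that a hierarchical clustering step could consume.
distances = [[edit_distance(x, y) for y in encoded] for x in encoded]
```

A pairwise distance matrix of this kind is the usual input to agglomerative clustering routines; the resulting cluster labels could then serve as features or classes for an outcome classifier, as the abstract outlines.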
