Abstract

Electronic health records (EHRs) have recently been identified as a potentially valuable source for monitoring adverse drug events (ADEs). However, ADEs are heavily under-reported in EHRs. Using machine learning algorithms to automatically detect patients who should have had ADEs reported in their health records is an efficient and effective solution. One challenge to that end is how to account for temporality when clinical events, which are time-stamped in EHRs, are used as features for machine learning algorithms. Previous research on this topic suggests that representing EHR data as a bag of temporally weighted clinical events is promising; however, how to assign the weights optimally remains unexplored. In this study, nine different temporal weighting strategies are proposed and evaluated using data extracted from a Swedish EHR database, and the predictive performance of models constructed with the random forest learning algorithm is compared across strategies. Moreover, variable importance is analyzed to obtain a deeper understanding of why a certain weighting strategy is favored over another, as well as which clinical events undergo the largest changes in importance under the various weighting strategies. The results show that the choice of weighting strategy has a significant impact on the predictive performance for ADE detection, and that the best choice of weighting strategy depends on the target ADE and, specifically, on its dose-dependency.
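
To make the idea of a bag of temporally weighted clinical events concrete, the following is a minimal Python sketch. It is not the study's implementation: the abstract does not specify the nine weighting strategies, so the three weight functions shown (`uniform`, `reciprocal`, `linear_decay`), the event codes, the 90-day window, and the function name `weighted_bag` are all hypothetical choices made for illustration only.

```python
from collections import defaultdict


def weighted_bag(events, weight_fn):
    """Build a bag-of-events feature dict for one patient.

    events: list of (event_code, days_before_target) pairs, where
            days_before_target is how long before the prediction point
            the event was recorded (assumed representation).
    weight_fn: maps days_before_target to a non-negative weight.
    Returns {event_code: sum of weights over its occurrences}.
    """
    bag = defaultdict(float)
    for code, days in events:
        bag[code] += weight_fn(days)
    return dict(bag)


# Hypothetical weighting strategies (illustrative, not from the paper).
def uniform(days):
    # Ignores time entirely: reduces to plain event counts.
    return 1.0


def reciprocal(days):
    # More recent events receive larger weights.
    return 1.0 / (1.0 + days)


def linear_decay(days, window=90):
    # Weight falls linearly to zero at the edge of an assumed window.
    return max(0.0, 1.0 - days / window)


# Made-up events for one patient: (code, days before the target date).
events = [("drug_A", 0), ("drug_A", 1), ("diag_B", 3)]

counts = weighted_bag(events, uniform)
recent = weighted_bag(events, reciprocal)
```

Each patient's dictionary can then be turned into one row of a feature matrix (one column per event code) and passed to a classifier such as a random forest. Swapping `weight_fn` is all that is needed to compare strategies on the same data.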
