Abstract

This study presents a comparative assessment of supervised machine learning (ML) methods for capturing outcomes and patient characteristics from electronic health records (EHRs). We explored automatic classification of free-text EHR data to support a value-based program, a computational problem that combines information extraction and automatic text classification. We identified the essential tasks to be considered in a stroke value-based program. The 30 selected tasks (manually labeled by specialists) were classified according to the value agenda: Tier 1 (healthcare status achieved), Tier 2 (recovery process), care-related (clinical management and risk scores), and baseline characteristics. We used 44,206 sentences from free-text medical records in Portuguese to train and develop 11 distinct supervised ML methods, alongside a set of ontological rules that we created. As the experimental protocol, we used 5-fold cross-validation repeated 6 times with subject-wise sampling. A heatmap was used to compare the results according to performance (F1-score), supported by statistical significance tests. The best models were Support Vector Machines trained with lexical and semantic textual features. The models statistically covered a total of 20 tasks (67%) with F1-score > 80, spanning care-related tasks (treatment location, fall risk, thrombolytic therapy, pressure ulcer risk), recovery process (ability to feed orally, ambulate, and communicate), status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, smoking status). Ontological rules were also effective for tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementary performance of the different approaches suggests that a combination of models could improve the results and cover more tasks.
Our experiment identifies algorithms that classify outcomes and clinical characteristics of patients who suffered a stroke, with performance reliable enough to implement real-time outcome measurement. The opportunity to establish EHR-based models in Portuguese represents a contribution to the value element that emphasizes the importance of advancing technological and informational capacity.
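
The evaluation protocol described above (5-fold cross-validation repeated 6 times, with subject-wise sampling and F1-score as the metric) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the `subject_ids` list, the round-robin fold assignment, and the binary form of the F1-score are all assumptions made for illustration. The point it shows is that every sentence from a given patient lands on the same side of each train/test split, so no patient contributes to both.

```python
import random
from collections import defaultdict


def subject_wise_repeated_kfold(subject_ids, n_splits=5, n_repeats=6, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping each subject's rows together.

    subject_ids[i] is the (hypothetical) patient identifier of sentence i.
    """
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)
    subjects = sorted(by_subject)
    if len(subjects) < n_splits:
        raise ValueError("need at least as many subjects as folds")

    rng = random.Random(seed)
    for repeat in range(n_repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)  # a fresh subject ordering for every repetition
        # Deal subjects round-robin so fold sizes differ by at most one subject.
        folds = [shuffled[k::n_splits] for k in range(n_splits)]
        for fold, test_subjects in enumerate(folds):
            held_out = set(test_subjects)
            test_idx = [i for s in test_subjects for i in by_subject[s]]
            train_idx = [i for s in subjects if s not in held_out
                         for i in by_subject[s]]
            yield repeat, fold, train_idx, test_idx


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive label; returns 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: ten sentences belonging to six made-up patients.
    toy_subjects = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]
    splits = list(subject_wise_repeated_kfold(toy_subjects))
    print(len(splits), "train/test splits")
```

With the default settings the generator yields 5 x 6 = 30 splits. In practice a classifier would be fitted on `train_idx` and scored with `f1_score` on `test_idx` for each split, and the 30 scores per task would feed the significance tests mentioned above.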
