Background: Hospital readmission is an important indicator of inpatient care quality and a significant driver of rising medical costs. It is therefore important to examine whether postdischarge information, particularly from home healthcare notes, can improve readmission prediction models. Although natural language processing (NLP) and machine learning are widely used to develop prediction models, current studies often overlook insights from home healthcare notes.

Objective: This study aimed to develop prediction models for 30-day readmission using home healthcare notes and structured data. In addition, it explored the development of 14- and 180-day prediction models using the variables in the 30-day model.

Design: A retrospective observational cohort study.

Setting: This study was conducted at Ajou University School of Medicine in South Korea.

Participants: Electronic health record data for 1819 participants were used, comprising demographic characteristics and information on conditions, drugs, and home healthcare.

Methods: Two models were developed for each prediction window (30, 14, and 180 days): the traditional model, which used structured variables alone, and the common data model (CDM)-NLP model, which combined structured variables with topic variables extracted from home healthcare notes. BERTopic was used to generate topics and risk probabilities, which represent the likelihood of a document being assigned to a specific topic. Several algorithms were tested for feature selection, and the best-performing algorithm, determined by the area under the receiver operating characteristic curve (AUROC), was used for model development. Model performance was assessed using several metrics, including AUROC.

Results: Among 1819 patients, 251 (13.80%) experienced 30-day readmission. The least absolute shrinkage and selection operator (LASSO) was used for feature selection and model development. Fifteen structured features were used in the traditional model, and five additional topic variables from the home healthcare notes were added in the CDM-NLP model. The AUROC of the traditional model was 0.739 (95% CI: 0.672–0.807). The AUROC of the CDM-NLP model was higher at 0.824 (95% CI: 0.768–0.880), indicating outstanding performance. The topics in the CDM-NLP model were emotional distress, daily living functions, nutrition, postoperative status, and cardiorespiratory issues. In the extended models for 14- and 180-day readmission, the CDM-NLP model consistently outperformed the traditional model.

Conclusions: This study developed effective prediction models using both structured and unstructured data, underscoring the value of postdischarge information from home healthcare notes in readmission prediction.
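The abstract reports AUROC values with 95% confidence intervals but does not say how the intervals were obtained. As an illustration only, the sketch below computes AUROC from its rank-based definition and estimates an interval with a percentile bootstrap. The bootstrap is an assumption here, not necessarily the authors' method. The function names `auroc` and `bootstrap_ci` are hypothetical, and the labels and scores are synthetic, not study data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic example: roughly 14% positives, with scores shifted upward
# for the positive class so the model has some discriminative signal.
demo_rng = random.Random(42)
demo_labels = [1 if demo_rng.random() < 0.14 else 0 for _ in range(200)]
if len(set(demo_labels)) < 2:  # guard so the demo always has both classes
    demo_labels[0], demo_labels[1] = 0, 1
demo_scores = [demo_rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in demo_labels]

point = auroc(demo_labels, demo_scores)
low, high = bootstrap_ci(demo_labels, demo_scores, n_boot=500)
print(f"AUROC = {point:.3f} (95% CI: {low:.3f}-{high:.3f})")
```

Comparing two models in this way, as the abstract does for the traditional and CDM-NLP models, would mean running the same procedure on each model's predicted probabilities for the same patients.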