Electronic medical records (EMRs) contain large amounts of detailed clinical information. Identifying conditions through medical record review across large numbers of EMRs can be time-consuming and inefficient. EMR-based phenotyping using machine learning and natural language processing algorithms is a continually developing area of study that holds potential for case identification across numerous mental health disorders. This review evaluates the current state of EMR-based case identification for depression and provides guidance on using current algorithms and constructing new ones. A scoping review of EMR-based algorithms for phenotyping depression was completed, covering studies published from January 2000 to May 2023. The search covered 3 databases (Embase, MEDLINE, and APA PsycInfo) and used selected keywords in 3 categories: terms related to EMRs, terms related to case identification, and terms related to depression. This study adhered to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. A total of 20 papers were assessed and summarized in the review. Most of these studies were undertaken in the United States (15/20, 75%), followed by the United Kingdom (3/20, 15%) and Spain (2/20, 10%). Both data-driven and clinical rule-based methodologies were identified. The development of EMR-based phenotypes and algorithms reflected the data accessibility permitted by each health system, which led to varying performance across algorithms. Better use of structured and unstructured EMR components through techniques such as machine learning and natural language processing has the potential to improve depression phenotyping. However, more validation is needed before depression case identification algorithms in general can be used with confidence.