Natural Language Processing Methods Research Articles

Background: Suicide is a leading cause of death worldwide, making early identification of suicidal behaviors crucial for clinicians. Current Natural Language Processing (NLP) approaches for identifying suicidal behaviors in Electronic Health Records (EHRs) rely on keyword searches, rule-based methods, and binary classification, which may not fully capture the complexity and spectrum of suicidal behaviors. This study aims to create a multi-class labeled dataset with annotation guidelines and to develop a novel NLP approach for fine-grained, multi-label classification of suicidal behaviors, improving both the efficiency of the annotation process and the accuracy of the NLP methods. Methods: We develop a multi-class labeling system based on guidelines from the FDA, CDC, and WHO, distinguishing between six categories of suicidal behaviors and allowing multiple labels per data sample. To create an annotated dataset efficiently, we use an MPNet-based semantic retrieval framework to extract relevant sentences from a large EHR dataset, reducing the annotation space while capturing diverse expressions. Experts annotate the extracted sentences using the multi-class system. We then formulate the task as a multi-label classification problem and fine-tune transformer-based models on the curated dataset to classify suicidal behaviors in EHRs. Results: Lexical analysis revealed key themes in assessing suicide risk, including an individual’s history, mental health, substance use, and family background. Fine-tuned transformer-based models effectively identified suicidal behaviors from EHRs, with Bio_ClinicalBERT, BioBERT, and XLNet achieving F1 scores of 0.81, outperforming BERT and RoBERTa. The proposed approach, using task-specific NLP models and a multi-label classification system, captures the complexity of suicidal behaviors more effectively than traditional binary classification, particularly for “Suicide Attempt” and “Family History” instances. However, direct comparisons with existing studies are difficult due to varying metrics and label definitions. Conclusion: This study presents a robust NLP framework for detecting suicidal behaviors in EHRs, leveraging task-specific fine-tuning of transformer-based models and a semi-automated pipeline. Despite limitations, the approach demonstrates the potential of advanced NLP techniques in enhancing the identification of suicidal behaviors. Future work should focus on model expansion and integration to further improve patient care and clinical decision-making.
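The multi-label formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names `predict_labels` and `micro_f1`, the 0.5 threshold, and the toy logits are all assumptions made for illustration, and only the two categories the abstract names are used (the other four are not listed in the text). The abstract also does not say how its F1 score is averaged, so micro-averaging here is just one common choice.

```python
import math

# Only two of the six categories are named in the abstract; the others are
# left out rather than guessed.
LABELS = ["Suicide Attempt", "Family History"]


def sigmoid(x):
    """Map a raw model score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Apply an independent sigmoid per label (hypothetical helper).

    Unlike a softmax over classes, each label is decided on its own, so a
    sentence can receive zero, one, or several labels.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def micro_f1(gold, pred):
    """Micro-averaged F1 over all (sentence, label) decisions."""
    tp = sum(len(set(g) & set(p)) for g, p in zip(gold, pred))
    fp = sum(len(set(p) - set(g)) for g, p in zip(gold, pred))
    fn = sum(len(set(g) - set(p)) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy gold annotations and predictions for three sentences (made-up data).
gold = [["Suicide Attempt"], ["Suicide Attempt", "Family History"], []]
pred = [["Suicide Attempt"], ["Family History"], ["Family History"]]
score = micro_f1(gold, pred)
```

In a fine-tuned transformer, the logits would come from a classification head with one output per category; the thresholding step is what distinguishes multi-label from single-label multi-class prediction.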


Introduction: Accurately identifying and characterizing patients with hypertrophic cardiomyopathy (HCM) is critical for population management and care optimization. Research Question: To develop natural language processing (NLP) algorithms to identify and characterize obstructive (oHCM) and non-obstructive (nHCM) HCM patients directly from echocardiograms, and to compare the results with the presence or absence of HCM-related diagnosis codes. Methods: We developed and validated NLP algorithms to identify HCM from all adult (age ≥18 years) echocardiograms performed from 2010 to 2019 in Kaiser Permanente Northern California (KPNC), capturing measures of any HCM, HCM subtype, hypertrophy subtype, septal and posterior LV wall thickness, resting and stress/Valsalva LVOT gradients, and systolic anterior motion. We developed a rules-based algorithm (following AHA/ACC criteria) to classify patients as having HCM, including oHCM or nHCM subtypes, and possible HCM (defined as wall thickness ≥2 cm without other criteria meeting an HCM definition). We evaluated the presence of HCM-related ICD-9/10 diagnosis codes among patients classified as HCM or non-HCM from echocardiograms using NLP, and linked baseline demographics and clinical parameters from our integrated electronic medical record. Results: Among 472,405 adults with echocardiograms, we identified 2,892 patients with HCM based on NLP-derived measures (all NLP measures achieved >95% positive predictive value and >95% negative predictive value), including 1,585 (55%) with oHCM, 1,145 (40%) with nHCM, and 162 (6%) who could not be classified (Figure). Among those 2,892 patients, 1,283 did not have any associated HCM ICD-9/10 diagnosis codes (Table). Among 469,513 patients with no HCM identified by the NLP-based algorithms, HCM ICD-9/10 diagnosis codes existed in 1,567 patients (Table). We also identified 4,593 patients with possible HCM by NLP, only 4.5% of whom had an associated HCM code. Among patients with NLP-confirmed HCM, oHCM patients were slightly older (66 vs 61 years), more likely to be female (53% vs 43%), and had similar mean septal wall thickness (1.7 cm vs 1.7 cm), but were more likely to have a septal hypertrophy subtype (46% vs 28%) than nHCM patients. Conclusions: Echocardiogram-based NLP methods can improve the identification of and care for HCM patients. Many patients with possible HCM may be underdiagnosed, representing an opportunity for quality improvement.
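The counts reported in the Results can be arranged into a two-by-two table comparing NLP-based classification with diagnosis codes. The Python sketch below is a worked arithmetic check only, using the numbers stated in the abstract; the variable names are hypothetical, and treating the NLP classification as the reference standard is an assumption made for illustration, not a claim from the abstract. Patients with possible HCM are not broken out here, since the abstract does not say how they relate to the 469,513 non-HCM group.

```python
# Counts taken directly from the abstract.
total_adults = 472_405
nlp_hcm = 2_892                   # HCM identified by NLP
nlp_hcm_without_code = 1_283      # NLP HCM with no HCM ICD-9/10 code
nlp_non_hcm = 469_513             # no HCM identified by NLP
nlp_non_hcm_with_code = 1_567     # HCM ICD-9/10 code but no NLP-identified HCM

# Derive the remaining cells of the 2x2 table.
table = {
    "nlp_pos_code_pos": nlp_hcm - nlp_hcm_without_code,
    "nlp_pos_code_neg": nlp_hcm_without_code,
    "nlp_neg_code_pos": nlp_non_hcm_with_code,
    "nlp_neg_code_neg": nlp_non_hcm - nlp_non_hcm_with_code,
}

# Assumption: NLP classification is treated as the reference standard.
# Share of NLP-identified HCM patients who also carry a diagnosis code.
code_sensitivity = table["nlp_pos_code_pos"] / nlp_hcm

# Share of patients with a diagnosis code who are also NLP-identified HCM.
code_ppv = table["nlp_pos_code_pos"] / (
    table["nlp_pos_code_pos"] + table["nlp_neg_code_pos"]
)

print(f"Coded among NLP-identified HCM: {code_sensitivity:.1%}")
print(f"NLP-identified HCM among coded: {code_ppv:.1%}")
```

Under that assumption, roughly 56% of NLP-identified HCM patients have an HCM diagnosis code, and about half of the patients with an HCM code are identified as HCM by the NLP algorithms.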


Articles published on Natural Language Processing Methods

Intelligent Identification of Cryptographic Ciphers using Machine Learning Techniques

Second language learning of degree expressions: A computational approach

A Framework for Applying Machine Learning and Natural Language Processing Methods to Accounting Recognition

Enhancing suicidal behavior detection in EHRs: A multi-label NLP framework with transformer models and semantic retrieval-based annotation

Analysis of Public Sentiment Regarding the Issue of Cancelling the Revision of the 2024 Regional Election Law with NLP

Harnessing LLMs for Financial Forecasting: A Systematic Review of Advances in Stock Market Prediction and Portfolio Optimization

A Bibliometric Review of Natural Language Processing Applications in Psychology from 1991 to 2023

Scalable Transformer Accelerator with Variable Systolic Array for Multiple Models in Voice Assistant Applications

LCD benchmark: long clinical document benchmark on mortality prediction for language models.

Multitextuality in RPGs: a ludonarrative synergy model for video game text analysis

Perceived unmet needs and impact on quality of life of patients living with advanced bladder cancer and their caregivers: results of a social media listening study conducted in five European countries

Artificial intelligence-enabled social media listening to inform early patient-focused drug development: perspectives on approaches and strategies.

Enhancing zero-shot relation extraction with a dual contrastive learning framework and a cross-attention module

Decoding the Digital Pulse: Bibliometric Analysis of 25 Years in Digital Health Research Through the Journal of Medical Internet Research.

Getting into bed with embeddings? A comparison of collocations and word embeddings for corpus-assisted discourse analysis

Abstract 4140118: Identification of Obstructive and Non-Obstructive Hypertrophic Cardiomyopathy Patients Using Natural Language Processing in a Large Integrated Healthcare System

Protein-Protein Interaction Networks Derived from Classical and Machine Learning-Based Natural Language Processing Tools.

Will Public Health Emergencies Affect Compensatory Consumption Behavior? Evidence from Emotional Eating Perspective.

CommSense: A Wearable Sensing Computational Framework for Evaluating Patient-Clinician Interactions

Screening for Depression Using Natural Language Processing: Literature Review.
