Defensive Dual Masking for Robust Adversarial Defense
Abstract Adversarial defenses for textual data have gained considerable attention in recent years due to the increasing vulnerability of Natural Language Processing (NLP) models to adversarial attacks. These attacks exploit subtle perturbations in input text to deceive models, posing significant challenges to model robustness and reliability. This paper introduces Defensive Dual Masking (DDM), a simple yet effective algorithm that employs two unique masking strategies to mitigate adversarial threats. Specifically, during training, [MASK] tokens are directly inserted into input samples to prepare the model for handling perturbed inputs. At inference time, suspicious tokens are identified and strategically replaced with [MASK] tokens, effectively neutralizing perturbations while preserving the core semantics of the input text. The theoretical foundation of DDM demonstrates how the proposed masking strategies enhance the model's capacity to mitigate adversarial attacks. Empirical evaluations based on four benchmark datasets and four adversarial attacks consistently demonstrate that DDM outperforms state-of-the-art defense techniques, achieving superior robustness and substantial improvements in model accuracy. Furthermore, DDM seamlessly integrates with Large Language Models (LLMs), enhancing their resilience to adversarial attacks and providing a scalable defense solution for large-scale NLP applications.
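The inference-time half of DDM can be sketched as follows. This is a minimal illustration, not the paper's implementation: the suspicion scores are assumed to come from some detector (the abstract does not specify one), and here they are simply supplied as a list.

```python
def mask_suspicious(tokens, scores, k=2, mask_token="[MASK]"):
    """Replace the k most suspicious tokens with [MASK], keeping the rest intact."""
    # Indices of the k highest suspicion scores.
    top = set(sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k])
    return [mask_token if i in top else t for i, t in enumerate(tokens)]

# Hypothetical example: character-level perturbations get high suspicion scores.
tokens = ["the", "fi1m", "was", "terrib1e", "overall"]
scores = [0.1, 0.9, 0.1, 0.8, 0.2]
print(mask_suspicious(tokens, scores))
# -> ['the', '[MASK]', 'was', '[MASK]', 'overall']
```

Because the model was also trained on inputs with randomly inserted [MASK] tokens, masked positions at inference time look in-distribution rather than adversarial.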
- Research Article
- 10.1007/s00330-024-11148-x
- Oct 31, 2024
- European radiology
Medical reports, governed by HIPAA regulations, contain personal health information (PHI), restricting secondary data use. Utilizing natural language processing (NLP) and large language models (LLMs), we sought to employ publicly available methods to automatically anonymize PHI in free-text radiology reports. We compared two publicly available rule-based NLP models (spaCy; NLPac, accuracy-optimized; NLPsp, speed-optimized; iteratively improved on 400 free-text CT-reports (test set)) and one offline LLM approach (LLM-model, LLaMa-2, Meta-AI) for PHI-anonymization. The three models were tested on 100 randomly selected chest CT reports. Two investigators assessed the anonymization of occurring PHI entities and whether clinical information was removed. Subsequently, precision, recall, and F1 scores were calculated. NLPac and NLPsp successfully removed all instances of dates (n = 333), medical record numbers (MRN) (n = 6), and accession numbers (ACC) (n = 92). The LLM model removed all MRNs, 96% of ACCs, and 32% of dates. NLPac was most consistent with a perfect F1-score of 1.00, followed by NLPsp with lower precision (0.86) and F1-score (0.92) for dates. The LLM model had perfect precision for MRNs, ACCs, and dates but the lowest recall for ACC (0.96) and dates (0.52), corresponding F1 scores of 0.98 and 0.68, respectively. Names were removed completely or majorly (i.e., one first or family name non-anonymized) in 100% (NLPac), 72% (NLPsp), and 90% (LLM-model). Importantly, NLPac and NLPsp did not remove medical information, while the LLM model did in 10% (n = 10). Pre-trained NLP models can effectively anonymize free-text radiology reports, while anonymization with the LLM model is more prone to deleting medical information. Question: This study compares NLP and locally hosted LLM techniques to ensure PHI anonymization without losing clinical information.
Findings: Pre-trained NLP models effectively anonymized radiology reports without removing clinical data, while a locally hosted LLM was less reliable, risking the loss of important information. Clinical relevance: Fast, reliable, automated anonymization of PHI from radiology reports enables HIPAA-compliant secondary use, facilitating advanced applications like LLM-driven radiology analysis while ensuring ethical handling of sensitive patient data.
- Research Article
- 10.1161/circep.124.013023
- Dec 16, 2024
- Circulation. Arrhythmia and electrophysiology
Large language models (LLMs) such as Chat Generative Pre-trained Transformer (ChatGPT) excel at interpreting unstructured data from public sources, yet are limited when responding to queries on private repositories, such as electronic health records (EHRs). We hypothesized that prompt engineering could enhance the accuracy of LLMs for interpreting EHR data without requiring domain knowledge, thus expanding their utility for patients and personalized diagnostics. We designed and systematically tested prompt engineering techniques to improve the ability of LLMs to interpret EHRs for nuanced diagnostic questions, referenced to a panel of medical experts. In 490 full-text EHR notes from 125 patients with prior life-threatening heart rhythm disorders, we asked GPT-4-turbo to identify recurrent arrhythmias distinct from prior events and tested 220 563 queries. To provide context, results were compared with rule-based natural language processing and Bidirectional Encoder Representations from Transformer-based language models. Experiments were repeated for 2 additional LLMs. In an independent hold-out set of 389 notes, GPT-4-turbo had a balanced accuracy of 64.3%±4.7% out-of-the-box at baseline. This increased when asking GPT-4-turbo to provide a rationale for its answers, a structured data output, and in-context exemplars, to a balanced accuracy of 91.4%±3.8% (P<0.05). This surpassed the traditional logic-based natural language processing and BERT-based models (P<0.05). Results were consistent for GPT-3.5-turbo and Jurassic-2 LLMs. The use of prompt engineering strategies enables LLMs to identify clinical end points from EHRs with an accuracy that surpassed natural language processing and approximated experts, yet without the need for expert knowledge. These approaches could be applied to LLM queries for other domains, to facilitate automated analysis of nuanced data sets with high accuracy by nonexperts.
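The three prompt-engineering techniques the study combines (requesting a rationale, requiring structured output, and supplying in-context exemplars) can be assembled as a plain string template. The wording, the JSON schema, and the exemplar format below are all illustrative assumptions, not the study's actual prompts.

```python
def build_prompt(note, exemplars):
    """Assemble an EHR-interpretation prompt with exemplars, rationale, and a JSON schema."""
    parts = ["You are reviewing an EHR note for a recurrent arrhythmia "
             "distinct from the patient's prior events."]
    # In-context exemplars: (note excerpt, expected answer) pairs.
    for ex_note, ex_answer in exemplars:
        parts.append(f"Note: {ex_note}\nAnswer: {ex_answer}")
    parts.append(f"Note: {note}")
    # Rationale request plus structured output specification.
    parts.append('Explain your reasoning step by step, then answer in JSON as '
                 '{"recurrence": "yes" or "no", "rationale": "<one sentence>"}.')
    return "\n\n".join(parts)

prompt = build_prompt(
    "Patient reports palpitations; device interrogation shows new VT episode.",
    [("Routine follow-up, no events since ablation.", '{"recurrence": "no"}')],
)
print(prompt)
```

Per the abstract, layering these three elements lifted GPT-4-turbo from 64.3% to 91.4% balanced accuracy; the sketch only shows how such a prompt is mechanically composed.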
- Research Article
- 10.55041/ijsrem36608
- Aug 10, 2024
- INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
This research paper delves into the inherent vulnerabilities and potential threats posed by large language models (LLMs), focusing on their implications across diverse applications such as natural language processing and data privacy. The study aims to identify and analyze these risks comprehensively, emphasizing the importance of mitigating strategies to prevent exploitation and misuse in LLM deployments. In recent years, LLMs have revolutionized fields like automated content generation, sentiment analysis, and conversational agents, yet their immense capabilities also raise significant security concerns. Vulnerabilities such as bias amplification, adversarial attacks, and unintended data leakage can undermine trust and compromise user privacy. Through a systematic examination of these challenges, this paper proposes safeguarding measures crucial for responsibly harnessing the potential of LLMs while minimizing associated risks. It underscores the necessity of rigorous security protocols, including robust encryption methods, enhanced authentication mechanisms, and continuous monitoring frameworks. Furthermore, the research discusses regulatory implications and ethical considerations surrounding LLM usage, advocating for transparency, accountability, and stakeholder engagement in policy- making and deployment practices. By synthesizing insights from current literature and real-world case studies, this study provides a comprehensive framework for stakeholders—developers, policymakers, and users—to navigate the complex landscape of LLM security effectively. Ultimately, this research aims to inform future advancements in LLM technology, ensuring its safe and beneficial integration into various domains while mitigating potential risks to individuals and society as a whole. 
Keywords— Adversarial attacks on LLMs, Bias in LLMs, Data privacy in LLMs, Ethical considerations LLMs, Exploitation of LLMs, Large Language Models (LLMs), Misuse of LLMs, Mitigation strategies for LLMs, Natural Language Processing (NLP), Regulatory frameworks LLMs, Responsible deployment of LLMs, Risks of LLMs, Security implications of LLMs, Threats to LLMs, Vulnerabilities in LLMs.
- Research Article
- 10.1038/s41598-025-08031-0
- Jul 6, 2025
- Scientific Reports
The substantial increase in mental health disorders globally necessitates scalable, accurate tools for detecting and classifying these conditions in digital environments. This study addresses the critical challenge of automated mental health classification by comparing three distinct computational approaches: (1) Traditional Natural Language Processing (NLP) with advanced feature engineering, (2) Prompt-engineered large language models (LLMs), and (3) Fine-tuned LLMs. The dataset consisted of over 51,000 publicly available text statements from social media platforms, tagged with seven mental health conditions: Normal, Depression, Suicidal, Anxiety, Stress, Bipolar Disorder, and Personality Disorder. The dataset was stratified into training, validation, and test sets for model evaluation. The primary outcome was classification accuracy across these seven mental health conditions. Additional metrics like precision, recall, and F1-score were analyzed. We compared the results of the three computational approaches, and overfitting was monitored through validation loss across epochs for the fine-tuned LLM. The NLP model with advanced feature engineering achieved an overall accuracy of 95%, surpassing both the prompt-engineered LLM (65%) and the fine-tuned LLM (91%). This model performed exceptionally well in terms of accuracy and precision. While fine-tuning for three epochs yielded optimal results, further training led to overfitting and decreased performance. This study demonstrates the significant benefits of applying advanced text preprocessing and feature engineering techniques to traditional NLP models, alongside fine-tuning LLMs, such as GPT-4o-mini, for mental health classification tasks. The results clearly indicate that off-the-shelf LLM chatbots using prompt engineering are inadequate for mental health classification, performing 30 percentage points below specialized NLP approaches.
Despite the popularity of general-purpose LLMs, specialized approaches remain superior for critical healthcare applications like mental health classification.
- Research Article
- 10.1093/bjrai/ubaf010
- Aug 13, 2025
- BJR|Artificial Intelligence
Natural Language Processing (NLP) is a key technique for developing Medical Artificial Intelligence (AI) systems that leverage Electronic Health Record (EHR) data to build diagnostic and prognostic models. NLP enables the conversion of unstructured clinical text into structured data that can be fed into AI algorithms. The emergence of transformer architecture and large language models (LLMs) has led to advances in NLP for various healthcare tasks, such as entity recognition, relation extraction, sentence similarity, text summarization, and question-answering. In this article, we review the major technical innovations that underpin modern NLP models and present state-of-the-art NLP applications that employ LLMs in radiation oncology research. However, it is crucial to recognize that LLMs are prone to hallucinations, biases, and ethical violations, which necessitate rigorous evaluation and validation prior to clinical deployment. As such, we propose a comprehensive framework for assessing the NLP models based on their purpose and clinical fit, technical performance, bias and trust, legal and ethical implications, and quality assurance prior to implementation in clinical radiation oncology. Our article aims to provide guidance and insights for researchers and clinicians who are interested in developing and using NLP models in clinical radiation oncology.
- Research Article
- 10.1101/2025.05.23.25328115
- Sep 26, 2025
- medRxiv
Context: Goals-of-care (GOC) discussions and their documentation are important process measures in palliative care. However, existing natural language processing (NLP) models for identifying such documentation require costly task-specific training data. Large language models (LLMs) hold promise for measuring such constructs with fewer or no task-specific training data. Objective: To evaluate the performance of a publicly available LLM with no task-specific training data (zero-shot prompting) for identifying documented GOC discussions. Methods: We compared the performance of two NLP models in identifying documented GOC discussions: Llama 3.3 using zero-shot prompting, and a task-specific BERT (Bidirectional Encoder Representations from Transformers)-based model trained on 4,642 manually annotated notes. We tested both models on records from a series of clinical trials enrolling adult patients with chronic life-limiting illness hospitalized over 2018-2023. We evaluated the area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPRC), and maximal F1 score, for both note-level and patient-level classification over a 30-day period. Results: In our text corpora, GOC documentation represented <1% of text and was found in 7.3-9.9% of notes for 23-37% of patients. In a 617-patient held-out test set, Llama 3.3 (zero-shot) and BERT (task-specific, trained) exhibited comparable performance in identifying GOC documentation (Llama 3.3: AUC 0.979, AUPRC 0.873, and F1 0.83; BERT: AUC 0.981, AUPRC 0.874, and F1 0.83). Conclusion: A zero-shot large language model with no task-specific training performed similarly to a task-specific trained BERT model in identifying documented goals-of-care discussions. This demonstrates the promise of LLMs in measuring novel clinical research outcomes. Key message: This article reports the performance of a publicly available large language model with no task-specific training data in measuring the occurrence of documented goals-of-care discussions from electronic health records. The study demonstrates that newer large language AI models may allow investigators to measure novel outcomes without requiring costly training data.
- Research Article
- 10.1016/j.ajic.2024.03.016
- Apr 6, 2024
- AJIC: American Journal of Infection Control
Utilizing natural language processing and large language models in the diagnosis and prediction of infectious diseases: A systematic review
- Research Article
- 10.1007/s10143-025-03785-7
- Sep 5, 2025
- Neurosurgical review
Natural language processing (NLP) and large language models (LLMs), such as ChatGPT, represent transformative advancements in artificial intelligence (AI). Their implementation in the medical field has broad potential, and this review discusses the current trends and prospects of NLP and LLMs in spine surgery, assessing their potential benefits, applications, and limitations. The methodology involved a comprehensive narrative review of existing English literature related to the use of NLP and LLMs in spine surgery. We searched the databases PubMed, EMBASE, Web of Science, and Scopus from inception until 16th June 2025 using keywords related to LLMs, natural language processing, and spine surgery. Original studies, clinical reports, and case series were included, while abstracts or unpublished studies were excluded. From 221 initial records, 37 studies were included: 18 evaluated LLMs and 19 evaluated NLP-based tools. LLMs were commonly used for clinical decision-making (n = 8), patient counseling (n = 7), classification (n = 2), and research (n = 1). NLP tools were applied in classification tasks (n = 12), clinical decision-making (n = 3), patient counseling (n = 1), postoperative opioid monitoring (n = 2), and research registry development (n = 1). ChatGPT-4 achieved up to 92% accuracy in clinical recommendations, outperforming GPT-3.5 in multiple tasks. Comparative analyses have found that newer versions of LLMs, such as ChatGPT-4, outperform previous versions, as evidenced by greater accuracy and fewer artificial hallucinations. However, limitations persist, including overconfident outputs, adherence gaps to clinical guidelines, and inconsistent patient readability. While this review suggests that NLP and LLMs can have a significant impact on spine practice, it is important to keep their limitations in mind and implement them with caution.
To maximize the benefits of these models in spine surgery, future research should focus on improving model sensitivity and specificity, promoting multi-disciplinary collaborations, and addressing ethical considerations regarding the use of language models in medical practice, including the inherent issue of hallucination of these models.
- Research Article
- 10.1145/3749840
- Jul 22, 2025
- ACM Transactions on Software Engineering and Methodology
Automated log analysis is crucial to ensure the high availability and reliability of complex systems. The advent of large language models (LLMs) in natural language processing (NLP) has ushered in a new era of language model-driven automated log analysis, garnering significant interest. Within this field, two primary paradigms based on language models for log analysis have become prominent. Small Language Models (SLMs) (such as BERT) follow the pre-train and fine-tune paradigm, focusing on the specific log analysis task through fine-tuning on supervised datasets. On the other hand, LLMs (such as ChatGPT) following the in-context learning paradigm, analyze logs by providing a few examples in prompt contexts without updating parameters. Despite their respective strengths, both models exhibit inherent limitations. By comparing SLMs and LLMs, we notice that SLMs are more cost-effective but less powerful, whereas LLMs with large parameters are highly powerful but expensive and inefficient. To trade-off between the performance and inference costs of both models in automated log analysis, this paper introduces an adaptive log analysis framework known as AdaptiveLog, which effectively reduces the costs associated with LLM while ensuring superior results. This framework collaborates an LLM and a small language model, strategically allocating the LLM to tackle complex logs while delegating simpler logs to the SLM. Specifically, to efficiently query the LLM, we propose an adaptive selection strategy based on the uncertainty estimation of the SLM, where the LLM is invoked only when the SLM is uncertain. In addition, to enhance the reasoning ability of the LLM in log analysis tasks, we propose a novel prompt strategy by retrieving similar error-prone cases as the reference, enabling the model to leverage past error experiences and learn solutions from these cases. 
We evaluate AdaptiveLog on different log analysis tasks; extensive experiments demonstrate that AdaptiveLog achieves state-of-the-art results across these tasks, elevating the overall accuracy of log analysis while maintaining cost efficiency. Our source code and detailed experimental data are available at https://github.com/LeaperOvO/AdaptiveLog-review.
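AdaptiveLog's core routing rule (query the LLM only when the SLM is uncertain) can be sketched with an entropy-based uncertainty estimate. The entropy threshold and the two-class probabilities below are illustrative; the paper's actual uncertainty estimation for the SLM may differ.

```python
import math

def entropy(probs):
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def route(slm_probs, threshold=0.35):
    """Send the log to the costly LLM only when the SLM's prediction is uncertain."""
    return "LLM" if entropy(slm_probs) > threshold else "SLM"

print(route([0.98, 0.02]))  # confident SLM prediction -> "SLM"
print(route([0.55, 0.45]))  # near-uniform, uncertain -> "LLM"
```

The trade-off is explicit: confident (low-entropy) predictions stay on the cheap SLM path, and only ambiguous logs incur LLM inference cost.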
- Research Article
- 10.3390/bdcc8060063
- Jun 5, 2024
- Big Data and Cognitive Computing
Cryptocurrencies are becoming increasingly prominent in financial investments, with more investors diversifying their portfolios and individuals drawn to their ease of use and decentralized financial opportunities. However, this accessibility also brings significant risks and rewards, often influenced by news and the sentiments of crypto investors, known as crypto signals. This paper explores the capabilities of large language models (LLMs) and natural language processing (NLP) models in analyzing sentiment from cryptocurrency-related news articles. We fine-tune state-of-the-art models such as GPT-4, BERT, and FinBERT for this specific task, evaluating their performance and comparing their effectiveness in sentiment classification. By leveraging these advanced techniques, we aim to enhance the understanding of sentiment dynamics in the cryptocurrency market, providing insights that can inform investment decisions and risk management strategies. The outcomes of this comparative study contribute to the broader discourse on applying advanced NLP models to cryptocurrency sentiment analysis, with implications for both academic research and practical applications in financial markets.
- Research Article
- 10.54254/2755-2721/97/20241406
- Nov 26, 2024
- Applied and Computational Engineering
Abstract. Large language models (LLMs) have revolutionized the field of natural language processing (NLP), demonstrating remarkable capabilities in understanding, generating, and manipulating human language. This comprehensive review explores the development, applications, optimizations, and challenges of LLMs. We begin by tracing the evolution of these models and their foundational architectures, such as the Transformer, GPT, and BERT. We then delve into the applications of LLMs in natural language understanding tasks, including sentiment analysis, named entity recognition, question answering, and text summarization, highlighting real-world use cases. Next, we examine the role of LLMs in natural language generation, covering areas such as content creation, language translation, personalized recommendations, and automated responses. We further discuss LLM applications in other NLP tasks like text style transfer, text correction, and language model pre-training. Subsequently, we explore techniques for optimizing and improving LLMs, including model compression, explainability, robustness, and security. Finally, we address the challenges posed by the significant computational requirements, sample inefficiency, and ethical considerations surrounding LLMs. We conclude by discussing potential future research directions, such as efficient architectures, few-shot learning, bias mitigation, and privacy-preserving techniques, which will shape the ongoing development and responsible deployment of LLMs in NLP.
- Research Article
- 10.1016/j.jpainsymman.2025.09.025
- Oct 1, 2025
- Journal of pain and symptom management
Assessment of a Zero-Shot Large Language Model in Measuring Documented Goals-of-Care Discussions.
- Research Article
- 10.3389/fmed.2024.1512824
- Jan 22, 2025
- Frontiers in medicine
In recent years, natural language processing (NLP) has transformed significantly with the introduction of large language models (LLMs). This review provides an update on NLP and LLM applications and challenges in gastroenterology and hepatology. Registered with PROSPERO (CRD42024542275) and adhering to PRISMA guidelines, we searched six databases for relevant studies published from 2003 to 2024, ultimately including 57 studies. Our review notes an increase in relevant publications in 2023-2024 compared to previous years, reflecting growing interest in newer models such as GPT-3 and GPT-4. The results demonstrate that NLP models have enhanced data extraction from electronic health records and other unstructured medical data sources. Key findings include high precision in identifying disease characteristics from unstructured reports and ongoing improvement in clinical decision-making. Risk of bias assessments using ROBINS-I, QUADAS-2, and PROBAST tools confirmed the methodological robustness of the included studies. NLP and LLMs can enhance diagnosis and treatment in gastroenterology and hepatology. They enable the extraction of data from unstructured medical records, such as endoscopy reports and patient notes, and can enhance clinical decision-making. Despite these advancements, integrating these tools into routine practice is still challenging. Future work should prospectively demonstrate real-world value.
- Conference Article
- 10.18653/v1/2021.repl4nlp-1.32
- Jan 1, 2021
The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters. However, due to deployment constraints in edge devices, there has been a rising interest in the compression of these models to improve their inference time and memory footprint. This paper presents a novel loss objective to compress token embeddings in the Transformer-based models by leveraging an AutoEncoder architecture. More specifically, we emphasize the importance of the direction of compressed embeddings with respect to original uncompressed embeddings. The proposed method is task-agnostic and does not require further language modeling pre-training. Our method significantly outperforms the commonly used SVD-based matrix-factorization approach in terms of initial language model Perplexity. Moreover, we evaluate our proposed approach over SQuAD v1.1 dataset and several downstream tasks from the GLUE benchmark, where we also outperform the baseline in most scenarios. Our code is public.
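The paper's emphasis on the direction of compressed embeddings suggests a reconstruction objective that penalizes angular deviation on top of the usual reconstruction error. The sketch below is a hedged interpretation, not the paper's exact loss: it combines mean-squared error with a cosine-distance term, with an assumed weighting `lam`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recon_loss(original, reconstructed, lam=1.0):
    """MSE plus a penalty on the angle between original and reconstructed embedding."""
    mse = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)
    return mse + lam * (1.0 - cosine(original, reconstructed))

# A scaled copy preserves direction: only the MSE term contributes.
print(recon_loss([1.0, 0.0], [2.0, 0.0]))  # -> 0.5
```

In an AutoEncoder setup, `original` is the token embedding and `reconstructed` is the decoder output; the cosine term keeps the AutoEncoder from trading away embedding direction for small magnitude errors.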
- Research Article
- 10.1080/2050571x.2025.2514395
- Jan 2, 2025
- Speech, Language and Hearing
Coding the accuracy of typed transcripts from experiments testing speech intelligibility is an arduous endeavour. A recent study in this journal [Herrmann, B. 2025. Leveraging natural language processing models to automate speech-intelligibility scoring. Speech, Language and Hearing, 28(1)] presents a novel approach for automating the scoring of such listener transcripts, leveraging Natural Language Processing (NLP) models. It involves the calculation of the semantic similarity between transcripts and target sentences using high-dimensional vectors, generated by such NLP models as ADA2, GPT2, BERT, and USE. This approach demonstrates exceptional accuracy, with negligible underestimation of intelligibility scores (by about 2-4%), numerically outperforming simpler computational tools like Autoscore and TSR. The method uniquely relies on semantic representations generated by large language models. At the same time, these models also form the Achilles heel of the technique: the transparency, accessibility, data security, ethical framework, and cost of the selected model directly impact the suitability of the NLP-based scoring method. Hence, working with such models can raise serious risks regarding the reproducibility of scientific findings. This in turn emphasises the need for fair, ethical, and evidence-based open source models. With such models, Herrmann’s new tool represents a valuable addition to the speech scientist’s toolbox.
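The scoring approach Herrmann describes reduces to a cosine similarity between vector representations of the listener transcript and the target sentence. As a self-contained stand-in for the ADA2/GPT2/BERT/USE embeddings the paper uses, the sketch below builds toy bag-of-words vectors; only the cosine step reflects the actual method.

```python
import math
from collections import Counter

def bow_vector(text, vocab):
    """Toy bag-of-words vector over a shared vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def semantic_score(transcript, target):
    """Cosine similarity between transcript and target sentence vectors."""
    vocab = sorted(set(transcript.lower().split()) | set(target.lower().split()))
    a, b = bow_vector(transcript, vocab), bow_vector(target, vocab)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

print(semantic_score("the cat sat on the mat", "the cat sat on the mat"))
print(semantic_score("the cat sat on the mat", "the dog sat on the mat"))
```

With a real semantic embedding in place of `bow_vector`, near-synonymous transcripts ("couch" for "sofa") also score high, which is precisely the advantage over exact-match tools like Autoscore.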