Understanding Memory-related Threats and Vulnerabilities in Large Language Models

Abstract

Memory capabilities in large language models (LLMs) represent a transformative advance that enables contextual continuity, personalization, and adaptive learning across interactions. However, these capabilities introduce novel security vulnerabilities that extend beyond traditional concerns. This article examines the security implications of memory-enabled LLMs, categorizing architectural approaches and identifying distinct vulnerability classes, including temporal prompt injection, information persistence, and memory poisoning. Through documented case studies and empirical evidence, the article illustrates how these vulnerabilities manifest in production environments, leading to data leakage, system manipulation, and knowledge corruption. The article proposes comprehensive security frameworks incorporating memory segregation, temporal constraints, bidirectional filtering, differential privacy, and advanced auditing mechanisms. As LLMs evolve from stateless tools into persistent assistants, security paradigms must expand beyond traditional boundaries to address the entire memory lifecycle and ensure that these systems remain both functional and secure in sensitive operational contexts.
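
As an illustration of two of the mitigations named above, memory segregation and temporal constraints, the following minimal Python sketch keeps each user's memories in a separate namespace and drops entries older than a time-to-live. It is not taken from the article: the class `SegregatedMemory`, its methods, and the `ttl_seconds` parameter are hypothetical names chosen for this example.

```python
import time
from collections import defaultdict


class SegregatedMemory:
    """Hypothetical per-user memory store with a time-to-live (TTL)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store = defaultdict(list)  # user_id -> [(timestamp, text)]

    def remember(self, user_id, text):
        """Record a memory under this user's namespace only."""
        self._store[user_id].append((self._clock(), text))

    def recall(self, user_id):
        """Return only this user's unexpired memories, oldest first."""
        cutoff = self._clock() - self.ttl_seconds
        fresh = [(t, m) for t, m in self._store.get(user_id, []) if t >= cutoff]
        if fresh:
            self._store[user_id] = fresh  # expired entries are purged
        else:
            self._store.pop(user_id, None)
        return [m for _, m in fresh]
```

Because `recall` takes a single `user_id` and never scans other namespaces, one user's stored text cannot surface in another user's context, and anything older than the TTL is deleted rather than merely hidden.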

Similar Papers
  • Research Article
  • 10.28945/5693
Unlocking the Potential of Large Language Models in Education: Factors Influencing Adoption by Instructional Designers and Academics
  • Jan 1, 2026
  • Journal of Information Technology Education: Research
  • Katherine L Fourie + 2 more

Aim/Purpose: The study investigates the factors influencing the acceptance and utilisation of large language models (LLMs) (predictor variables of LLM usage), such as ChatGPT, in Learning design by instructional designers and university-teaching academics from various countries. Background: Large language models (LLMs) have exploded onto the scene, transforming the landscape of learning design. Instructional designers and university teaching academics have been overburdened with content creation for their teaching programmes, and the arrival of LLM models will help in this regard by developing more interactive content that drives student engagement and, in turn, contributes to student success. Since LLMs are a relatively new phenomenon, little is known about the factors influencing their acceptance in learning design; therefore, this research is needed, as learning design principles are the bedrock of student engagement and success. Methodology: A cross-sectional correlational quantitative study was employed. Data was collected using an online questionnaire posted on social media, including LinkedIn, from 203 instructional designers and university teaching academics. Purposive and snowball sampling methods were used to target instructional designers and university teaching academics at colleges and universities worldwide. Participants were asked to share the survey link with fellow instructional designers and university-teaching academics in their communities. The factor structure of the data was determined using exploratory factor analysis. Nonetheless, the factor structure derived from the LLMs did not entirely reflect the original configuration of the Unified Theory of Acceptance and Use of Technology (UTAUT3), as certain predictors appeared to coalesce, indicating LLMs’ unique nature in learning design. Confirmatory factor analysis was used to verify the fit of the data on the measurement model. 
First-order and second-order structural modelling were used to identify the structural relationships among the variables. Contribution: The study determines significant factors for the acceptance of LLMs by instructional designers and academic teaching staff in learning design, enabling possible opportunities for best practices in the field through interventions to optimize LLM usage. The study applies the technology acceptance model to the emerging LLM technology and extends the technology acceptance model by adding the trust construct as a predictor variable. Findings: The structural analysis results indicated that the ingrained LLM practices, LLM peer-driven expectations, innovative propensity towards LLM adoption, reliability and provider trust in LLMs, and ease of use and support influenced perceived LLM benefits and usage, but community standards and infrastructure had no influence. The second-order structural equation modelling indicated that perceived LLM benefits and usage and ingrained LLM habits contributed most to the learning design. Recommendations for Practitioners: Teaching academics and instructional designers must use LLMs in designing content, assessments, and interactive learning activities, and attend LLM training workshops on prompting and best practices in integrating LLMs into learning and teaching to see their benefits; hence, regular use of LLMs will then lead to trust and innovation in LLMs usage, enhancing learning design and improving student learning outcomes. Recommendation for Researchers: Researchers must use mixed methods approaches to have a deeper understanding of the factors influencing LLMs. Since habit and perceived LLM benefits and usage contributed the most variance to learning design, researchers must investigate strategies that optimise these factors in learning design, such as effective intervention strategies that can help form positive LLM habits. 
In addition, the findings provide researchers with a starting point for future research. Further, researchers must investigate interventions that optimise the influence of personal innovativeness and trust, which contributed the least variance to learning design, hence unlocking the potential of LLMs in learning design through innovative, responsible, and ethical use. Impact on Society: The use of LLMs in learning design has a high possibility of transforming education, specifically the learning design landscape. Using LLMs will free up more time for teaching academics and instructional designers so that they spend more time on higher-order thinking skill demands. Consequently, the students will be exposed to more engaging and interactive content, resulting in improved learning outcomes. Future Research: Future research must include context-derived external variables in technology acceptance models, such as levels of prompting competencies, to provide a deeper understanding of LLMs. In addition, future research must be based on the application and impact of LLMs on student engagement and success, and their attainment of 21st-century skills.

  • Research Article
  • Cite Count: 36
  • 10.1145/3735633
Continual Learning of Large Language Models: A Comprehensive Survey
  • Nov 20, 2025
  • ACM Computing Surveys
  • Haizhou Shi + 8 more

The challenge of effectively and efficiently adapting statically pre-trained Large Language Models (LLMs) to ever-evolving data distributions remains predominant. When tailored for specific needs, pre-trained LLMs often suffer from significant performance degradation in previous knowledge domains, a phenomenon known as “catastrophic forgetting”. While extensively studied in the Continual Learning (CL) community, this problem presents new challenges in the context of LLMs. In this survey, we provide a comprehensive overview and detailed discussion of the current research progress on LLMs within the context of CL. Besides the introduction of the preliminary knowledge, this survey is structured into four main sections: we first describe an overview of continually learning LLMs, consisting of two directions of continuity: vertical continuity (or vertical continual learning), i.e., continual adaptation from general to specific capabilities, and horizontal continuity (or horizontal continual learning), i.e., continual adaptation across time and domains (Section 3). Following vertical continuity, we summarize three stages of learning LLMs in the context of modern CL: Continual Pre-Training (CPT), Domain-Adaptive Pre-training (DAP), and Continual Fine-Tuning (CFT) (Section 4). We then provide an overview of evaluation protocols for continual learning with LLMs, along with currently available data sources (Section 5). Finally, we discuss intriguing questions related to continual learning for LLMs (Section 6). This survey sheds light on the relatively understudied domain of continually pre-training, adapting, and fine-tuning large language models, suggesting the necessity for greater attention from the community. 
Key areas requiring immediate focus include the development of practical and accessible evaluation benchmarks, along with methodologies specifically designed to counter forgetting and enable knowledge transfer within the evolving landscape of LLM learning paradigms. The full list of articles examined in this survey is available at https://github.com/Wang-ML-Lab/llm-continual-learning-survey.

  • Conference Article
  • Cite Count: 45
  • 10.1109/icdmw58026.2022.00078
EW-Tune: A Framework for Privately Fine-Tuning Large Language Models with Differential Privacy
  • Nov 1, 2022
  • IEEE International Conference on Data Mining Workshops, ICDMW
  • Rouzbeh Behnia + 3 more

Pre-trained Large Language Models (LLMs) are an integral part of modern AI that have led to breakthrough performances in complex AI tasks. Major AI companies with expensive infrastructures are able to develop and train these large models with billions and millions of parameters from scratch. Third parties, researchers, and practitioners are increasingly adopting these pre-trained models and fine-tuning them on their private data to accomplish their downstream AI tasks. However, it has been shown that an adversary can extract/reconstruct the exact training samples from these LLMs, which can lead to revealing personally identifiable information. The issue has raised deep concerns about the privacy of LLMs. Differential privacy (DP) provides a rigorous framework that allows adding noise in the process of training or fine-tuning LLMs such that extracting the training data becomes infeasible (i.e., with a cryptographically small success probability). While the theoretical privacy guarantees offered in most extant studies assume learning models from scratch through many training iterations in an asymptotic setting, this assumption does not hold in fine-tuning scenarios in which the number of training iterations is significantly smaller. To address the gap, we present EW-Tune, a DP framework for fine-tuning LLMs based on Edgeworth accountant with finite-sample privacy guarantees. Our results across four well-established natural language understanding (NLU) tasks show that while EW-Tune adds privacy guarantees to LLM fine-tuning process, it directly contributes to decreasing the induced noise to up to 5.6% and improves the state-of-the-art LLMs performance by up to 1.1% across all NLU tasks. We have open-sourced our implementations for wide adoption and public testing purposes.
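
To make "adding noise in the process of training or fine-tuning" concrete, here is a minimal Python sketch of the generic clip-then-add-Gaussian-noise gradient step commonly used in differentially private training. It is not EW-Tune's method and does not implement the Edgeworth accountant; the function names and the parameters `max_norm` and `noise_multiplier` are assumptions made for this example.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    sigma = noise_multiplier * max_norm  # noise std on the summed gradient
    noisy_sum = [
        sum(g[i] for g in clipped) + rng.gauss(0.0, sigma) for i in range(dim)
    ]
    return [s / len(clipped) for s in noisy_sum]
```

Clipping bounds how much any single training example can move the update, and the noise scale is tied to that bound. Turning a chosen `noise_multiplier` and number of steps into a formal privacy guarantee is the job of a privacy accountant, which this sketch leaves out.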

  • Research Article
  • Cite Count: 16
  • 10.1049/blc2.12091
Privacy preserving large language models: ChatGPT case study based vision and framework
  • Nov 17, 2024
  • IET Blockchain
  • Imdad Ullah + 7 more

The generative Artificial Intelligence (AI) tools based on Large Language Models (LLMs) use billions of parameters to extensively analyse large datasets and extract critical information such as context, specific details, and identifying information, use this information in the training process, and generate responses for the requested queries. The extracted data also contain sensitive information, seriously threatening user privacy and causing reluctance to use such tools. This article proposes the conceptual model called PrivChatGPT, a privacy‐preserving model for LLMs consisting of two main components, that is, preserving user privacy during the data curation/pre‐processing and preserving private context and the private training process for large‐scale data. To demonstrate the applicability of PrivChatGPT, it is shown how a private mechanism could be integrated into the existing model for training LLMs to protect user privacy; specifically, differential privacy and private training using Reinforcement Learning (RL) were employed. The privacy level probabilities are associated with the document contents, including the private contextual information, and with metadata, which is used to evaluate the disclosure probability loss for an individual's private information. The privacy loss is measured and the measure of uncertainty or randomness is evaluated using entropy once differential privacy is applied. It recursively evaluates the level of privacy guarantees and the uncertainty of public databases and resources during each update when new information is added for training purposes. To critically evaluate the use of differential privacy for private LLMs, other mechanisms were hypothetically compared such as Blockchain, private information retrieval, randomisation, obfuscation, anonymisation, and the use of Tor for various performance measures such as the model performance and accuracy, computational complexity, privacy vs. utility, training latency, vulnerability to attacks, and resource consumption. It is concluded that differential privacy, randomisation, and obfuscation can impact the training models' utility and performance; conversely, using Tor, Blockchain, and Private Information Retrieval (PIR) may introduce additional computational complexity and high training latency. It is believed that the proposed model could be used as a benchmark for privacy‐preserving LLMs for generative AI tools.

  • Research Article
  • 10.70232/jtal.v2i1.22
Impact of Large Language Models on Personalized Learning, Assessment Automation, and Student Outcomes in Higher Learning Institution
  • Mar 16, 2026
  • Journal of Technology-Assisted Learning
  • Onesme Niyibizi

This study investigated the multifaceted influence of Large Language Models (LLMs) on teaching and learning within a private higher education institution in Rwanda during the 2024–2025 academic year. A total of 658 students and 28 lecturers participated, providing a comprehensive perspective on both user experiences and professional concerns. Using a quantitative approach, the study employed Multivariate Analysis of Variance (MANOVA) to examine how the use of LLMs relates to students’ perceptions of personalized learning effectiveness, academic performance improvement, online engagement, satisfaction with assessment feedback, and motivation for lifelong learning. Findings from the students indicated that LLMs are widely perceived as beneficial across multiple dimensions of the learning process. Students reported that LLMs enhance personalized learning by providing adaptive guidance, improving academic performance through instant clarification and practice support, and increasing online engagement by offering interactive and accessible learning assistance. The results further showed that LLMs contribute to greater satisfaction with feedback mechanisms and stimulate motivation for continuous and self-directed learning. These statistically significant associations point to the strong potential of LLMs to enrich higher education outcomes. In contrast, the lecturers’ data revealed notable concerns related to data privacy, ethical use, and algorithmic bias. Lecturers expressed significant apprehension regarding students’ overreliance on LLMs, the risks associated with inaccurate or biased outputs, and the potential erosion of academic integrity. Their perceptions underscore the need for safeguards that ensure responsible and ethical use of AI in academic settings. 
Overall, the findings highlighted a dual reality: while LLMs hold transformative potential for improving learning experiences, their integration must be supported by robust institutional policies, targeted capacity-building initiatives, and ongoing research. Such measures are essential to promote equitable, ethical, and effective adoption of LLMs in higher education.

  • Research Article
  • Cite Count: 2
  • 10.1109/tmi.2025.3581108
LMT++: Adaptively Collaborating LLMs With Multi-Specialized Teachers for Continual VQA in Robotic Surgical Videos.
  • Jan 1, 2025
  • IEEE transactions on medical imaging
  • Yuyang Du + 8 more

Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to privacy concerns over patient data, training a VQA model with previously used data becomes restricted, making it necessary to use the exemplar-free continual learning (CL) approach. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments or surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we developed a novel LLM-assisted multi-teacher CL framework (named LMT++), which could harness the strength of a multimodal LLM as a supplementary teacher. The LLM's strong generalization ability, as well as its good understanding of the surgical domain, helps to address the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM in our CL framework, we further proposed an innovative approach to process the training data, which involves the conversion of complex LLM embeddings into logit values used within our CL training framework. Moreover, we design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of conventional VQA models obtained in previous model training processes within the CL framework. Finally, we created a new surgical VQA dataset for model evaluation. Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.

  • Conference Article
  • Cite Count: 16
  • 10.1145/3658644.3690298
PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)
  • Dec 2, 2024
  • Mahmoud Nazzal + 3 more

The capability of generating high-quality source code using large language models (LLMs) reduces software development time and costs. However, they often introduce security vulnerabilities due to training on insecure open-source data. This highlights the need for ensuring secure and functional code generation. This paper introduces PromSec, an algorithm for prompt optimization for secure and functioning code generation using LLMs. In PromSec, we combine 1) code vulnerability clearing using a generative adversarial graph neural network, dubbed as gGAN, to fix and reduce security vulnerabilities in generated codes and 2) code generation using an LLM into an interactive loop, such that the outcome of the gGAN drives the LLM with enhanced prompts to generate secure codes while preserving their functionality. Introducing a new contrastive learning approach in gGAN, we formulate code-clearing and generation as a dual-objective optimization problem, enabling PromSec to notably reduce the number of LLM inferences. PromSec offers a cost-effective and practical solution for generating secure, functional code. Extensive experiments conducted on Python and Java code datasets confirm that PromSec effectively enhances code security while upholding its intended functionality. Our experiments show that while a state-of-the-art approach fails to address all code vulnerabilities, PromSec effectively resolves them. Moreover, PromSec achieves more than an order-of-magnitude reduction in operation time, number of LLM queries, and security analysis costs. Furthermore, prompts optimized with PromSec for a certain LLM are transferable to other LLMs across programming languages and generalizable to unseen vulnerabilities in training. This study is a step in enhancing the trustworthiness of LLMs for secure and functional code generation, supporting their integration into real-world software development.

  • Research Article
  • Cite Count: 9
  • 10.1152/advan.00137.2024
Accuracy and reliability of large language models in assessing learning outcomes achievement across cognitive domains.
  • Dec 1, 2024
  • Advances in physiology education
  • Swapna Haresh Teckwani + 3 more

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to evaluate the accuracy and reliability of these LLMs in evaluating the achievement of learning outcomes across different cognitive domains in a scientific inquiry course on sports physiology. Human graders and three LLMs, GPT-3.5, GPT-4o, and Gemini, were tasked with scoring submitted student assignments according to a set of rubrics aligned with various cognitive domains, namely "Understand," "Analyze," and "Evaluate" from the revised Bloom's taxonomy and "Scientific Inquiry Competency." Our findings revealed that while LLMs demonstrated some level of competency, they do not yet meet the assessment standards of human graders. Specifically, interrater reliability (percentage agreement and correlation analysis) between human graders was superior as compared to between two grading rounds for each LLM, respectively. Furthermore, concordance and correlation between human and LLM graders were mostly moderate to poor in terms of overall scores and across the pre-specified cognitive domains. The results suggest a future where AI could complement human expertise in educational assessment but underscore the importance of adaptive learning by educators and continuous improvement in current AI technologies to fully realize this potential. NEW & NOTEWORTHY: The advent of large language models (LLMs) such as ChatGPT and Gemini has offered new learning and assessment opportunities to integrate artificial intelligence (AI) with education. This study evaluated the accuracy of LLMs in assessing an assignment from a course on sports physiology. Concordance and correlation between human graders and LLMs were mostly moderate to poor. 
The findings suggest AI's potential to complement human expertise in educational assessment alongside the need for adaptive learning by educators.

  • Research Article
  • Cite Count: 4
  • 10.3897/jucs.134739
An Empirical Evaluation of Large Language Models in Static Code Analysis for PHP Vulnerability Detection
  • Sep 14, 2024
  • JUCS - Journal of Universal Computer Science
  • Orçun Çetin + 3 more

Web services play an important role in our daily lives. They are used in a wide range of activities, from online banking and shopping to education, entertainment and social interactions. Therefore, it is essential to ensure that they are kept as secure as possible. However – as is the case with any complex software system – creating a sophisticated software free from any security vulnerabilities is a very challenging task. One method to enhance software security is by employing static code analysis. This technique can be used to identify potential vulnerabilities in the source code before they are exploited by bad actors. This approach has been instrumental in tackling many vulnerabilities, but it is not without limitations. Recent research suggests that static code analysis can benefit from the use of large language models (LLMs). This is a promising line of research, but there are still very few and quite limited studies in the literature on the effectiveness of various LLMs at detecting vulnerabilities in source code. This is the research gap that we aim to address in this work. Our study examined five notable LLM chatbot models: ChatGPT 4, ChatGPT 3.5, Claude, Bard/Gemini1, and Llama-2, assessing their abilities to identify 104 known vulnerabilities spanning the Top-10 categories defined by the Open Worldwide Application Security Project (OWASP). Moreover, we evaluated issues related to these LLMs’ false-positive rates using 97 patched code samples. We specifically focused on PHP vulnerabilities, given its prevalence in web applications. We found that ChatGPT-4 has the highest vulnerability detection rate, with over 61.5% of vulnerabilities found, followed by ChatGPT-3.5 at 50%. Bard has the highest rate of vulnerabilities missed, at 53.8%, and the lowest detection rate, at 13.4%. 
For all models, there is a significant percentage of vulnerabilities that were classified as partially found, indicating a level of uncertainty or incomplete detection across all tested LLMs. Moreover, we found that ChatGPT-4 and ChatGPT-3.5 are consistently more effective across most categories, compared to other models. Bard and Llama-2 display limited effectiveness in detecting vulnerabilities across the majority of categories listed. Surprisingly, our findings reveal high false positive rates across all LLMs. Even the model demonstrating the best performance (ChatGPT-4) notched a false positive rate of nearly 63%, while several models glaringly under-performed, hitting startlingly bad false positive rates of over 90%. Finally, simultaneously deploying multiple LLMs for static analysis resulted in only a marginal enhancement in the rates of vulnerability detection. We believe these results are generalizable to most other programming languages, and hence far from being limited to PHP only.

  • Research Article
  • 10.1155/ijta/8034289
Integrating Large Language Models Into UAE Community Pharmacies: Pharmacists’ Perspectives on Benefits, Concerns, and Implementation Barriers
  • Jan 1, 2026
  • International Journal of Telemedicine and Applications
  • Anan S Jarab + 5 more

Background: The UAE’s rapid economic growth and adoption of advanced healthcare technologies necessitate understanding pharmacists’ perspectives on large language models (LLMs) to address implementation challenges and align with the nation’s digital health initiatives. Aim: This study explored UAE pharmacists’ perceived benefits, concerns, and barriers to LLM adoption, as well as factors contributing to heightened concerns in community pharmacies. Methods: A survey-based cross-sectional study was conducted among 528 community pharmacists (51.3% female) in the UAE between October and November 2024. Pharmacists completed a validated questionnaire assessing socio-demographic information, perceived benefits, concerns, and barriers related to LLM use. Binary logistic regression was applied to identify factors associated with concerns about LLMs. Results: The least-perceived benefits of LLMs included providing around-the-clock support (37.3%), designing personalized care plans (74.4%), and improving patient outcomes (77.0%). Barriers included the need for human supervision (54.7%), insufficient training (32.4%), lack of pharmacy-focused LLM programs (28.4%), and inadequate resources (28.4%). Key concerns were technical failures or downtime (97.5%), hacking vulnerabilities (97.2%) and limited capacity for empathy, cultural understanding, or ethical considerations in healthcare (95.6%). Increased age was significantly associated with greater concerns (OR = 1.124, p < 0.001). Conversely, pharmacists with master’s or doctoral degrees (OR = 0.483, p = 0.008) and those likely to use LLMs in the future (OR = 0.357, p < 0.001) expressed fewer concerns. Conclusion: The integration of LLMs into community pharmacy practice faces challenges, including hacking risks, security vulnerabilities, insufficient empathy, and technical failures. 
Targeted interventions such as enhanced training, robust security measures, and tailored LLM solutions are essential to address these barriers and support safe adoption in pharmacy settings.

  • Research Article
  • Cite Count: 12
  • 10.1109/tnse.2025.3590975
ROFED-LLM: Robust Federated Learning for Large Language Models in Adversarial Wireless Environments
  • Jan 1, 2026
  • IEEE Transactions on Network Science and Engineering
  • Haoyu Wang + 6 more

Large language models (LLMs) have made significant advances in the field of natural language processing (NLP). However, their centralized training approach faces challenges related to data privacy, communication efficiency, and robustness against adversarial attacks, particularly in wireless environments. With the gradual depletion of high-quality public data, there is an urgent need to leverage private data distributed across various parties. Although federated learning (FL) offers a privacy-preserving collaborative training paradigm, it struggles to meet the high computational demands of edge devices and remains vulnerable to adversarial attacks. This paper introduces ROFED-LLM, a novel framework for robust, privacy-preserving training of LLMs on decentralized private data over wireless networks. By integrating split federated learning, which partitions the model across devices to enhance privacy, with adaptive jamming defense mechanisms, ROFED-LLM enables collaborative LLM training without raw data sharing while ensuring resilience against wireless adversarial attacks. Our multi-modal defense strategy combines model-level protections, such as differential privacy and dynamic pruning, with communication-level safeguards, including adaptive beamforming which optimizes wireless signal transmission to mitigate interference, and resource allocation optimization. Extensive experiments across diverse NLP tasks demonstrate ROFED-LLM's superiority, achieving a 12.87% improvement in privacy preservation and 18.26% enhancement in jamming resilience compared to existing methods such as FedAvg and SCAFFOLD, with only a marginal 3.94% trade-off in model accuracy. 
Our code repository has been open sourced at https://anonymous.4open.science/r/RoFed-LLM-54E1.

  • Conference Article
  • 10.70314/is.2024.chtm.5
Leveraging Federated Learning for Secure Transfer and Deployment of ML Models in Healthcare
  • Jan 1, 2024
  • Zlate Dodevski + 2 more

Federated learning (FL) represents a pivotal advancement in applying Machine Learning (ML) in healthcare. It addresses the challenges of data privacy and security by facilitating model transferability across institutions. This paper explores the effective employment of FL to enhance the deployment of large language models (LLMs) in healthcare settings while maintaining stringent privacy standards. Through a detailed examination of the challenges in applying LLMs to the healthcare domain, including privacy, security, regulatory constraints, and training data quality, we present a federated learning architecture tailored for LLMs in healthcare. This architecture outlines the roles and responsibilities of participating entities, providing a framework for secure collaboration. We further analyze privacy-preserving techniques such as differential privacy and secure aggregation in the context of federated LLMs for healthcare, offering insights into their practical implementation. Our findings suggest that federated learning can significantly enhance the capabilities of LLMs in healthcare while preserving patient privacy. In addition, we also identify persistent challenges in areas such as computational and communicational efficiency, lack of benchmarks and tailored FL aggregation algorithms applied to LLMs, model performance, and ethical concerns in participant selection. By critically evaluating the proposed approach and highlighting its potential benefits and limitations in real-world healthcare settings, this work provides a foundation for future research in secure and privacy-preserving ML deployment in healthcare.

  • Research Article
  • Cited by 35
  • 10.1162/opmi_a_00160
The Limitations of Large Language Models for Understanding Human Language and Cognition.
  • Aug 31, 2024
  • Open mind : discoveries in cognitive science
  • Christine Cuskley + 2 more

Researchers have recently argued that the capabilities of Large Language Models (LLMs) can provide new insights into longstanding debates about the role of learning and/or innateness in the development and evolution of human language. Here, we argue on two grounds that LLMs alone tell us very little about human language and cognition in terms of acquisition and evolution. First, any similarities between human language and the output of LLMs are purely functional. Borrowing the "four questions" framework from ethology, we argue that what LLMs do is superficially similar, but how they do it is not. In contrast to the rich multimodal data humans leverage in interactive language learning, LLMs rely on immersive exposure to vastly greater quantities of unimodal text data, with recent multimodal efforts built upon mappings between images and text. Second, turning to functional similarities between human language and LLM output, we show that human linguistic behavior is much broader. LLMs were designed to imitate the very specific behavior of human writing; while they do this impressively, the underlying mechanisms of these models limit their capacities for meaning and naturalistic interaction, and their potential for dealing with the diversity in human language. We conclude by emphasising that LLMs are not theories of language, but tools that may be used to study language, and that can only be effectively applied with specific hypotheses to motivate research.

  • Research Article
  • Cited by 7
  • 10.3390/bdcc9030050
On Continually Tracing Origins of LLM-Generated Text and Its Application in Detecting Cheating in Student Coursework
  • Feb 20, 2025
  • Big Data and Cognitive Computing
  • Quan Wang + 1 more

Large language models (LLMs) have demonstrated remarkable capabilities in text generation, which also raise numerous concerns about their potential misuse, especially in educational exercises and academic writing. Accurately identifying and tracing the origins of LLM-generated content is crucial for accountability and transparency, ensuring the responsible use of LLMs in educational and academic environments. Previous methods utilize binary classifiers to discriminate whether a piece of text was written by a human or generated by a specific LLM or employ multi-class classifiers to trace the source LLM from a fixed set. These methods, however, are restricted to one or several pre-specified LLMs and cannot generalize to new LLMs, which are continually emerging. This study formulates source LLM tracing in a class-incremental learning (CIL) fashion, where new LLMs continually emerge, and a model incrementally learns to identify new LLMs without forgetting old ones. A training-free continual learning method is further devised for the task, the idea of which is to continually extract prototypes for emerging LLMs, using a frozen encoder, and then to perform origin tracing via prototype matching after a delicate decorrelation process. For evaluation, two datasets are constructed, one in English and one in Chinese. These datasets simulate a scenario where six LLMs emerge over time and are used to generate student essays, and an LLM detector has to incrementally expand its recognition scope as new LLMs appear. Experimental results show that the proposed method achieves an average accuracy of 97.04% on the English dataset and 91.23% on the Chinese dataset. These results validate the feasibility of continual origin tracing of LLM-generated text and verify its effectiveness in detecting cheating in student coursework.

  • Research Article
  • Cited by 811
  • 10.1016/j.hcc.2024.100211
A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly
  • Mar 1, 2024
  • High-Confidence Computing
  • Yifan Yao + 5 more
