Unleashing the Power of Singular Values for Parameter-Efficient Fine-Tuning of Large Pre-Trained Models
Large pre-trained models (LPMs) have achieved remarkable success across natural language processing and computer vision tasks. However, fully fine-tuning these models for downstream adaptation incurs high memory costs, posing challenges in resource-constrained settings. Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, alleviate this by updating only a small subset of parameters. Despite their efficiency, these methods typically employ random initialization for low-rank matrices, which can lead to slower and less stable convergence during gradient descent, as well as diminished generalizability due to suboptimal starting points. In this paper, we present PiVot, a novel PEFT method that utilizes singular value decomposition (SVD) to initialize low-rank matrices, with critical singular values serving as trainable parameters. Specifically, PiVot performs SVD on the pre-trained weight matrix to obtain the best rank-$r$ approximation, focusing on the top-$r$ singular values that capture over 99% of the matrix's structural information. By treating these top-$r$ singular values as trainable parameters, PiVot effectively scales the fundamental subspaces of the pre-trained weight matrix, enabling efficient and targeted adaptation to new domains.
Extensive experiments across various LPMs demonstrate that PiVot achieves superior performance compared to LoRA on tasks such as natural language understanding, text-to-image generation, and image classification while requiring 16 times fewer trainable parameters.
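The initialization described in the abstract above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `pivot_init` and `pivot_weight` are hypothetical, the frozen residual term is an assumption (the abstract does not say how the part of the spectrum beyond rank $r$ is handled), and a random matrix stands in for a pre-trained weight.

```python
import numpy as np

def pivot_init(W, r):
    """Hypothetical helper: split W into a frozen rank-r basis, the top-r
    singular values (the only trainable part), and a frozen residual."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)  # s is sorted descending
    U_r, s_r, Vt_r = U[:, :r], s[:r].copy(), Vt[:r, :]
    # Assumption: whatever the rank-r approximation misses is kept frozen.
    residual = W - (U_r * s_r) @ Vt_r
    return U_r, s_r, Vt_r, residual

def pivot_weight(U_r, s_r, Vt_r, residual):
    """Effective weight: rescaled top-r subspaces plus the frozen residual."""
    return (U_r * s_r) @ Vt_r + residual

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))   # stand-in for a pre-trained weight matrix
r = 8
U_r, s_r, Vt_r, residual = pivot_init(W, r)

# Before any training step the effective weight equals the original weight.
W0 = pivot_weight(U_r, s_r, Vt_r, residual)

# Changing the singular values only rescales the top-r singular directions.
W1 = pivot_weight(U_r, 2.0 * s_r, Vt_r, residual)

# Trainable-parameter counts for this one matrix: r values here,
# versus r * (m + n) for a LoRA adapter of the same rank.
n_trainable = s_r.size
n_lora = r * (W.shape[0] + W.shape[1])
```

In this sketch only `s_r` would receive gradients; `U_r`, `Vt_r`, and `residual` stay fixed, so the effective weight matches the pre-trained one at the start of fine-tuning.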
- Research Article
2
- 10.24193/subbi.2023.1.04
- Jul 20, 2023
- Studia Universitatis Babeș-Bolyai Informatica
The Medical Visual Question Answering problem is a joint Computer Vision and Natural Language Processing task that aims to obtain answers in natural language to a question, also posed in natural language, regarding an image. Both the image and the question are of a medical nature. In this paper we introduce DOMAS, a deep learning model that solves this task on the Med-VQA 2019 dataset. The method is based on dividing the task into smaller classification problems by using a BERT-based question classifier and a unique approach that makes use of dataset information for selecting the most suitable model. For the image classification problems, transfer learning with a pre-trained Swin Transformer-based architecture is used. DOMAS uses a question classifier and seven image classifiers along with the image classifier selection strategy and achieves 0.616 strict accuracy and a 0.654 BLEU score. The results are competitive with other state-of-the-art models, showing that our approach is effective in solving the presented task. 2010 Mathematics Subject Classification. 68T45, 68T50. 1998 CR Categories and Descriptors. I.2.7 [Artificial Intelligence]: Natural Language Processing – Language parsing and understanding; I.2.7 [Artificial Intelligence]: Applications and Expert Systems – Medicine and science. Key words and phrases. Medical Visual Question Answering, Swin Transformer.
- Conference Article
3
- 10.1109/cecit53797.2021.00028
- Dec 1, 2021
Natural Language Understanding (NLU) aims to make sense of language by enabling computers to comprehend text at the semantic level, which is a fundamental but challenging task in natural language processing. Recently, BERT has achieved state-of-the-art performance in NLU, utilizing pretraining and fine-tuning techniques to capture task-specific information without any task-specific structure. However, existing models still cannot capture context-specific information, leading to poor performance when dealing with the pervasive ambiguity of language caused by polysemous words. In this paper, we propose a Parameter-Adaptive Convolution Neural Network (PACNN) to capture context-specific information, which allows it to handle polysemous words better. Unlike the convolution layers in existing models, the parameters of the convolutional filters in PACNN, generated by a deconvolution (i.e., transposed convolution) neural network, adapt to the input sentences, so the filters can select local information using global information and capture the context-specific meaning of polysemous words. We empirically demonstrate the efficiency of the proposed PACNN through a series of experiments on the General Language Understanding Evaluation (GLUE) benchmark, a collection of popular datasets on different tasks, where PACNN significantly outperforms all the baselines. In addition, our model can be appended to BERT to further improve its performance. As BERT and the proposed PACNN capture information from different aspects, the proposed BERT+PACNN achieves the best performance compared with BERT and other baselines. Furthermore, we visualize the task-specific information and context-specific information captured by BERT and PACNN, respectively, using Singular Value Decomposition (SVD) to further demonstrate the efficiency of the two models.
- Research Article
21
- 10.1162/coli_a_00420
- Dec 7, 2021
- Computational Linguistics
Natural Language Processing and Computational Linguistics
- Research Article
123
- 10.1145/3593042
- Jul 17, 2023
- ACM Computing Surveys
In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. As a response, many defense mechanisms have also been proposed to prevent these networks from failing. The significance of defending neural networks against adversarial attacks lies in ensuring that the model’s predictions remain unchanged even if the input data is perturbed. Several methods for adversarial defense in NLP have been proposed, catering to different NLP tasks such as text classification, named entity recognition, and natural language inference. Some of these methods not only defend neural networks against adversarial attacks but also act as a regularization mechanism during training, saving the model from overfitting. This survey aims to review the various methods proposed for adversarial defenses in NLP over the past few years by introducing a novel taxonomy. The survey also highlights the fragility of advanced deep neural networks in NLP and the challenges involved in defending them.
- Research Article
- 10.51583/ijltemas.2025.1408000109
- Sep 12, 2025
- International Journal of Latest Technology in Engineering Management & Applied Science
Abstract: The rapid expansion of the AI revolution has been propelled by a focus on large-scale pretrained models, which have enabled significant advancements across diverse tasks in computer vision, multimodal applications, and natural language processing. This swift progress has simultaneously heightened concerns regarding data privacy and protection, particularly with the introduction of more stringent legislative measures like the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR). To address these challenges, the concept of "unlearning" is crucial. Unlearning refers to the technological process of eliminating specific data or its influence from a trained model, typically when necessitated by data deletion rights or ethical considerations. Unlike simply removing entries from a database, the complex and interconnected nature of learned representations in deep neural networks makes the process of unlearning within AI systems considerably more difficult. This study thoroughly investigates AI unlearning methods and structures for data erasure in trained models, operating within established ethical and legal boundaries. The inquiry begins by discussing the moral and legal justifications for machine unlearning, emphasizing factors such as model functionality, data traceability, and the completeness of the deletion process. Next, I present a classification of existing unlearning techniques, ranging from those less suitable for handling large-scale pretrained models and diverse data types to those better adapted for real-world applications. This classification includes techniques such as retraining, model modification, knowledge distillation, approximation unlearning, and certified removal. Following an assessment of unlearning approaches for large pretrained models and varied data modalities, the discussion expands into a detailed examination of their benefits, drawbacks, computational costs, and trade-offs.
This includes a focus on concepts like 'influence' (data's impact) and 'deletion' (successful removal). I formalize machine unlearning and establish its theoretical foundation. In my experience, unlearning can be effectively implemented in various contexts, particularly with pretrained models, to minimize accuracy loss while ensuring robust privacy assurances. This capability is enabled by specific methodological frameworks and algorithms. My experimental assessment compares various unlearning methods across a range of datasets and tasks, paying particular attention to the 'remembering' metric, model utility preservation, computational cost, and resilience to data reconstruction attacks. Furthermore, the study integrates technical and regulatory domains by connecting legal requirements to quantifiable machine learning goals and by illuminating moral dilemmas that seek to balance privacy with openness and justice. I clearly highlight significant inconsistencies between current legal requirements and the actual technical potential of unlearning, offering theoretical and technological guidance through multidisciplinary approaches. Despite these achievements, I found that scalable and verifiable unlearning in large pretrained models remains a nascent yet crucial field of study. To ensure adherence to privacy regulations and uphold ethical standards in AI applications, this study lays the groundwork for future research into unified standards, rigorous evaluation processes, and practical unlearning technology deployment. The overarching goal is to foster the sustained development of trustworthy AI systems that uphold personal data rights while simultaneously delivering genuine value and goodwill to society.
- Research Article
51
- 10.1093/jamia/ocae074
- Apr 24, 2024
- Journal of the American Medical Informatics Association : JAMIA
Generative large language models (LLMs) are a subset of transformer-based neural network architecture models. LLMs have successfully leveraged a combination of an increased number of parameters, improvements in computational efficiency, and large pre-training datasets to perform a wide spectrum of natural language processing (NLP) tasks. Using a few examples (few-shot) or no examples (zero-shot) for prompt-tuning has enabled LLMs to achieve state-of-the-art performance in a broad range of NLP applications. This article by the American Medical Informatics Association (AMIA) NLP Working Group characterizes the opportunities, challenges, and best practices for our community to leverage and advance the integration of LLMs in downstream NLP applications effectively. This can be accomplished through a variety of approaches, including augmented prompting, instruction prompt tuning, and reinforcement learning from human feedback (RLHF). Our focus is on making LLMs accessible to the broader biomedical informatics community, including clinicians and researchers who may be unfamiliar with NLP. Additionally, NLP practitioners may gain insight from the described best practices. We focus on 3 broad categories of NLP tasks, namely natural language understanding, natural language inferencing, and natural language generation. We review the emerging trends in prompt tuning, instruction fine-tuning, and evaluation metrics used for LLMs while drawing attention to several issues that impact biomedical NLP applications, including falsehoods in generated text (confabulation/hallucinations), toxicity, and dataset contamination leading to overfitting. We also review potential approaches to address some of these current challenges in LLMs, such as chain of thought prompting, and the phenomena of emergent capabilities observed in LLMs that can be leveraged to address complex NLP challenges in biomedical applications.
- Dissertation
3
- 10.32657/10356/168430
- Jan 1, 2023
The capability for machines to transduce, understand, and reason with natural language lives at the heart of Artificial Intelligence not only because natural language is one of the main mediums for information delivery, residing in documents, daily chats, and databases of various languages, but also because it involves many key aspects of intelligence (e.g., logic, understanding, abstraction, etc.). Empowering the machine with more linguistic intelligence may benefit a wide range of real-world applications such as Machine Translation, Natural Language Understanding, Dialogue Systems, etc. At present, there are two popular streams of approaches for building intelligent Natural Language Processing (NLP) systems, i.e., sub-symbolic and neural-symbolic approaches. Sub-symbolic approaches learn implicit representations on the corpus that is unstructured, which is massive in amount but results in poor interpretability and reasoning ability of the learned models; neural-symbolic approaches integrate neural and symbolic architectures to incorporate structured symbolic data (e.g., semantic nets, knowledge graphs, etc.) as an external knowledge source, which makes the learned model more interpretable and logical, but the structured symbolic data is hard to be fully represented and it is comparatively scarce. As a result, both streams of approaches deserve studying, since they have their respective strengths and weaknesses, working complementarily in different tasks/scenarios. Meanwhile, attention-based models, such as Transformers, have achieved huge success in many NLP tasks such as Machine Translation, Language Modeling, Question Answering, etc. However, the attention itself has many issues, such as redundancy, quadratic complexity, weak inductive bias, etc. 
Besides, the previous applications of attention-based models in various NLP tasks are problematic, e.g., omitting the prior attention distribution, large computation complexity, weak long-term reasoning capability, etc. To this end, this thesis explores novel attention architectures for NLP tasks that are currently based mainly on sub-symbolic or neural-symbolic approaches to solve the existing issues and advance the state-of-the-art. In particular, for sub-symbolic-based tasks, we study Machine Translation, Language Modeling, Abstractive Summarization, and Spoken Language Understanding; for neural-symbolic-based tasks, we study Dialogue Commonsense Reasoning. The following lists the main contributions of this thesis: We study the redundancy and over-parameterization issues of Multi-Head Attention (MHA). We find that, in a certain range, higher compactness of attention heads (i.e., the intra-group heads become closer to each other and the inter-group ones become farther) improves the performance of MHA, which forces the MHA to focus on the most representative and distinctive features, providing guidance for future architectural designs. Accordingly, we propose a divide-and-conquer strategy that consists of Group-Constrained Training (GCT) and Voting to Stay (V2S). It mitigates the redundancy and over-parameterization issues of MHA. Our method uses fewer parameters and achieves better performance, outperforming the existing MHA redundancy/parameter reduction methods. We verify our methods on three well-established NLP tasks (i.e., Machine Translation, Language Modeling, and Abstractive Summarization). The superior results on datasets with multiple languages, domains, and data sizes demonstrate the effectiveness of our method. We ease the modality and granularity inconsistency problem when distilling knowledge from the teacher understanding model to the student ones, by refining the attention hidden states based on the attention map distribution. 
We propose to apply the Attention-based Significance Priors (ASP) to improve the semantic knowledge transfer from text to speech. We further propose the Anchor-based Adaptive Span Aggregation algorithm (AASA) that narrows the modal granularity gap of alignments. To the best of our knowledge, we are the first that evaluate multiple different alignment strategies beyond vanilla global and local alignments to study the feasibility of metric-based speech-text distillations. The results on three spoken language understanding benchmarks (i.e., Intent Detection, Slot Filling, and Emotion Recognition) verify our assumptions and claims. We improve the multi-source and long-term Dialogue Commonsense Reasoning (DCR) process, which is a new and difficult problem in NLP, by presenting a hierarchical attention-based decoding block. We propose the first Transformer-based KG walker that attentively reads multiscale inputs for graph decoding. Specifically, Multi-source Decoding Inputs (MDI) and Output-level Length Head (OLH) are presented to strengthen the controllability and multi-hop reasoning ability of the Hierarchical Attention-based Graph Decoder (HAGD). We further propose a two-hierarchy learning framework to train the proposed hierarchical attention-based KG walker, in order to learn both turn-level and global-level KG entities as conversation topics. This is the first attempt to learn models to make natural transitions towards the global topic in KG, where we present a distance embedding to incorporate distance information. Moreover, we propose MetaPath (MP) to concurrently exploit entity and relation information when reasoning, which is proved essential as the backbone method for KG path representation, providing a paradigm for KG reasoning. The results on the DCR dataset OpendialKG show that HiTKG achieves a significant improvement in the performance of turn-level reasoning compared with state-of-the-art baselines. 
Additionally, both automatic and human evaluation prove the effectiveness of the two-hierarchy learning framework for both short-term and long-term DCR.
- Research Article
1
- 10.47059/revistageintec.v11i2.1769
- Jun 5, 2021
- Revista Gestão Inovação e Tecnologias
The factorization of a matrix into lower-rank matrices gives solutions to a wide range of computer vision and image processing tasks. The inherent patches or the atomic patches can completely describe the whole image. The lower-rank matrices are obtained using different tools including Singular Value Decomposition (SVD), which typically arises in nuclear norm minimization problems. The singular values obtained are generally thresholded to realize the nuclear norm minimization. However, soft-thresholding is performed uniformly on all the singular values, which assigns similar importance to all patches whether they are principal/useful or not. Our observation is that the decision on a patch (to be principal/useful or not) can be taken only when the application of this minimization is taken into consideration. Thus, in this paper, we propose a new method for image denoising by assigning variable weights to different singular values with a deep noise effect. Experimental results illustrate that the proposed weighted scheme performs better than the state-of-the-art methods.
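The contrast between uniform and weighted shrinkage of singular values can be sketched as follows. This is a generic illustration, not the weighting scheme of the paper above: the function name `shrink_singular_values`, the threshold `tau`, and the hand-picked weights are all assumptions chosen to make the difference visible on a small diagonal matrix.

```python
import numpy as np

def shrink_singular_values(Y, tau, weights=None):
    """Soft-threshold the singular values of Y.

    With weights=None every singular value is reduced by the same tau
    (uniform soft-thresholding); otherwise singular value i is reduced
    by tau * weights[i]. Negative results are clipped to zero."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    if weights is None:
        weights = np.ones_like(s)
    s_shrunk = np.maximum(s - tau * np.asarray(weights, dtype=float), 0.0)
    return (U * s_shrunk) @ Vt, s_shrunk

# A diagonal matrix makes the singular values easy to read off: 10, 5, 1.
Y = np.diag([10.0, 5.0, 1.0])
tau = 2.0

# Uniform shrinkage treats every component alike.
X_uniform, s_uniform = shrink_singular_values(Y, tau)

# Illustrative weights: leave the dominant component untouched and shrink
# the smaller ones progressively more (an assumption for this sketch only).
X_weighted, s_weighted = shrink_singular_values(Y, tau, weights=[0.0, 0.5, 1.0])
```

Smaller weights on large singular values preserve dominant structure, while larger weights on small singular values suppress components more likely to be noise.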
- Research Article
1
- 10.17697/ibmrd/2014/v3i2/51969
- Sep 1, 2014
- IBMRD s Journal of Management & Research
NLP can help people who speak different languages communicate with one another. In India, for example, people speak many different languages, and a large body of literature exists in local languages that speakers of other Indian languages cannot understand. Information technology can therefore be applied to natural language processing. Natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages; it began as a branch of artificial intelligence. In theory, natural language processing is a very attractive method of human-computer interaction. Natural language understanding is sometimes referred to as an AI-complete problem because it seems to require extensive knowledge about the outside world and the ability to manipulate it. Modern NLP algorithms are grounded in machine learning, especially statistical machine learning. Research into modern statistical NLP algorithms requires an understanding of a number of disparate fields, including linguistics, computer science, and statistics. In this paper we study the role of NLP in conversion between Indian languages, such as Marathi to Hindi and Hindi to Gujarati. Many Indian languages are similar in aspects such as grammar, vocabulary, and alphabets. This paper discusses the available solutions, problems, and challenges in Indian language conversion.
- Conference Article
7
- 10.2991/meita-15.2015.126
- Jan 1, 2015
To address the problem of image noise level estimation, this paper proposes an algorithm for noise estimation based on singular value decomposition and a neural network. The larger (head) singular values of an image are mainly determined by the main structure of the image, while the remaining (tail) singular values are determined by the noise intensity. As the noise level increases, the corresponding tail singular values increase, so singular values should be good features for noise intensity estimation. First, we add noise of known intensity to a batch of noise-free images, and then select from these noisy images a certain number of fixed-size image blocks whose standard deviation is minimal. The singular values of these blocks are then fed as input to the neural network, with the corresponding noise standard deviations as the target output, to train the network. Finally, in the estimation phase, the singular values of a noisy image are fed into the trained network to predict the unknown noise intensity. The experimental results show that the proposed algorithm is quite promising. It can estimate different types of noise quickly and with high precision, including Gaussian white noise and hybrid noise.
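The premise behind the feature choice, that tail singular values grow with the noise level, can be checked with a short NumPy sketch. It covers only this premise, not the block selection or the neural network from the abstract; the helper `tail_mean`, the synthetic rank-1 "image", the block size, and the tail length of 8 are assumptions made for illustration.

```python
import numpy as np

def tail_mean(block, n_tail):
    """Mean of the n_tail smallest singular values of an image block."""
    s = np.linalg.svd(block, compute_uv=False)  # sorted in descending order
    return s[-n_tail:].mean()

rng = np.random.default_rng(0)

# Synthetic noise-free block: a smooth rank-1 pattern, so all of its
# structure lives in the head of the spectrum and the tail is ~0.
x = np.linspace(0.0, 1.0, 32)
clean = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))

# One fixed Gaussian noise realization, scaled to different intensities.
noise = rng.standard_normal(clean.shape)
sigmas = (0.0, 0.05, 0.1, 0.2)
tails = [tail_mean(clean + sigma * noise, n_tail=8) for sigma in sigmas]

for sigma, t in zip(sigmas, tails):
    print(f"sigma={sigma:.2f}  mean tail singular value={t:.4f}")
```

In the approach described above, such singular values, paired with the known noise standard deviations, would form the training set for the regression network.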
- Research Article
16
- 10.1111/exsy.13712
- Aug 25, 2024
- Expert Systems
Conversational assistants (CAs), and task-oriented ones in particular, are designed to interact with users in natural language, assisting them in completing specific tasks or providing relevant information. These systems employ advanced natural language understanding (NLU) and dialogue management techniques to comprehend user inputs, infer their intentions, and generate appropriate responses or actions. Over time, CAs have gradually diversified and today touch various fields such as e-commerce, healthcare, tourism, fashion, travel, and many other sectors. NLU is fundamental in the natural language processing (NLP) field. Identifying user intents from natural language utterances is a sub-task of NLU that is crucial for conversational systems. The diversity in user utterances makes intent detection (ID) an even more challenging problem. Recently, with the emergence of deep neural networks, new state-of-the-art (SOA) results have been achieved for different NLP tasks. Recurrent neural networks (RNNs) and Transformer architectures are two major players in those improvements. RNNs have significantly contributed to sequence modelling across various application areas. Conversely, Transformer models represent a newer architecture leveraging attention mechanisms, extensive training data sets, and computational power. This review paper begins with a detailed exploration of RNN and Transformer models. Subsequently, it conducts a comparative analysis of their performance in intent recognition for task-oriented CAs. Finally, it concludes by addressing the main challenges and outlining future research directions.
- Dissertation
- 10.32657/10356/182513
- Jan 1, 2025
Natural Language Processing (NLP) empowers computers to process and analyze vast amounts of text data. The introduction of pre-trained language models (PLMs) has significantly advanced NLP by incorporating deep learning algorithms, thereby enhancing the handling of natural language understanding (NLU) tasks. However, due to their universal design, PLMs might not perform optimally in specialized tasks if essential features are not included during initial training. As a result, several training paradigms have been developed to enhance the downstream performance of PLMs, as outlined below. Promptless fine-tuning, a common training method, adapts PLMs to specific tasks by modifying model parameters using task-specific training data. This approach has proven effective across different low-resource deep-learning models. Nevertheless, fine-tuning may face challenges such as overfitting or lack of robustness under conditions of training data scarcity. To mitigate this, the prompt-based learning paradigm is introduced, utilizing natural language prompts to enhance model understanding. Within this paradigm, fixed-prompt LM tuning wraps input sentences into a template, allowing the PLM to engage in tasks like masked language prediction or inference understanding. These techniques, which incorporate task descriptions, have shown substantial efficacy in few-shot learning. However, the need for highly comprehensive and generalizable templates presents challenges in design, and the issues of promptless fine-tuning persist, albeit mitigated, in low-data settings. With the rise of large language models (LLMs), traditional fine-tuning has become prohibitive for many users due to computational demands. As a result, tuning-free prompting has emerged as a novel approach for those with limited computational resources, leveraging the inherent language understanding capabilities of LLMs. 
This paradigm, including in-context learning (ICL), relies on few-shot instruction and demonstration to prompt LLM responses. The effectiveness of ICL heavily depends on how sample-label pairs are organized within demonstrations. The strategic selection and ranking of these pairs to maximize understanding with minimal data is a critical area of ongoing research in NLP applications. External knowledge is proven to be beneficial for deep learning algorithms to reduce the reliance on training data and provide additional useful information. How to effectively incorporate useful external knowledge to excite LM's capabilities in low-resource NLP applications remains an open research question. This thesis investigates how incorporating existing knowledge into training paradigms can enhance NLP applications under low-resource conditions such as training data scarcity and limited computational resources. By leveraging external knowledge as prior knowledge, we aim to achieve improved text representations, more nuanced task descriptions, and richer label information instructions, thereby reducing the model's dependency on training data and enhancing its understanding capabilities. We demonstrate the advantages of integrating additional knowledge into deep learning systems and offer frameworks to apply this knowledge across different training paradigms, thereby improving performance on various NLP tasks, particularly under low-resource conditions. Specifically, 1. In the promptless fine-tuning paradigm, we first focus on fine-tuning word embeddings for task-related words, thereby enriching the conceptual knowledge available to compositional neural networks during feature learning in emotion recognition. This approach effectively enhances emotional keyword attention. 
We then extend this method by incorporating domain-specific lexical knowledge to improve the pre-trained word representations within a learning network, enriching the context-based word embeddings with discriminative features, providing more semantic insights, and bolstering performance across various classification tasks. 2. In the fixed-prompt LM tuning paradigm, we introduce a novel task description that incorporates dictionary knowledge to offer extensive semantic insights into labels. Building on this strategy, we devise an approach to augment few-shot classification performance within an entailment-based framework, significantly enhancing the efficiency of using limited training data and even facilitating zero-shot learning. 3. In the tuning-free prompting paradigm, we demonstrate how to incorporate label-related words into demonstrations based on LLM feedback, creating effective sample-and-label-level demonstrations. Additionally, we propose an innovative method that uses multiple-label words in demonstrations instead of traditional class names, offering more detailed and varied label instructions for LM understanding, thereby improving in-context learning (ICL) classification capabilities.
- Research Article
10
- 10.1109/tpami.2023.3236725
- Jul 1, 2023
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Attention-based neural networks, such as Transformers, have become ubiquitous in numerous applications, including computer vision, natural language processing, and time-series analysis. In all kinds of attention networks, the attention maps are crucial as they encode semantic dependencies between input tokens. However, most existing attention networks perform modeling or reasoning based on representations, wherein the attention maps of different layers are learned separately without explicit interactions. In this paper, we propose a novel and generic evolving attention mechanism, which directly models the evolution of inter-token relationships through a chain of residual convolutional modules. The major motivations are twofold. On the one hand, the attention maps in different layers share transferable knowledge, thus adding a residual connection can facilitate the information flow of inter-token relationships across layers. On the other hand, there is naturally an evolutionary trend among attention maps at different abstraction levels, so it is beneficial to exploit a dedicated convolution-based module to capture this process. Equipped with the proposed mechanism, the convolution-enhanced evolving attention networks achieve superior performance in various applications, including time-series representation, natural language understanding, machine translation, and image classification. Especially on time-series representation tasks, Evolving Attention-enhanced Dilated Convolutional (EA-DC-) Transformer outperforms state-of-the-art models significantly, achieving an average of 17% improvement compared to the best SOTA. To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps. Our implementation is available at https://github.com/pkuyym/EvolvingAttention.
- Research Article
388
- 10.1001/jama.2024.21700
- Oct 15, 2024
- JAMA
Importance: Large language models (LLMs) can assist in various health care activities, but current evaluation approaches may not adequately identify the most useful application areas. Objective: To summarize existing evaluations of LLMs in health care in terms of 5 components: (1) evaluation data type, (2) health care task, (3) natural language processing (NLP) and natural language understanding (NLU) tasks, (4) dimension of evaluation, and (5) medical specialty. Data Sources: A systematic search of PubMed and Web of Science was performed for studies published between January 1, 2022, and February 19, 2024. Study Selection: Studies evaluating 1 or more LLMs in health care. Data Extraction and Synthesis: Three independent reviewers categorized studies via keyword searches based on the data used, the health care tasks, the NLP and NLU tasks, the dimensions of evaluation, and the medical specialty. Results: Of 519 studies reviewed, published between January 1, 2022, and February 19, 2024, only 5% used real patient care data for LLM evaluation. The most common health care tasks were assessing medical knowledge such as answering medical licensing examination questions (44.5%) and making diagnoses (19.5%). Administrative tasks such as assigning billing codes (0.2%) and writing prescriptions (0.2%) were less studied. For NLP and NLU tasks, most studies focused on question answering (84.2%), while tasks such as summarization (8.9%) and conversational dialogue (3.3%) were infrequent. Almost all studies (95.4%) used accuracy as the primary dimension of evaluation; fairness, bias, and toxicity (15.8%), deployment considerations (4.6%), and calibration and uncertainty (1.2%) were infrequently measured. Finally, in terms of medical specialty area, most studies were in generic health care applications (25.6%), internal medicine (16.4%), surgery (11.4%), and ophthalmology (6.9%), with nuclear medicine (0.6%), physical medicine (0.4%), and medical genetics (0.2%) being the least represented.
Conclusions and Relevance: Existing evaluations of LLMs mostly focus on accuracy of question answering for medical examinations, without consideration of real patient care data. Dimensions such as fairness, bias, and toxicity and deployment considerations received limited attention. Future evaluations should adopt standardized applications and metrics, use clinical data, and broaden focus to include a wider range of tasks and specialties.
- Research Article
19
- 10.22059/jitm.2019.289271.2402
- Jun 1, 2019
- SHILAP Revista de lepidopterología
Recommender systems are important tools for users to identify their preferred items and for businesses to improve their products and services. In recent years, the use of online services for the selection and reservation of hotels has witnessed booming growth. Customer reviews have replaced word-of-mouth marketing, but searching for hotels based on user priorities is more time-consuming. This study aims to design a recommender system based on the explicit and implicit preferences of customers in order to increase prediction accuracy. In this study, we combine sentiment analysis with deep-learning-based Collaborative Filtering (CF) for user groups in order to increase system accuracy. The proposed system uses Natural Language Processing (NLP) and a supervised classification approach to analyze sentiments and extract implicit features. In designing the recommender system, Singular Value Decomposition (SVD) was used to improve scalability. The results show that our proposed method improves CF performance.