Language Processing Tasks Research Articles

With the rapid development of online services and web applications, recommender systems (RS) have become increasingly indispensable for mitigating information overload and matching users’ information needs by providing personalized suggestions over items. Although the RS research community has made remarkable progress over the past decades, conventional recommendation models (CRM) still have some limitations, e.g., a lack of open-domain world knowledge and difficulty in comprehending users’ underlying preferences and motivations. Meanwhile, large language models (LLM) have shown impressive general intelligence and human-like capabilities on various natural language processing (NLP) tasks, which mainly stem from their extensive open-world knowledge, logical and commonsense reasoning abilities, and comprehension of human culture and society. Consequently, the emergence of LLM is inspiring the design of recommender systems and pointing to a promising research direction, i.e., whether we can incorporate LLM and benefit from their common knowledge and capabilities to compensate for the limitations of CRM. In this paper, we conduct a comprehensive survey of this research direction and draw a bird’s-eye view from the perspective of the whole pipeline in real-world recommender systems. Specifically, we summarize existing research from two orthogonal aspects: where and how to adapt LLM to RS. For the “WHERE” question, we discuss the roles that LLM could play in different stages of the recommendation pipeline, i.e., feature engineering, feature encoder, scoring/ranking function, user interaction, and pipeline controller. For the “HOW” question, we investigate the training and inference strategies, resulting in two fine-grained taxonomy criteria, i.e., whether to tune LLM during training, and whether to involve conventional recommendation models for inference. Detailed analysis and general development paths are provided for both the “WHERE” and “HOW” questions. Then, we highlight the key challenges in adapting LLM to RS from three aspects, i.e., efficiency, effectiveness, and ethics. Finally, we summarize the survey and discuss future prospects. To further facilitate the research community of LLM-enhanced recommender systems, we actively maintain a GitHub repository for papers and other related resources in this rising direction.

Large language models (LLMs) have achieved great progress in natural language processing tasks and demonstrated the potential for use in clinical applications. Despite their capabilities, LLMs in the medical domain are prone to generating hallucinations (not fully reliable responses). Hallucinations in LLMs' responses create substantial risks, potentially threatening patients' physical safety. Thus, to perceive and prevent this safety risk, it is essential to build a systematic evaluation of LLMs in the medical domain. We developed a comprehensive evaluation system, MedGPTEval, composed of criteria, medical data sets in Chinese, and publicly available benchmarks. First, a set of evaluation criteria was designed based on a comprehensive literature review. Second, existing candidate criteria were optimized by using a Delphi method with 5 experts in medicine and engineering. Third, 3 clinical experts designed medical data sets to interact with LLMs. Finally, benchmarking experiments were conducted on the data sets. The responses generated by LLM-based chatbots were recorded for blind evaluations by 5 licensed medical experts. The resulting evaluation criteria covered medical professional capabilities, social comprehensive capabilities, contextual capabilities, and computational robustness, with 16 detailed indicators. The medical data sets include 27 medical dialogues and 7 case reports in Chinese. Three chatbots were evaluated: ChatGPT by OpenAI; ERNIE Bot by Baidu, Inc; and Doctor PuJiang (Dr PJ) by Shanghai Artificial Intelligence Laboratory. Dr PJ outperformed ChatGPT and ERNIE Bot in the multiple-turn medical dialogue and case report scenarios. Dr PJ also outperformed ChatGPT in the semantic consistency rate and complete error rate categories, indicating better robustness. However, Dr PJ had slightly lower scores in medical professional capabilities compared with ChatGPT in the multiple-turn dialogue scenario.
MedGPTEval provides comprehensive criteria for evaluating LLM-based chatbots in the medical domain, open-source data sets, and benchmarks assessing 3 LLMs. Experimental results demonstrate that Dr PJ outperforms ChatGPT and ERNIE Bot in social and professional contexts. Therefore, such an assessment system can be easily adopted by researchers in this community to augment an open-source data set.

Articles published on Language Processing Tasks

MicroBERT: Distilling MoE-Based Knowledge from BERT into a Lighter Model

Simple Data Transformations for Mitigating the Syntactic Similarity to Improve Sentence Embeddings at Supervised Contrastive Learning

Improving automatic cyberbullying detection in social network environments by fine-tuning a pre-trained sentence transformer language model

TransLSTM: A hybrid LSTM-Transformer model for fine-grained suggestion mining

How Can Recommender Systems Benefit from Large Language Models: A Survey

From large language models to small logic programs: building global explanations from disagreeing local post-hoc explainers

You only compress once: Towards effective and elastic BERT compression via exploit–explore stochastic nature gradient

Transformers meets neoantigen detection: a systematic literature review.

Is Boundary Annotation Necessary? Evaluating Boundary-Free Approaches to Improve Clinical Named Entity Annotation Efficiency: Case Study.

Incorporating target-aware knowledge into prompt-tuning for few-shot stance detection

Cloud-based machine learning algorithms for anomalies detection

Machine Transliteration of Handwritten MODI Script to Devanagari using Deep Neural Networks

Let's Speak Trajectories: A Vision to Use NLP Models for Trajectory Analysis Tasks

SPECE: Subject Position Encoder in Complex Embedding for Relation Extraction

Author Profiling Approach: Predicting Personality Traits on Twitter Data using Combined BERT and SimCSE Embeddings

Advancements in Deep Learning Architectures for Natural Language Processing Tasks

A BERT-GRU Model for Measuring the Similarity of Arabic Text

Fine-tuning BERT, DistilBERT, XLM-RoBERTa and Ukr-RoBERTa models for sentiment analysis of Ukrainian language reviews

Data Set and Benchmark (MedGPTEval) to Evaluate Responses From Large Language Models in Medicine: Evaluation Development and Validation.

Domain-Specific Few-Shot Table Prompt Question Answering via Contrastive Exemplar Selection

