AI-Powered GPT-Based University-Specific Chat Assistant: UniROBO

Abstract

AI-powered chat applications are innovative solutions that facilitate user interaction and information access. These applications improve the user experience by providing personalized and context-sensitive responses, thanks to large language models and natural language processing techniques. This study examined the design and development process of UniRobo, an AI-powered chat application developed for Malatya Turgut Ozal University students and staff. UniRobo provides instant information on topics such as education, food menus, and campus events, and offers personalized responses using large language models and natural language processing techniques. Based on a user needs analysis, the mobile application was built with React Native, the back-end with Python and FastAPI, and a MongoDB database was integrated. Artificial intelligence capabilities were supported by the OpenAI API and fine-tuning, adapting the model to university-specific content. A Retrieval-Augmented Generation (RAG) architecture and Azure AI Search increased user satisfaction by providing more accurate and faster responses. As a result, UniRobo has made university life more accessible by giving users fast and accurate information, and it demonstrates the potential of artificial intelligence-based solutions in the education sector.
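The retrieval-augmented flow described in the abstract can be sketched in a few lines. This is an illustrative stand-in, not UniRobo's code: term-overlap scoring substitutes for the vector search a real deployment would delegate to Azure AI Search, and the assembled prompt would be sent to an LLM API (e.g., OpenAI) rather than returned; the stopword list and sample documents are hypothetical.

```python
import re

# Minimal RAG sketch: retrieve campus documents relevant to a query,
# then assemble a grounded prompt for the language model.

STOPWORDS = {"what", "is", "on", "the", "a", "an", "of", "today"}

def tokenize(text: str) -> set[str]:
    """Lowercase content words with punctuation stripped."""
    return set(re.findall(r"[a-z0-9']+", text.lower())) - STOPWORDS

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most content words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, contexts: list[str]) -> str:
    """Ground the model's answer in the retrieved campus documents."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return (f"Answer using only these campus documents:\n{joined}\n\n"
            f"Question: {query}")

docs = [
    "Today's cafeteria menu: lentil soup, rice, salad.",
    "The library is open 08:00-22:00 on weekdays.",
    "The spring campus festival starts on May 12.",
]
contexts = retrieve("What is on the cafeteria menu today?", docs)
prompt = build_prompt("What is on the cafeteria menu today?", contexts)
```

A production system would replace `retrieve` with an index lookup and pass `prompt` to a fine-tuned chat model; the grounding step is what keeps answers specific to university content.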

Similar Papers
  • Research Article
  • 10.1093/ndt/gfae069.792
#2924 Comparison of large language models and traditional natural language processing techniques in predicting arteriovenous fistula failure
  • May 23, 2024
  • Nephrology Dialysis Transplantation
  • Suman Lama + 6 more

Background and Aims Large language models (LLMs) have gained significant attention in the field of natural language processing (NLP), marking a shift from traditional techniques like Term Frequency-Inverse Document Frequency (TF-IDF). We developed a traditional NLP model to predict arteriovenous fistula (AVF) failure within the next 30 days using clinical notes. The goal of this analysis was to investigate whether LLMs would outperform traditional NLP techniques, specifically in the context of predicting AVF failure within the next 30 days using clinical notes. Method We defined AVF failure as the change in status from active to permanently or temporarily unusable. We used data from a large kidney care network from January 2021 to December 2021. Two models were created, one using an LLM and one using the traditional TF-IDF technique. We used “distilbert-base-uncased”, a distilled version of the BERT base model [1], and compared its performance with traditional TF-IDF-based NLP techniques. The dataset was randomly divided into 60% training, 20% validation and 20% test sets. The test data, comprising unseen patients’ data, was used to evaluate the performance of the model. Both models were evaluated using metrics such as area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity. Results The incidence of 30-day AVF failure was 2.3% in the population. Both the LLM and traditional models showed similar overall performance, as summarized in Table 1. Notably, the LLM showed marginally better performance in certain evaluation metrics. Both models had the same AUROC of 0.64 on test data. The accuracy and balanced accuracy for the LLM were 72.9% and 59.7%, respectively, compared to 70.9% and 59.6% for the traditional TF-IDF approach. In terms of specificity, the LLM scored 73.2%, slightly higher than the 71.2% observed for traditional NLP methods. However, the LLM had a lower sensitivity of 46.1% compared to 48% for traditional NLP.
It is worth noting, however, that training the LLM took considerably longer than TF-IDF and required more computational resources, such as graphics processing unit (GPU) instances in cloud-based services, leading to higher cost. Conclusion In our study, we found that advanced LLMs perform comparably to traditional TF-IDF modeling techniques in predicting AVF failure. Both models demonstrated identical AUROCs. While specificity was higher for the LLM than for traditional NLP, sensitivity was higher for traditional NLP. The LLM was fine-tuned with a limited dataset, which could have influenced its performance to be similar to that of traditional NLP methods. This finding suggests that while LLMs may excel in certain scenarios, such as in-depth sentiment analysis of patient data for complex tasks, their effectiveness is highly dependent on the specific use case. It is crucial to weigh the benefits against the resources required for LLMs, as they can be significantly more resource-intensive and costly than traditional TF-IDF methods. This highlights the importance of a use-case-driven approach in selecting the appropriate NLP technique for healthcare applications.
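The TF-IDF baseline the study compares against can be sketched directly from its formula. This is an illustrative implementation of the standard weighting (term frequency times smoothed log inverse document frequency), not the authors' code; the sample notes are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """TF-IDF weights per document: term frequency times smoothed log IDF."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log((1 + n) / (1 + df[term]))
            for term, count in tf.items()
        })
    return weights

notes = [
    ["the", "fistula", "failed"],
    ["the", "access", "patent"],
    ["the", "graft", "failed"],
]
w = tfidf(notes)
```

Terms appearing in every note (like "the") get zero weight, while discriminative clinical terms are up-weighted; these vectors then feed a conventional classifier, which is the pipeline the LLM was benchmarked against.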

  • Research Article
  • Cited by 17
  • 10.1007/s00607-024-01331-9
Large language models: a new approach for privacy policy analysis at scale
  • Aug 22, 2024
  • Computing
  • David Rodriguez + 3 more

The number and dynamic nature of web sites and mobile applications present regulators and app store operators with significant challenges when it comes to enforcing compliance with applicable privacy and data protection laws. Over the past several years, people have turned to Natural Language Processing (NLP) techniques to automate privacy compliance analysis (e.g., comparing statements in privacy policies with analysis of the code and behavior of mobile apps) and to answer people’s privacy questions. Traditionally, these NLP techniques have relied on labor-intensive and potentially error-prone manual annotation processes to build the corpora necessary to train them. This article explores and evaluates the use of Large Language Models (LLMs) as an alternative for effectively and efficiently identifying and categorizing a variety of data practice disclosures found in the text of privacy policies. Specifically, we report on the performance of ChatGPT and Llama 2, two particularly popular LLM-based tools. This includes engineering prompts and evaluating different configurations of these LLM techniques. Evaluation of the resulting techniques on well-known corpora of privacy policy annotations yields an F1 score exceeding 93%. This score is higher than scores reported earlier in the literature on these benchmarks. This performance is obtained at minimal marginal cost (excluding the cost required to train the foundational models themselves). These results, which are consistent with those reported in other domains, suggest that LLMs offer a particularly promising approach to automated privacy policy analysis at scale.
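The F1 score the article reports can be reproduced at the metric level with a small helper; the privacy-practice labels below are hypothetical examples, not drawn from the paper's corpora.

```python
def f1_score(gold: list[str], pred: list[str], positive: str) -> float:
    """Per-class F1: harmonic mean of precision and recall."""
    tp = sum(g == p == positive for g, p in zip(gold, pred))
    fp = sum(p == positive and g != positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations for four privacy-policy segments.
gold = ["data_sharing", "retention", "data_sharing", "security"]
pred = ["data_sharing", "data_sharing", "data_sharing", "security"]
f1 = f1_score(gold, pred, positive="data_sharing")
```

Evaluating LLM outputs against expert annotations with per-class F1, then averaging across practice categories, is how scores like the reported 93% are typically obtained.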

  • Research Article
  • 10.1111/tgis.70130
A Neuro‐Symbolic GeoAI Framework for Extraction of Travel Routes From Unstructured Texts
  • Oct 28, 2025
  • Transactions in GIS
  • Saydeh Karabatis + 2 more

Unstructured text, such as narratives describing the movement of people, can reveal valuable spatial information that is used to generate the routes people take. However, the lack of precision and the ambiguity of spatial information in these narratives create a significant problem in generating such routes. Existing work uses either traditional natural language processing (NLP) techniques or more recent large language models (LLMs) to extract relevant spatial information. However, traditional NLP techniques do not capture the contextual information in the text, and LLMs are often trained on data with insufficient coverage of developing countries, resulting in incomplete spatial information. This paper proposes a novel neuro‐symbolic GeoAI framework called Narratives as Geographical Routes (NaR) to automatically extract and visualize geospatial routes from unstructured text and resolve spatial data quality issues in these texts. NaR extracts geographical information from narratives, identifies the toponyms, lists them in temporal order, resolves possible ambiguities, assigns their precise coordinates, and finally depicts the spatial routes on a map. This is achieved through the use of (1) retrieval augmented generation (RAG) techniques that leverage the geographical domain knowledge extracted from NLP techniques in conjunction with a gazetteer to improve the results of LLMs for toponym identification and temporal listing, and (2) a neuro‐symbolic framework that uses symbolic reasoning to resolve toponym ambiguity. Experimental evaluation of our framework indicates that NaR outperforms other existing methods.

  • Research Article
  • Cited by 35
  • 10.1088/1361-6552/ad1fa2
The impact of AI in physics education: a comprehensive review from GCSE to university levels
  • Feb 6, 2024
  • Physics Education
  • Will Yeadon + 1 more

With the rapid evolution of artificial intelligence (AI), its potential implications for higher education have become a focal point of interest. This study delves into the capabilities of AI in physics education and offers actionable AI policy recommendations. Using OpenAI's flagship gpt-3.5-turbo large language model (LLM), we assessed its ability to answer 1337 physics exam questions spanning general certificate of secondary education (GCSE), A-Level, and introductory university curricula. We employed various AI prompting techniques: zero-shot, in-context learning, and confirmatory checking, which merges chain-of-thought reasoning with reflection. The proficiency of gpt-3.5-turbo varied across academic levels: it scored an average of 83.4% on GCSE, 63.8% on A-Level, and 37.4% on university-level questions, with an overall average of 59.9% using the most effective prompting technique. In a separate test, the LLM's accuracy on 5000 mathematical operations was found to be 45.2%. When evaluated as a marking tool, the LLM's concordance with human markers averaged 50.8%, with notable inaccuracies in marking straightforward questions, like multiple-choice. Given these results, our recommendations underscore caution: while current LLMs can consistently perform well on physics questions at earlier educational stages, their efficacy diminishes with advanced content and complex calculations. LLM outputs often showcase novel methods not in the syllabus, excessive verbosity, and miscalculations in basic arithmetic. This suggests that at university, there is no substantial threat from LLMs for non-invigilated physics questions. However, given LLMs' considerable proficiency in writing physics essays and coding, non-invigilated examinations of these skills in physics are highly vulnerable to automated completion by LLMs. This vulnerability also extends to physics questions pitched at lower academic levels.
It is thus recommended that educators be transparent about LLM capabilities with their students, while emphasizing caution against overreliance on their output due to its tendency to sound plausible but be incorrect.
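The "confirmatory checking" technique described above (chain-of-thought followed by reflection) reduces to a two-pass prompting pattern. The sketch below stubs out the model call, since the real technique requires an LLM API; the prompts, the stub's canned replies, and the worked physics question are all illustrative assumptions.

```python
# Two-pass confirmatory-checking pattern: draft an answer with
# chain-of-thought, then ask the model to check its own reasoning.

def call_model(prompt: str) -> str:
    # Stub: a real system would send `prompt` to an LLM API here.
    if "Check the reasoning" in prompt:
        return "CONFIRMED"
    return "Step 1: F = ma = 2 kg * 3 m/s^2. Answer: 6 N"

def confirmatory_check(question: str) -> str:
    """Draft with chain-of-thought, then keep the draft only if a
    reflection pass confirms it; otherwise return the revision."""
    draft = call_model(f"Think step by step, then answer.\n{question}")
    verdict = call_model(
        f"Check the reasoning below for errors; reply CONFIRMED or give a fix.\n{draft}"
    )
    return draft if verdict == "CONFIRMED" else verdict

answer = confirmatory_check("A 2 kg mass accelerates at 3 m/s^2. Find the force.")
```

The second pass is what distinguishes confirmatory checking from plain chain-of-thought: arithmetic slips in the draft get a chance to be caught before the answer is reported.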

  • Research Article
  • 10.1007/s44212-025-00083-x
Benchmarking large language models against qualitative coding and natural language processing in decoding public sentiment on urban upzoning
  • Sep 30, 2025
  • Urban Informatics
  • Helena Hang Rong + 2 more

Urban planners routinely engage with extensive textual materials, such as zoning codes, comprehensive plans, and public comments. The large volume of textual data being generated in cities offers new opportunities to use emerging computational technologies to process and analyze it. In this research, we examine how natural language processing (NLP) techniques and large language models (LLMs) compare to human qualitative coding in identifying public sentiment and topics contained in public comments about the Minneapolis 2040 upzoning. We use a custom rubric developed in collaboration with urban planners to assess outputs across these different methods, scoring outputs on factors such as accuracy, convergence, creativity, efficiency, and interpretability. Additionally, we conduct interviews with practicing urban planners to understand their perceptions of integrating these computational techniques into their existing workflows. We find that NLP techniques are helpful in providing urban planners with an aerial view of their data, but require additional human interpretation. In contrast, LLMs markedly improve efficiency, interpretability, and descriptiveness over traditional NLP techniques, but require human validation to address concerns related to social biases and equity. We further find that urban planners are open to using new text processing technologies, but have reservations about entirely outsourcing decision-making to AI tools, viewing AI technologies more as “co-pilots” than autonomous agents. Our findings underscore the importance of integrating human judgment into the use of computational tools to develop a more informed, equitable, and reflective practice in an era of expanding urban data and computational technologies.

  • Research Article
  • Cited by 1
  • 10.48175/ijarsct-22330
Health Diagnostic Assistant using LLMs
  • Nov 16, 2024
  • International Journal of Advanced Research in Science, Communication and Technology
  • Laxmikant Malphedwar + 4 more

The Health Diagnostic Assistant leverages advanced Large Language Models (LLMs) and Natural Language Processing (NLP) techniques to enhance patient diagnosis and healthcare decision-making. This innovative system employs Retrieval-Augmented Generation (RAG) to combine the strengths of pre-trained language models with a dynamic retrieval mechanism, allowing it to access and synthesize real-time medical knowledge from a wide array of databases. By analyzing patient symptoms, medical histories, and contextual data, the assistant generates accurate, context-aware recommendations and insights. The project aims to streamline the diagnostic process, reduce the burden on healthcare professionals, and improve patient outcomes by providing evidence-based suggestions tailored to individual cases. Through continuous learning and integration of user feedback, the Health Diagnostic Assistant aspires to evolve into a reliable tool for both patients and clinicians, fostering informed decision-making in the healthcare landscape.

  • Research Article
  • Cited by 22
  • 10.1016/s2589-7500(23)00179-6
Predicting seizure recurrence after an initial seizure-like episode from routine clinical notes using large language models: a retrospective cohort study.
  • Dec 1, 2023
  • The Lancet Digital Health
  • Brett K Beaulieu-Jones + 9 more

The evaluation and management of first-time seizure-like events in children can be difficult because these episodes are not always directly observed and might be epileptic seizures or other conditions (seizure mimics). We aimed to evaluate whether machine learning models using real-world data could predict seizure recurrence after an initial seizure-like event. This retrospective cohort study compared models trained and evaluated on two separate datasets between Jan 1, 2010, and Jan 1, 2020: electronic medical records (EMRs) at Boston Children's Hospital and de-identified, patient-level, administrative claims data from the IBM MarketScan research database. The study population comprised patients with an initial diagnosis of either epilepsy or convulsions before the age of 21 years, based on International Classification of Diseases, Clinical Modification (ICD-CM) codes. We compared machine learning-based predictive modelling using structured data (logistic regression and XGBoost) with emerging techniques in natural language processing by use of large language models. The primary cohort comprised 14 021 patients at Boston Children's Hospital matching inclusion criteria with an initial seizure-like event and the comparison cohort comprised 15 062 patients within the IBM MarketScan research database. Seizure recurrence based on a composite expert-derived definition occurred in 57% of patients at Boston Children's Hospital and 63% of patients within IBM MarketScan. Large language models with additional domain-specific and location-specific pre-training on patients excluded from the study (F1-score 0·826 [95% CI 0·817-0·835], AUC 0·897 [95% CI 0·875-0·913]) performed best. All large language models, including the base model without additional pre-training (F1-score 0·739 [95% CI 0·738-0·741], AUROC 0·846 [95% CI 0·826-0·861]) outperformed models trained with structured data. 
With structured data only, XGBoost outperformed logistic regression. XGBoost models trained with the Boston Children's Hospital EMR (logistic regression: F1-score 0·650 [95% CI 0·643-0·657], AUC 0·694 [95% CI 0·685-0·705]; XGBoost: F1-score 0·679 [0·676-0·683], AUC 0·725 [0·717-0·734]) performed similarly to models trained on the IBM MarketScan database (logistic regression: F1-score 0·596 [0·590-0·601], AUC 0·670 [0·664-0·675]; XGBoost: F1-score 0·678 [0·668-0·687], AUC 0·710 [0·703-0·714]). Physicians' clinical notes about an initial seizure-like event contain substantial signal for predicting seizure recurrence, and additional domain-specific and location-specific pre-training can significantly improve the performance of clinical large language models, even for specialised cohorts. Funding: UCB; National Institute of Neurological Disorders and Stroke (US National Institutes of Health).
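The AUROC values compared throughout this abstract have a simple rank-based definition: the probability that a randomly chosen positive case is scored above a randomly chosen negative one, with ties counted as half. A minimal implementation, with made-up labels and scores for illustration:

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """AUROC as P(score_pos > score_neg), ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical recurrence predictions: labels (1 = recurrence) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
area = auroc(labels, scores)
```

This pairwise form is equivalent to the area under the ROC curve and makes clear why AUROC is insensitive to score calibration: only the ranking of positives against negatives matters.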

  • Research Article
  • Cited by 10
  • 10.2196/52462
Automated Category and Trend Analysis of Scientific Articles on Ophthalmology Using Large Language Models: Development and Usability Study.
  • Mar 22, 2024
  • JMIR Formative Research
  • Hina Raja + 14 more

In this paper, we present an automated method for article classification, leveraging the power of large language models (LLMs). The aim of this study is to evaluate the applicability of various LLMs based on the textual content of scientific ophthalmology papers. We developed a model based on natural language processing techniques, including advanced LLMs, to process and analyze the textual content of scientific papers. Specifically, we used zero-shot learning LLMs and compared Bidirectional and Auto-Regressive Transformers (BART) and its variants with Bidirectional Encoder Representations from Transformers (BERT) and its variants, such as distilBERT, SciBERT, PubmedBERT, and BioBERT. To evaluate the LLMs, we compiled a data set (retinal diseases [RenD]) of 1000 ocular disease-related articles, which were expertly annotated by a panel of 6 specialists into 19 distinct categories. In addition to the classification of articles, we also performed analysis on the different classified groups to find patterns and trends in the field. The classification results demonstrate the effectiveness of LLMs in categorizing a large number of ophthalmology papers without human intervention. The model achieved a mean accuracy of 0.86 and a mean F1-score of 0.85 on the RenD data set. The proposed framework achieves notable improvements in both accuracy and efficiency. Its application in the domain of ophthalmology showcases its potential for knowledge organization and retrieval. We performed a trend analysis that enables researchers and clinicians to easily categorize and retrieve relevant papers, saving time and effort in literature review and information gathering, as well as identification of emerging scientific trends within different disciplines. Moreover, the extendibility of the model to other scientific fields broadens its impact in facilitating research and trend analysis across diverse disciplines.
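Zero-shot classification assigns each article the label whose description best matches its text, with no task-specific training. The sketch below uses crude word overlap as a stand-in for the entailment scoring that BART/BERT-style zero-shot classifiers perform; the labels and descriptions are hypothetical, not the paper's 19 categories.

```python
import re

def zero_shot_classify(text: str, label_descriptions: dict[str, str]) -> str:
    """Pick the label whose description shares the most words with the text.
    Stand-in for entailment-based zero-shot scoring with BART/BERT models."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    def overlap(desc: str) -> int:
        return len(words & set(re.findall(r"[a-z]+", desc.lower())))
    return max(label_descriptions, key=lambda lab: overlap(label_descriptions[lab]))

labels = {
    "glaucoma": "optic nerve damage and intraocular pressure",
    "retina": "retinal disease, macular degeneration, diabetic retinopathy",
}
category = zero_shot_classify(
    "We study macular degeneration progression in the retina.", labels
)
```

The structure, scoring every candidate label's description against the article text and taking the argmax, is the same; real zero-shot models simply replace the overlap with a learned semantic score.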

  • Conference Article
  • Cited by 2
  • 10.5753/sbes.2024.3588
On the Identification of Self-Admitted Technical Debt with Large Language Models
  • Sep 30, 2024
  • Pedro Lambert + 2 more

Self-Admitted Technical Debt (SATD) refers to a common practice in software engineering involving developers explicitly documenting and acknowledging technical debt within their projects. Identifying SATD in various contexts is a key activity for effective technical debt management and resolution. While previous research has focused on natural language processing techniques and specialized models for SATD identification, this study explores the potential of Large Language Models (LLMs) for this task. We compare the performance of three LLMs - Claude 3 Haiku, GPT 3.5 turbo, and Gemini 1.0 pro - against the generalization of the state-of-the-art model designed for SATD identification. Additionally, we investigate the impact of prompt engineering on the performance of LLMs in this context. Our findings reveal that LLMs achieve competitive results compared to the state-of-the-art model. However, when considering the Matthews Correlation Coefficient (MCC), we observe that the LLM performance is less balanced, tending to score lower than the state-of-the-art model across all four confusion matrix categories. Nevertheless, with a well-designed prompt, we conclude that the models’ bias can be improved, resulting in a higher MCC score.
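The Matthews Correlation Coefficient used above summarizes all four confusion-matrix categories in one balanced score, which is why it exposes imbalances that accuracy hides. Its definition is standard; the example counts are hypothetical.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Because MCC multiplies across both classes, a classifier that scores well on positives but poorly on negatives (or vice versa) is penalized, matching the paper's observation that the LLMs' performance was less balanced than the specialized model's.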

  • Research Article
  • Cited by 3
  • 10.1017/dsj.2025.7
AI-driven FMEA: integration of large language models for faster and more accurate risk analysis
  • Jan 1, 2025
  • Design Science
  • Ibtissam El Hassani + 3 more

Failure mode and effects analysis (FMEA) is a critical but labor-intensive process in product development that aims to identify and mitigate potential failure modes to ensure product quality and reliability. In this paper, a novel framework to improve the FMEA process by integrating generative artificial intelligence (AI), in particular large language models (LLMs), is presented. By using these advanced AI tools, we aim to streamline collaborative work in FMEA, reduce manual effort and improve the accuracy of risk assessments. The proposed framework includes LLMs to support data collection, pre-processing, risk identification, and decision-making in FMEA. This integration enables a more efficient and reliable analysis process and leverages the strengths of human expertise and AI capabilities. To validate the framework, we conducted a case study where we first used GPT-3.5 as a proof of concept, followed by a comparison of the performance of three well-known LLMs: GPT-4, GPT-4o and Gemini. These comparisons show significant improvements in terms of speed, accuracy, and reliability of FMEA results compared to traditional methods. Our results emphasize the transformative potential of LLMs in FMEA processes and contribute to more robust design and quality assurance practices. The paper concludes with recommendations for future research focusing on data security and the development of domain-specific LLM training protocols.
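Whatever parts of FMEA an LLM assists with, the risk-ranking step still reduces to the classic Risk Priority Number, RPN = severity × occurrence × detection, each rated on a 1-10 scale. A minimal sketch with hypothetical failure modes:

```python
def risk_priority(modes: list[dict]) -> list[dict]:
    """Rank failure modes by RPN = severity * occurrence * detection
    (each rated 1-10), highest risk first."""
    for m in modes:
        m["rpn"] = m["severity"] * m["occurrence"] * m["detection"]
    return sorted(modes, key=lambda m: m["rpn"], reverse=True)

# Hypothetical failure modes; in the proposed framework an LLM would
# help elicit these ratings from design documents and expert input.
modes = [
    {"mode": "seal leak", "severity": 8, "occurrence": 3, "detection": 4},
    {"mode": "connector corrosion", "severity": 6, "occurrence": 5, "detection": 7},
]
ranked = risk_priority(modes)
```

Note how a moderate-severity mode can outrank a high-severity one when it occurs more often and is harder to detect; consistent, well-justified ratings are exactly where the paper argues LLM support adds accuracy.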

  • Research Article
  • 10.3991/ijim.v18i23.51419
Future Prospects of Large Language Models: Enabling Natural Language Processing in Educational Robotics
  • Dec 3, 2024
  • International Journal of Interactive Mobile Technologies (iJIM)
  • S Vinoth Kumar + 5 more

Large language models (LLMs) have recently shown considerable promise in educational robotics by offering generic knowledge necessary in situations when prior programming is not possible. In general, mobile education robots cannot perform tasks like navigation or localization unless they have a working knowledge of maps. In this letter, we tackle the issue of making LLMs more applicable in the field of mobile education robots by helping them to understand Space Graph, a text-based map description. This study, which focuses on LLMs, is divided into several sections. It explores basic natural language processing (NLP) techniques and highlights how they can help create smooth education discussions. Examining the development of LLMs inside NLP systems, the paper explores the benefits and implementation issues of important models utilized in the education sector. Applications useful in educational discussions are described in depth, ranging from patient-focused tools like diagnosis and treatment recommendations to systems that support education providers. We provide thorough instructions and real-world examples for quick engineering, making LLM-based educational robotics solutions more accessible to novices. We demonstrate how LLM-guided upgrades can be easily included in education robotics applications using tutorial-level examples and structured prompt creation. This survey provides a thorough review and helpful advice for leveraging language models in automation development, acting as a road map for researchers navigating the rapidly changing field of LLM-driven educational robotics.

  • Research Article
  • Cited by 1
  • 10.1109/jtehm.2025.3571255
Unstructured Electronic Health Records of Dysphagic Patients Analyzed by Large Language Models
  • Jan 1, 2025
  • IEEE Journal of Translational Engineering in Health and Medicine
  • Luisa Neubig + 3 more

Objective: Dysphagia is a common and complex disorder that complicates both diagnosis and treatment. Consequently, the associated electronic health records (EHRs) are often unstructured and complex, posing challenges for systematic data analysis. Methods and procedures: In this study, we employ natural language processing (NLP) techniques and large language models (LLMs) to automatically analyze clinical narratives and extract diagnostic information from a diverse set of EHRs. Our dataset includes medical records from 486 patients, representing a group with diverse dysphagic conditions. We analyze diagnoses provided in unstructured free text that do not follow a standardized structure. We utilize clustering algorithms on the extracted diagnostic features to identify distinct groups of patients who share similar pathophysiological swallowing dysfunctions. Results: We found that basic NLP techniques often provide limited insights due to the high variability of the data. In contrast, LLMs help to bridge the gap in understanding the nuanced medical information about dysphagia and related conditions. Although applying these advanced LLM models is not straightforward, our results demonstrate that leveraging closed-source models can effectively cluster different categories of dysphagia. Conclusion: Our study therefore provides evidence that LLMs are highly promising for future dysphagia research. Clinical impact: Dysphagia is a symptom associated with various diseases, though its underlying relationships remain unclear. This study demonstrates how analyzing large volumes of electronic health records can help clarify the causes of dysphagia and identify contributing factors. By applying natural language processing, we aim to enhance both understanding and treatment, supporting clinical staff in improving individualized care by identifying relevant patient cohorts.
Clinical and Translational Impact Statement: This study uses LLMs to efficiently preprocess unstructured EHRs, improving dysphagia diagnosis and patient clustering. It aligns with Clinical Research, enhancing diagnostic speed and enabling personalized treatment.
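The clustering step, grouping patients by the diagnostic features extracted from their records, can be sketched with set similarity. This is a crude illustrative stand-in for the paper's clustering algorithms; the patient IDs, feature terms, and threshold are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Similarity of two feature sets: overlap over union."""
    return len(a & b) / len(a | b)

def cluster_patients(features: dict[str, set[str]],
                     threshold: float = 0.5) -> list[set[str]]:
    """Greedy single-pass grouping: a patient joins the first cluster whose
    pooled features are similar enough, otherwise starts a new cluster."""
    clusters: list[tuple[set[str], set[str]]] = []  # (patient ids, pooled features)
    for pid, feats in features.items():
        for members, pooled in clusters:
            if jaccard(feats, pooled) >= threshold:
                members.add(pid)
                pooled |= feats
                break
        else:
            clusters.append(({pid}, set(feats)))
    return [members for members, _ in clusters]

groups = cluster_patients({
    "p1": {"oropharyngeal", "aspiration"},
    "p2": {"oropharyngeal", "aspiration", "penetration"},
    "p3": {"esophageal", "stricture"},
})
```

The point the paper makes is upstream of this step: the feature sets fed into clustering are only as good as the extraction, which is where LLMs outperformed basic NLP on highly variable free text.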

  • Book Chapter
  • Cited by 2
  • 10.4018/979-8-3693-2165-2.ch002
AI Voice Assistant for Smartphones With NLP Techniques
  • Apr 19, 2024
  • Fungai Jacqueline Kiwa + 2 more

The AI voice assistant mobile application was developed to aid drivers in operating their mobile phones while driving without touching them. The literature review examines multiple innovative artificial intelligence technologies used in applications with voice assistants and natural language processing (NLP) techniques. The methodology involved a qualitative approach, and the design science paradigm was used for the development of the voice assistant for smartphones with NLP techniques. NLP techniques applied in the development of the AI voice assistant include smart synthesis, data flow sequencing, core and interface accessing, part-of-speech tagging, named entity recognition, coreference resolution, and Porter stemming. Operations supported by the application include arithmetic calculations based on voice commands with the result returned via voice, searching the internet based on user voice input, and providing responses via voice assistance.

  • Research Article
  • Cited by 1
  • 10.11591/ijai.v13.i3.pp3489-3497
Automatic detection of safety requests in web and mobile applications using natural language processing techniques
  • Sep 1, 2024
  • IAES International Journal of Artificial Intelligence (IJ-AI)
  • Salim Salmi + 1 more

Web and mobile applications have become an essential part of our daily lives. However, as the usage of these applications increases, so does the potential for safety concerns. It is crucial for application developers to ensure that their applications are safe and secure for users. One way to achieve this is through the identification and processing of safety requests made by users. This research paper proposes a method for identifying safety requests made by users in web and mobile applications using natural language processing (NLP) and deep learning techniques. The approach involves training machine learning and deep learning models on a dataset of user requests to identify and classify safety requests. The models are then integrated into the application's code to automatically detect and respond to safety requests. A case study on a ride-sharing application showed that the proposed approach achieved high accuracy in identifying safety requests, with an F1 score of 0.85. The proposed method can be applied to various web and mobile applications to improve safety and security, and reduce the workload of manual safety request processing.
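The detection step described above can be illustrated with a keyword baseline. The paper trains machine learning and deep learning classifiers; the rule-based function below is only a hypothetical stand-in showing where such a classifier plugs into the request-handling path, and the term list is invented.

```python
# Keyword baseline standing in for the trained safety-request classifier.
SAFETY_TERMS = {"unsafe", "danger", "emergency", "harass", "accident", "threat"}

def is_safety_request(message: str) -> bool:
    """Flag a user message as a safety request if it contains a safety term.
    A trained NLP/deep-learning model would replace this rule in practice."""
    words = set(
        message.lower().replace(".", " ").replace(",", " ").split()
    )
    return bool(words & SAFETY_TERMS)
```

In the integrated system, messages flagged here would be routed to an automated safety response instead of the normal request queue, which is how the approach reduces manual triage workload.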

  • Research Article
  • Cited by 1
  • 10.1145/3712300
PALLM: Evaluating and Enhancing Palliative Care Conversations with Large Language Models
  • Jan 23, 2025
  • ACM Transactions on Computing for Healthcare
  • Zhiyuan Wang + 4 more

Effective patient-provider communication is crucial in clinical care, directly impacting patient outcomes and quality of life. Traditional evaluation methods, such as human ratings, patient feedback, and provider self-assessments, are often limited by high costs and scalability issues. Although existing natural language processing (NLP) techniques show promise, they struggle with the nuances of clinical communication and require sensitive clinical data for training, reducing their effectiveness in real-world applications. Emerging large language models (LLMs) offer a new approach to assessing complex communication metrics, with the potential to advance the field through integration into passive sensing and just-in-time intervention systems. This study explores LLMs as evaluators of palliative care communication quality, leveraging their linguistic, in-context learning, and reasoning capabilities. Specifically, using simulated scripts crafted and labeled by healthcare professionals, we test proprietary models (e.g., GPT-4) and fine-tune open-source LLMs (e.g., LLaMA2) with a synthetic dataset generated by GPT-4 to evaluate clinical conversations and identify key metrics such as ‘understanding’ and ‘empathy’. Our findings show LLMs’ superior performance in evaluating clinical communication and providing actionable feedback with reasoning, and demonstrate the feasibility and practical viability of developing in-house LLMs. This research highlights LLMs’ potential to enhance patient-provider interactions and lays the groundwork for downstream steps in developing LLM-empowered clinical health systems.
