Pre-trained Model Research Articles

Pre-trained Large Language Models (LLMs) have revolutionised Natural Language Processing (NLP) tasks, but often struggle when applied to specialised domains such as healthcare. The traditional approach of pre-training on large datasets followed by task-specific fine-tuning is resource-intensive and poorly aligned with the constraints of many healthcare settings. This presents a significant challenge for deploying LLM-based NLP solutions in medical contexts, where data privacy, computational resources, and domain-specific language pose unique obstacles.

This study aims to develop and evaluate efficient methods for adapting smaller LLMs to healthcare-specific datasets and tasks. We seek to identify pre-training approaches that can effectively instil healthcare competency in compact LLMs under tight computational budgets, a crucial capability for responsible and sustainable deployment in local healthcare settings.

We explore three specialised pre-training methods to adapt smaller LLMs to different healthcare datasets: traditional Masked Language Modelling (MLM), Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR), and a novel approach utilising metadata categories from healthcare settings. These methods are assessed across multiple healthcare datasets, with a focus on downstream document classification tasks. We evaluate the performance of the resulting LLMs through classification accuracy and analysis of the derived embedding spaces.

Contrastively trained models consistently outperform other approaches on classification tasks, delivering strong performance with limited labelled data and fewer model parameter updates. While our novel metadata-based pre-training does not further improve classifications across datasets, it yields interesting embedding cluster separability. Importantly, all domain-adapted LLMs outperform their publicly available, general-purpose base models, validating the importance of domain specialisation.

This research demonstrates the efficacy of specialised pre-training methods in adapting compact LLMs to healthcare tasks, even under resource constraints. We provide guidelines for pre-training specialised healthcare LLMs and motivate continued inquiry into contrastive objectives. Our findings underscore the potential of these approaches for aligning small LLMs with privacy-sensitive medical tasks, offering a path toward more efficient and responsible NLP deployment in healthcare settings. This work contributes to the broader goal of making advanced NLP techniques accessible and effective in specialised domains, particularly where resource limitations and data sensitivity are significant concerns.
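To make the contrastive objective mentioned in the abstract more concrete, the following is a minimal sketch of an InfoNCE-style loss, the general family that contrastive text-embedding objectives belong to. The abstract does not state the exact loss or hyperparameters, so the function names, the temperature of 0.05, and the toy vectors are assumptions for illustration only, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.05):
    """Average InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are embeddings of two text spans assumed
    to come from the same document; every positives[j] with j != i acts
    as a negative for anchors[i]. The temperature value is an assumption.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / n


# Toy 2-D embeddings: matched pairs should give a much lower loss
# than pairs whose positives have been swapped.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

Minimising a loss of this kind pulls embeddings of related spans together and pushes unrelated ones apart, which is one plausible reason contrastively trained encoders give usable document embeddings with little labelled data.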


Objectives: Subarachnoid Hemorrhage (SAH) is a serious neurological emergency with a high mortality rate. Automatic SAH detection is needed to expedite and improve identification, aiding timely and efficient treatment pathways. Noisy and dissimilar anatomical structures in Non-Contrast Computed Tomography (NCCT) images, the limited availability of labeled SAH data, and ineffective training lead to irrelevant features, overfitting, and vanishing gradients, which make SAH detection a challenging task. Methods: In this work, Sand Cat Swarm Optimization based on a water waves dynamic factor and a wandering strategy (DWSCSO) is proposed for feature selection, while a Parametric Rectified Linear Unit with a Stacked Convolutional Neural Network (PRSCNN) is developed for classifying grades of SAH. The DWSCSO and PRSCNN surpass current practices in SAH detection by improving feature selection and classification accuracy. DWSCSO aims to ensure optimum feature selection, avoiding local optima through its higher exploration capacity and reducing overfitting in classification. First, a modified region-growing method was applied to the patient NCCT images to segment the regions affected by SAH. From the segmented regions, a wide range of patterns and irregularities, fine-grained textures and details, and complex and abstract features were extracted using pre-trained models such as GoogLeNet, Visual Geometry Group (VGG)-16, and ResNet50. Next, the PRSCNN was used to classify grades of SAH, which helped to avoid the vanishing gradient issue. Results: The DWSCSO-PRSCNN obtained a maximum accuracy of 99.48%, which is significant compared with other models. The DWSCSO-PRSCNN provides an improved accuracy of 99.62% on the CT dataset compared with DL-ICH, GoogLeNet + (GLCM and LBP), ResNet-50 + (GLCM and LBP), and AlexNet + (GLCM and LBP), which confirms that DWSCSO-PRSCNN effectively reduces false positives and false negatives. Conclusions: The complexity of DWSCSO-PRSCNN was acceptable in this research: although simpler approaches appeared preferable, they failed to address problems such as overfitting and vanishing gradients. Accordingly, DWSCSO for optimized feature selection and PRSCNN for robust classification were essential for handling these challenges and enhancing detection in different clinical settings.
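The link between the Parametric Rectified Linear Unit and vanishing gradients can be sketched with the standard PReLU definition: for negative inputs the unit passes a scaled value, so its gradient is the slope rather than zero. The slope of 0.25 below is an assumed starting value for illustration (in PReLU the slope is a learned parameter), and the function names are hypothetical. The abstract does not describe the PRSCNN layers in this detail.

```python
def prelu(x, slope=0.25):
    """PReLU activation: identity for positive inputs, slope * x otherwise."""
    return x if x > 0 else slope * x


def prelu_grad(x, slope=0.25):
    """Derivative of PReLU with respect to its input."""
    return 1.0 if x > 0 else slope


def relu_grad(x):
    """Derivative of plain ReLU, shown for comparison."""
    return 1.0 if x > 0 else 0.0


# For a negative pre-activation, ReLU blocks the gradient entirely,
# while PReLU still lets a scaled gradient flow backwards.
negative_input = -2.0
relu_signal = relu_grad(negative_input)
prelu_signal = prelu_grad(negative_input)
```

Because the gradient for negative inputs is non-zero, units that would be silent under ReLU can still receive updates, which is the usual argument for PReLU in deeper stacked networks.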



Articles published on Pre-trained Model

Automatic TNM staging of colorectal cancer radiology reports using pre-trained language models

Ab-amy 2.0: Predicting light chain amyloidogenic risk of therapeutic antibodies based on antibody language model.

PreAlgPro: Prediction of allergenic proteins with pre-trained protein language model and efficient neutral network

CAST: An innovative framework for Cross-dimensional Attention Structure in Transformers

Adaptive class token knowledge distillation for efficient vision transformer

AfroPALM - Afrocentric Palm oil Adulteration Learning Models: an End-to-end Deep Learning Approach for Detection of Palm Oil Adulteration

Advancements in Data Augmentation and Transfer Learning: A Comprehensive Survey to Address Data Scarcity Challenges

Enhancing Prediction Stability and Performance in LIBS Analysis Using Custom CNN Architectures

Multi-modal emotion recognition in speech videos based on pre-trained models

Automated anonymization of radiology reports: comparison of publicly available natural language processing and large language models.

Clinical information extraction for lower-resource languages and domains with few-shot learning using pretrained language models and prompting

A Comprehensive Survey on Brain Tumor Detection and Classification Techniques Using Machine Learning and Deep Learning Models

Developing healthcare language model embedding spaces

Cost-Effective Dust Detection on Solar PV Panels through Deep Learning: A Step towards Automated Maintenance Systems

Construction of a combined prognostic model for pancreatic ductal adenocarcinoma based on deep learning and digital pathology images

Speech Emotion Recognition Using Transfer Learning: Integration of Advanced Speaker Embeddings and Image Recognition Models

Detection of Subarachnoid Hemorrhage Using CNN with Dynamic Factor and Wandering Strategy-Based Feature Selection.

Automated Recognition of Submerged Body-like Objects in Sonar Images Using Convolutional Neural Networks

ULMFiT: Universal Language Model Fine-Tuning for Text Classification

Pop-cosmos: Scaleable Inference of Galaxy Properties and Redshifts with a Data-driven Population Model

