Abstract
A pre-trained language model (PLM) is a natural language processing (NLP) model that has been pre-trained on large amounts of text data. PLMs often fail to understand domain-specific terminology because such terms are underrepresented in their general-purpose training data. This has motivated growing interest in domain-specific language models adapted from BERT- or GPT-style pre-training. In this study, we analyze BERT's pre-training method and its BERT-derived variants (ALBERT, RoBERTa, ELECTRA) and propose PLMs for the biomedical, financial, and legal domains. The biomedical pre-trained model is designed to learn domain-specific language characteristics such as technical terminology, medical sentence structure, and biomedical named-entity recognition. It follows BERT's pre-training method and architecture and is adapted to biomedical tasks through transfer learning. To this end, it is pre-trained on biomedical text corpora, and this pre-training transfers domain knowledge to the model by learning representations of biomedical text. The finance-specific pre-trained model understands and processes financial terminology, financial market trends, and the sentence structures and vocabulary associated with financial products and services. It can be used to generate news articles about financial market trends and to extract key information by concisely summarizing long texts such as financial reports and corporate press releases. In addition, finance-specific pre-trained models help financial analysts generate investment recommendations based on a company's financial condition, performance, and prospects. The legal-specific pre-trained model is a language model suited to legal documents and is used for legal document classification, summarization, and similarity evaluation. It was created by pre-training the BERT model on specialized legal texts, through which it learns characteristics specific to legal documents. Its performance on legal tasks can be improved through both pre-training from scratch and additional (continued) pre-training on legal corpora.
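The domain adaptation described above (continued pre-training of BERT on a domain corpus) can be illustrated with a minimal sketch using the Hugging Face `transformers` and `datasets` libraries and BERT's masked language modeling objective; the corpus file `domain_corpus.txt` is a hypothetical stand-in for a biomedical, financial, or legal text collection, and the hyperparameters are illustrative, not the paper's settings.

```python
# Minimal sketch of domain-adaptive (continued) pre-training via BERT's
# masked language modeling objective. Assumes the Hugging Face
# `transformers` and `datasets` packages; `domain_corpus.txt` is a
# hypothetical plain-text domain corpus (one document or sentence per line).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from general-domain BERT and continue pre-training on domain text.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

# Randomly mask 15% of tokens, as in BERT's original masked-LM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)

args = TrainingArguments(output_dir="domain-bert",
                         num_train_epochs=1,
                         per_device_train_batch_size=8)

Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
```

After continued pre-training, the resulting checkpoint would typically be fine-tuned on a downstream task (e.g., named-entity recognition or document classification) in the usual transfer-learning fashion.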