BERT-based Model Research Articles

Procedural coding presents a taxing challenge for clinicians. However, recent advances in natural language processing offer a promising avenue for developing applications that assist clinicians, thereby alleviating their administrative burdens. This study seeks to create an application capable of predicting procedure codes by analysing clinicians' operative notes, aiming to streamline their workflow and enhance efficiency. We downstreamed an existing and a native German medical BERT model in a secondary use scenario, utilizing already coded surgery notes to model the coding procedure as a multi-label classification task. In comparison to the transformer-based architecture, we were levering the non-contextual model fastText, a convolutional neural network, a support vector machine and logistic regression for a comparative analysis of possible coding performance. About 350,000 notes were used for model adaption. By considering the top five suggested procedure codes from medBERT.de, surgeryBERT.at, fastText, a convolutional neural network, a support vector machine and a logistic regression, the mean average precision achieved was 0.880, 0.867, 0.870, 0.851, 0.870and0.805 respectively. Support vector machines performed better for surgery reports with a sequence length greater than 512, achieving a mean average precision of 0.872 in comparison to 0.840 for fastText, 0.837 for medBERT.de and 0.820 for surgeryBERT.at. A prototypical front-end application for coding support was additionally implemented. The problem of predicting procedure codes from a given operative report can be successfully modelled as a multi-label classification task, with a promising performance. Support vector machines as a classical machine learning method outperformed the non-contextual fastText approach. FastText with less demanding hardware resources has reached a similar performance to BERT-based models and has shown to be more suitable for explaining the predictions efficiently.

Read full abstract

Due to the scarcity of available annotations in the biomedical domain, clinical natural language processing poses a substantial challenge, especially when applied to low-resource languages. This paper presents our contributions for the detection and normalization of clinical entities corresponding to symptoms, signs, and findings present in multilingual clinical texts. For this purpose, the three subtasks proposed in the SympTEMIST shared task of the Biocreative VIII conference have been addressed. For Subtask 1-named entity recognition in a Spanish corpus-an approach focused on BERT-based model assemblies pretrained on a proprietary oncology corpus was followed. Subtasks 2 and 3 of SympTEMIST address named entity linking (NEL) in Spanish and multilingual corpora, respectively. Our approach to these subtasks followed a classification strategy that starts from a bi-encoder trained by contrastive learning, for which several SapBERT-like models are explored. To apply this NEL approach to different languages, we have trained these models by leveraging the knowledge base of domain-specific medical concepts in Spanish supplied by the organizers, which we have translated into the other languages of interest by using machine translation tools. The results obtained in the three subtasks establish a new state of the art. Thus, for Subtask 1 we obtain precision results of 0.804, F1-score of 0.748, and recall of 0.699. For Subtask 2, we obtain performance gains of up to 5.5% in top-1 accuracy when the trained bi-encoder is followed by a WNT-softmax classification layer that is initialized with the mean of the embeddings of a subset of SNOMED-CT terms. For Subtask 3, the differences are even more pronounced, and our multilingual bi-encoder outperforms the other models analyzed in all languages except Swedish when combined with a WNT-softmax classification layer. Thus, the improvements in top-1 accuracy over the best bi-encoder model alone are 13% for Portuguese and 13.26% for Swedish. Database URL: https://doi.org/10.1093/database/baae087.

Read full abstract

BERT-based Model Research Articles

Related Topics

Articles published on BERT-based Model

Deep active learning for multi label text classification.

Multi-label text classification via secondary use of large clinical real-world data sets.

StructmRNA a BERT based model with dual level and conditional masking for mRNA representation

Comparing SMILES and SELFIES tokenization for enhanced chemical language modeling

Generalizable and automated classification of TNM stage from pathology reports with external validation

Treatment Recommendation using BERT Personalization

Improving Text Classification in Agricultural Expert Systems with a Bidirectional Encoder Recurrent Convolutional Neural Network

Extracting Sentence Embeddings from Pretrained Transformer Models

Honesty is the Best Policy: On the Accuracy of Apple Privacy Labels Compared to Apps' Privacy Policies

Automatic detection of problem-gambling signs from online texts using large language models.

BERT-based models for classifying multi-dialect Arabic texts

Integrating deep learning architectures for enhanced biomedical relation extraction: a pipeline approach.

Evaluating Named Entity Recognition: A comparative analysis of mono- and multilingual transformer models on a novel Brazilian corporate earnings call transcripts dataset

Recognition and normalization of multilingual symptom entities using in-domain-adapted BERT models and classification layers.

Refining CVE-to-CWE mapping with enhanced attention in BERT-based models

Evaluation of BERT-Based Models on Patient Data from French Social Media.

Beyond Tokens: Fair Evaluation of French Large Language Models for Clinical Named Entity Recognition.

Introducing MagBERT: A language model for magnesium textual data mining and analysis

Automated Identification of Fall-related Injuries in Unstructured Clinical Notes.

Applying natural language processing to patient messages to identify depression concerns in cancer patients.

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

BERT-based Model Research Articles

Related Topics

Articles published on BERT-based Model

Deep active learning for multi label text classification.

Multi-label text classification via secondary use of large clinical real-world data sets.

StructmRNA a BERT based model with dual level and conditional masking for mRNA representation

Comparing SMILES and SELFIES tokenization for enhanced chemical language modeling

Generalizable and automated classification of TNM stage from pathology reports with external validation

Treatment Recommendation using BERT Personalization

Improving Text Classification in Agricultural Expert Systems with a Bidirectional Encoder Recurrent Convolutional Neural Network

Extracting Sentence Embeddings from Pretrained Transformer Models

Honesty is the Best Policy: On the Accuracy of Apple Privacy Labels Compared to Apps' Privacy Policies

Automatic detection of problem-gambling signs from online texts using large language models.

BERT-based models for classifying multi-dialect Arabic texts

Integrating deep learning architectures for enhanced biomedical relation extraction: a pipeline approach.

Evaluating Named Entity Recognition: A comparative analysis of mono- and multilingual transformer models on a novel Brazilian corporate earnings call transcripts dataset

Recognition and normalization of multilingual symptom entities using in-domain-adapted BERT models and classification layers.

Refining CVE-to-CWE mapping with enhanced attention in BERT-based models

Evaluation of BERT-Based Models on Patient Data from French Social Media.

Beyond Tokens: Fair Evaluation of French Large Language Models for Clinical Named Entity Recognition.

Introducing MagBERT: A language model for magnesium textual data mining and analysis

Automated Identification of Fall-related Injuries in Unstructured Clinical Notes.

Applying natural language processing to patient messages to identify depression concerns in cancer patients.