Label Ambiguity Research Articles

Objective: Training a neural network-based biomedical named entity recognition (BioNER) model usually requires extensive and costly human annotations. While several studies have employed multi-task learning with multiple BioNER datasets to reduce human effort, this approach does not consistently yield performance improvements and may introduce label ambiguity in different biomedical corpora. We aim to tackle those challenges through transfer learning from easily accessible resources with fewer concept overlaps with biomedical datasets.

Methods: We proposed GERBERA, a simple-yet-effective method that utilized general-domain NER datasets for training. We performed multi-task learning to train a pre-trained biomedical language model with both the target BioNER dataset and the general-domain dataset. Subsequently, we fine-tuned the models specifically for the BioNER dataset.

Results: We systematically evaluated GERBERA on five datasets of eight entity types, collectively consisting of 81,410 instances. Despite using fewer biomedical resources, our models demonstrated superior performance compared to baseline models trained with additional BioNER datasets. Specifically, our models consistently outperformed the baseline models in six out of eight entity types, achieving an average improvement of 0.9% over the best baseline performance across eight entities. Our method was especially effective in amplifying performance on BioNER datasets characterized by limited data, with a 4.7% improvement in F1 scores on the JNLPBA-RNA dataset.

Conclusion: This study introduces a new training method that leverages cost-effective general-domain NER datasets to augment BioNER models. This approach significantly improves BioNER model performance, making it a valuable asset for scenarios with scarce or costly biomedical datasets. We make data, codes, and models publicly available via https://github.com/qingyu-qc/bioner_gerbera.
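
The two training stages described in the abstract above (multi-task learning on the target BioNER dataset plus a general-domain dataset, then fine-tuning on the BioNER dataset alone) can be sketched as a batch schedule. This is only an illustration, not the authors' implementation: the function name `two_stage_schedule`, the 1:1 alternation between target and general-domain batches, and the `finetune_epochs` parameter are all assumptions, since the abstract does not say how batches are mixed.

```python
from itertools import cycle
from typing import List, Sequence, Tuple

Step = Tuple[str, str, str]  # (stage, task, batch identifier)


def two_stage_schedule(
    target_batches: Sequence[str],
    general_batches: Sequence[str],
    finetune_epochs: int = 1,
) -> List[Step]:
    """Build an ordered list of training steps for a two-stage recipe.

    Stage 1 ("multitask") alternates target BioNER batches with
    general-domain NER batches. The 1:1 alternation is an assumption
    made for this sketch. Stage 2 ("finetune") uses target batches only.
    """
    if not target_batches or not general_batches:
        raise ValueError("both datasets must contain at least one batch")

    schedule: List[Step] = []
    general = cycle(general_batches)  # reuse general batches if they run out

    # Stage 1: multi-task learning over both datasets.
    for batch in target_batches:
        schedule.append(("multitask", "bioner", batch))
        schedule.append(("multitask", "general", next(general)))

    # Stage 2: fine-tune on the target BioNER dataset only.
    for _ in range(finetune_epochs):
        for batch in target_batches:
            schedule.append(("finetune", "bioner", batch))

    return schedule
```

In a real training loop, each step would select the task-specific output head for the named task and run one optimizer update on the shared encoder. That wiring is omitted here because the abstract does not describe it.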

Large labeled datasets are crucial for video understanding progress. However, the labeling process is time-consuming, expensive, and tiresome. To overcome this impediment, various pretexts use the temporal coherence in videos to learn visual representations in a self-supervised manner. However, these pretexts (order verification and sequence sorting) struggle when encountering cyclic actions due to the label ambiguity problem. To overcome these limitations, we present a novel temporal pretext task to address self-supervised learning of visual representations from unlabeled videos. Repeated Scene Localization (RSL) is a multi-class classification pretext that involves changing the temporal order of the frames in a video by repeating a scene. Then, the network is trained to identify the modified video, localize the location of the repeated scene, and identify the unmodified original videos that do not have repeated scenes. We evaluated the proposed pretext on two benchmark datasets, UCF-101 and HMDB-51. The experimental results show that the proposed pretext achieves state-of-the-art results in action recognition and video retrieval tasks. In action recognition, our S3D model achieves 88.15% and 56.86% on UCF-101 and HMDB-51, respectively. It outperforms the current state-of-the-art by 1.05% and 3.26%. Our R(2+1)D-Adjacent model achieves 83.52% and 54.50% on UCF-101 and HMDB-51, respectively. It outperforms the single pretext tasks by 8.7% and 13.9%. In video retrieval, our R(2+1)D-Offset model outperforms the single pretext tasks by 4.68% and 1.1% Top 1 accuracies on UCF-101 and HMDB-51, respectively. The source code and the trained models are publicly available at https://github.com/Hussein-A-Hassan/RSL-Pretext.
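
To make the Repeated Scene Localization idea above concrete, the sketch below builds one self-labeled training sample from a list of frames. It is a minimal illustration under stated assumptions, not the authors' code: the clip is split into `num_segments` equal segments, one segment is copied over another to create the repeated scene, label 0 marks an unmodified clip, and label k marks a repeat placed at segment k. The function name, the overwrite strategy, and the `p_original` parameter are hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_rsl_sample(
    frames: Sequence[Frame],
    num_segments: int,
    rng: random.Random,
    p_original: float = 0.2,
) -> Tuple[List[Frame], int]:
    """Return (clip, label) for a repeated-scene classification pretext.

    label == 0      -> clip is unmodified
    label == k >= 1 -> segment k (1-based) was overwritten by a copy of
                       another segment, i.e. the repeat is located there
    """
    if num_segments < 2 or len(frames) < num_segments:
        raise ValueError("need at least 2 segments and 1 frame per segment")

    seg_len = len(frames) // num_segments
    clip = list(frames[: seg_len * num_segments])  # drop leftover frames

    # Keep some clips untouched so the network also sees the "original" class.
    if rng.random() < p_original:
        return clip, 0

    src = rng.randrange(num_segments)
    dst = rng.choice([i for i in range(num_segments) if i != src])

    # Overwrite the destination segment with the source scene; the clip
    # length is unchanged because both slices have seg_len frames.
    clip[dst * seg_len:(dst + 1) * seg_len] = clip[src * seg_len:(src + 1) * seg_len]
    return clip, dst + 1
```

With `num_segments` candidate locations, this yields a classifier with `num_segments + 1` output classes. Because the label only depends on where a scene was duplicated, a cyclic action does not make the target ambiguous in the way frame-order prediction can.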

Articles published on Label Ambiguity

Large Margin Weighted k-Nearest Neighbors Label Distribution Learning for Classification.

Augmenting biomedical named entity recognition with general-domain resources

Retrieval-style In-context Learning for Few-shot Hierarchical Text Classification

Decoupling Implantation Prediction and Embryo Ranking in Machine Learning: The Impact of Clinical Data and Discarded Embryos

Adaptive ambiguity-aware weighting for multi-label recognition with limited annotations

Feature Selection for Handling Label Ambiguity Using Weighted Label-Fuzzy Relevancy and Redundancy

Incomplete label distribution learning via label correlation decomposition

Label distribution feature selection based on hierarchical structure and neighborhood granularity

Repeat and learn: Self-supervised visual representations learning by Repeated Scene Localization

Stream label distribution learning processing via broad learning system

HE-Mind: A model for automatically predicting hematoma expansion after spontaneous intracerebral hemorrhage

High levels of mislabelling of shark flesh in Australian fish markets and seafood shops

Label disambiguation-based feature selection for partial label learning via fuzzy dependency and feature discernibility

Learning from feature and label spaces’ bias for uncertainty-adaptive facial emotion recognition

Retrospective and prospective study designs

Generative Calibration of Inaccurate Annotation for Label Distribution Learning

Partial Sequence Labeling With Structured Gaussian Processes.

The third way: object reordering as ambiguous labeling resolution

RGBT Tracking via Challenge-Based Appearance Disentanglement and Interaction.

Modeling Uncertainty for Low-Resolution Facial Expression Recognition
