Abstract

Multilabel classification is one of the most challenging tasks in natural language processing, posing greater technical difficulties than single-label classification while also arising more naturally in applications. Each label may depend on a different part of the text, with its own focus and component distribution, so the task requires full use of the local information in a sentence. Attention, a mechanism widely adopted in natural language processing, is therefore a natural choice. This paper proposes a multilayer self-attention model that handles aspect-category attention and word attention at different granularities. Combined with the BERT pretraining model, it achieves competitive performance on aspect category detection and the classification of electronic medical records.

Highlights

  • Sentiment analysis is at the heart of many business and social applications [2, 3]

  • We propose a BERT-based multi-self-attention model (BERT-MSA) for multilabel classification (MLC). The self-attention mechanism is used to capture the information relevant to each category

  • For the aspect category detection (ACD) task, extensive experiments on subtask 3 of SemEval-2014 Task 4 [4] show that our BERT-MSA model outperforms baseline methods in aspect category sentiment analysis


Summary

Introduction

Sentiment analysis is at the heart of many business and social applications [2, 3]. ACD is a subtask of aspect category sentiment analysis [4] and can be treated as an MLC task. The comment “While it was large and a bit noisy, the drinks were fantastic, and the food was superb” evaluates the restaurant’s “environment” and “food,” both of which belong to the predefined categories in the dataset. Mining such information from comments greatly helps improve user experience and product quality. An electronic medical record may cover diagnosis, treatment, surgery, and many other aspects, for example, “Before admission, the patient presented to our hospital due to increased bowel movements.” Classifying electronic medical records is likewise a multilabel text classification task. We believe that the vanilla BERT model is unable to capture the key information of each category, especially when the correlation between labels is strong.
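To make the idea concrete, below is a minimal sketch of one plausible reading of BERT-MSA: a separate learned attention query per label over BERT’s token outputs, so each category can build its own sentence representation before an independent per-label decision. This is our illustrative assumption, not the authors’ released code; the class name BertMSA, the label_queries parameter, and the choice of bert-base-uncased are hypothetical.

```python
# Hypothetical sketch of per-label self-attention over BERT token outputs
# for multilabel classification (an assumed reading of BERT-MSA).
import torch
import torch.nn as nn
from transformers import BertModel

class BertMSA(nn.Module):
    def __init__(self, num_labels: int, bert_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # One learned attention query per label, so each category can focus
        # on different words in the sentence (assumed design).
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden))
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        # Token representations from BERT: (batch, seq_len, hidden)
        tokens = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # Attention scores of every label query over every token: (batch, num_labels, seq_len)
        scores = torch.einsum("lh,bsh->bls", self.label_queries, tokens)
        scores = scores.masked_fill(attention_mask[:, None, :] == 0, float("-inf"))
        weights = scores.softmax(dim=-1)
        # Label-specific sentence vectors: (batch, num_labels, hidden)
        label_repr = torch.einsum("bls,bsh->blh", weights, tokens)
        # Independent probability per label -> multilabel output: (batch, num_labels)
        return self.classifier(label_repr).squeeze(-1).sigmoid()
```

For training, one would typically drop the final sigmoid and feed the raw scores to torch.nn.BCEWithLogitsLoss against a multi-hot label vector, then predict every label whose probability exceeds a threshold such as 0.5.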


