Abstract

Automatic detection of negated content is often a prerequisite for information extraction systems in various domains. It is especially important in the biomedical domain, where negation plays a central role. This work makes two main contributions. First, we address two languages that have received little attention so far, Brazilian Portuguese and French, and we developed new corpora for them, manually annotated with negation cues and their scopes. Second, we propose supervised machine learning methods for the automatic detection of negation cues and their scopes. The methods prove robust in both languages (Brazilian Portuguese and French) and across domains (general and biomedical language). The approach is also validated on English data from the state of the art, where it yields very good results and outperforms other existing approaches. In addition, the application is accessible and usable online. We expect that these contributions (new annotated corpora, an application accessible online, and cross-domain robustness) will improve the reproducibility of results and the robustness of NLP applications.

Highlights

  • Detecting negation in texts is one of the unavoidable prerequisites in many information retrieval and extraction tasks

  • On the ∗SEM-2012 data, our BiLSTM-CRF with a fastText model trained on Conan Doyle’s novels (FT-D) obtains a higher F1 than Li and Lu (2018) for correctly labeled tokens (+1.27 points), but only a slightly higher result for exact scope match (+0.41 points)

  • All gates have a positive impact on the results; the experiments in our work show that the forget gate is what gives long short-term memory (LSTM) networks their advantage


Summary

Introduction

Detecting negation in texts is one of the unavoidable prerequisites in many information retrieval and extraction tasks. In cohort selection for clinical trials, for instance, it can provide decisive criteria for recruiting a patient or not. Negation detection provides crucial information in many situations, such as detecting a patient’s pathologies and co-morbidities, determining a person’s smoking or non-smoking status, detecting whether a particular medication has been prescribed or taken, and establishing whether a patient is pregnant at the time of recruitment.
