Abstract

Entity recognition uses deep learning-based models to identify nouns related to agricultural diseases and pests, such as the names of diseases, pests, and drugs, in texts collected from the internet or entered by users. It is a fundamental component of agricultural knowledge graph construction and question answering, which will be implemented as a web application that provides the general public with solutions for agricultural disease and pest control. Nonetheless, challenges remain: (1) polysemy is not fully resolved, (2) the quality of the text representation needs to be enhanced, and (3) performance on rare entities needs to be improved. We propose an adversarial contextual embeddings-based model named ACE-ADP for named entity recognition in the Chinese agricultural diseases and pests domain (CNER-ADP). First, we enhance the text representation and address polysemy by using a fine-tuned BERT model to generate contextual character-level embeddings that carry domain-specific knowledge. Second, we introduce adversarial training to improve generalization and robustness when identifying rare entities. Experimental results show that our model achieves an F1 of 98.31%, a 4.23% relative improvement over the baseline model (i.e., word2vec-based BiLSTM-CRF), on our self-annotated corpus, the Chinese named entity recognition dataset for agricultural diseases and pests (AgCNER). In addition, the ablation study and discussion demonstrate that ACE-ADP not only extracts rare entities effectively but also predicts new entities in new datasets with high accuracy. It could serve as a basis for further research on named entity recognition in other specific domains.
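The abstract does not say which adversarial training scheme ACE-ADP uses. As a rough illustration only, the sketch below assumes a fast gradient method style perturbation, in which the character embeddings are shifted by a small step along the normalized loss gradient before a second forward pass. The function names, the `epsilon` value, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * g / ||g||_2 for a gradient given as a list of rows.

    Assumption: an FGM-style perturbation; the paper's actual scheme may differ.
    """
    norm = math.sqrt(sum(g * g for row in grad for g in row))
    if norm == 0.0:
        # No gradient signal: leave the embeddings unperturbed.
        return [[0.0 for _ in row] for row in grad]
    scale = epsilon / norm
    return [[scale * g for g in row] for row in grad]

def perturb_embeddings(embeddings, grad, epsilon=1.0):
    """Add the perturbation to each embedding vector element-wise."""
    r = fgm_perturbation(grad, epsilon)
    return [
        [e + d for e, d in zip(e_row, r_row)]
        for e_row, r_row in zip(embeddings, r)
    ]

# Toy example: two characters with 2-dimensional embeddings (made-up values).
embeddings = [[0.5, -0.2], [0.1, 0.3]]
grad = [[3.0, 0.0], [0.0, 4.0]]  # hypothetical loss gradient w.r.t. the embeddings
adv = perturb_embeddings(embeddings, grad, epsilon=0.5)
```

In a full training loop, the loss computed on the perturbed embeddings would typically be added to the clean loss before the parameter update, so that the tagger learns to label entities consistently under small input shifts.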

Highlights

  • Agricultural diseases and pests (ADPs) are among the major disasters in the world

  • Note that we used fine-tuned bidirectional encoder representations from transformers (BERT) for IDCNN, gated convolutional neural networks (Gated CNN), and AR-CCNER [3] to obtain the best results; the others were set according to their original papers

  • Compared with IDCNN, Gated CNN tended to achieve a slightly better F1 on AgCNER and Resume, owing to its gated structure, which can filter useful features according to their importance
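The filtering effect of a gated structure can be illustrated with a gated linear unit, in which each candidate feature is multiplied by a sigmoid gate between 0 and 1. This is a minimal sketch under the assumption that the gating follows the common gated linear unit form; the function names and toy values are hypothetical, and the exact Gated CNN layer used in the comparison is not described here.

```python
import math

def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gated_linear_unit(candidate, gate_logits):
    """Scale each candidate feature by its gate: h_i = a_i * sigmoid(b_i)."""
    return [c * sigmoid(g) for c, g in zip(candidate, gate_logits)]

# Toy example: in a real layer both inputs would come from convolutions.
features = [2.0, -1.0, 0.5]
gate_logits = [10.0, -10.0, 0.0]  # open gate, closed gate, half-open gate
gated = gated_linear_unit(features, gate_logits)
```

A strongly positive gate logit passes the feature almost unchanged, a strongly negative one suppresses it toward zero, and a logit of zero halves it.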

Introduction

Agricultural diseases and pests (ADPs) are among the major disasters in the world. According to statistics from the Food and Agriculture Organization of the United Nations (FAO), the global annual economic loss caused by ADPs exceeds US$290 billion [1]. With the rapid development of the Internet, text data related to agricultural diseases and pests have grown explosively, but computers cannot readily recognize and use these data directly because they are irregular and unstructured. A knowledge graph is essentially a semantic web that can integrate scattered, irregular, and unstructured text data into an agricultural knowledge base. As a basic component of knowledge graph construction and question answering, named entity recognition has been applied to digital agriculture by some knowledge graph-based human-computer diagnostic systems.
