Abstract

Named entity recognition (NER) is a fundamental task in natural language processing (NLP) and big data analysis, with a wide range of applications. NER for rice genes and phenotypes identifies gene and phenotype mentions in large volumes of text. It can facilitate the acquisition of information in the crop domain and provide a reference for research on higher-quality crops. However, NER still faces many challenges. In this paper, we propose an improved bidirectional gated recurrent unit (BI-GRU) neural network method that automatically identifies the required entities (i.e., gene names and rice phenotypes) in relevant rice literature and patents. The network is combined with a Softmax function that directly outputs label probabilities, forming the BI-GRU-SF model. Because it is a deep learning method, the model learns semantic information from context without manual feature engineering. Finally, we conducted experiments, and the results showed that the proposed model performed better than the other models compared. All datasets and source code of BI-GRU-SF are available at https://github.com/qqeeqq/NER for academic use.
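
To make the architecture described above concrete, the following is a minimal sketch of a bidirectional GRU whose per-token outputs pass through a Softmax layer to give label probabilities. It is not the authors' implementation (that is in the repository linked above). The label set, the layer sizes, the example sentence, and all names such as `GRUCell` and `bigru_softmax` are assumptions made for illustration, and the weights are random and untrained, so the predicted labels are arbitrary.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Standard GRU cell with update gate z, reset gate r, and candidate state."""
    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One (W, U, b) triple per gate; random values stand in for trained weights.
        self.params = {
            gate: (rand_matrix(hidden_size, input_size, rng),
                   rand_matrix(hidden_size, hidden_size, rng),
                   [0.0] * hidden_size)
            for gate in ("z", "r", "h")
        }

    def _affine(self, gate, x, h):
        W, U, b = self.params[gate]
        return [wx + uh + bi for wx, uh, bi in zip(matvec(W, x), matvec(U, h), b)]

    def step(self, x, h):
        z = [sigmoid(a) for a in self._affine("z", x, h)]
        r = [sigmoid(a) for a in self._affine("r", x, h)]
        gated = [ri * hi for ri, hi in zip(r, h)]           # reset gate applied to h
        cand = [math.tanh(a) for a in self._affine("h", x, gated)]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def run_gru(cell, inputs):
    """Run the cell over a sequence and return the hidden state at every step."""
    h = [0.0] * cell.hidden_size
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states

def bigru_softmax(embeddings, fwd, bwd, W_out, b_out):
    """Return one probability distribution over labels per token."""
    forward = run_gru(fwd, embeddings)
    # Read the sequence right to left, then restore the original token order.
    backward = run_gru(bwd, embeddings[::-1])[::-1]
    probs = []
    for hf, hb in zip(forward, backward):
        features = hf + hb                                  # concatenate both directions
        scores = [s + b for s, b in zip(matvec(W_out, features), b_out)]
        probs.append(softmax(scores))
    return probs

# Hypothetical BIO label set for gene and phenotype entities.
LABELS = ["O", "B-GENE", "I-GENE", "B-PHEN", "I-PHEN"]
EMB, HID = 8, 6                                             # assumed sizes
rng = random.Random(0)

tokens = ["OsSPL14", "increases", "grain", "yield"]         # illustrative sentence
embeddings = [[rng.uniform(-1, 1) for _ in range(EMB)] for _ in tokens]

fwd, bwd = GRUCell(EMB, HID, rng), GRUCell(EMB, HID, rng)
W_out = rand_matrix(len(LABELS), 2 * HID, rng)
b_out = [0.0] * len(LABELS)

probs = bigru_softmax(embeddings, fwd, bwd, W_out, b_out)
predicted = [LABELS[p.index(max(p))] for p in probs]
for token, label in zip(tokens, predicted):
    print(token, label)
```

In this sketch each token is labeled independently by taking the most probable label from its Softmax output, which matches the abstract's description of outputting label probabilities directly. A real system would replace the random embeddings and weights with trained ones.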
