Abstract
Background: Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA. Bio-NER is one of the most elementary and core tasks in biomedical knowledge discovery from texts. The system described here was developed for the BioNLP/NLPBA 2004 shared task. Experiments are conducted on the training and evaluation sets provided by the task organizers.
Results: Our results show that, compared with a baseline having a 70.09% F1 score, the Jordan-type and Elman-type RNN algorithms have F1 scores of approximately 60.53% and 58.80%, respectively. With CRF as the machine learning algorithm, the CCA, GloVe, and Word2Vec word embeddings yield F1 scores of 72.73%, 72.74%, and 72.82%, respectively.
Conclusions: Using word embeddings constructed through unsupervised learning can save the time and cost required to construct the training data.
Highlights
Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA
Biomedical named entity recognition is very important in language processing of biomedical texts, especially in extracting information about proteins and genes, such as RNA or DNA, from documents
We compare the performance of recurrent neural network (RNN) and conditional random fields (CRFs) with word embedding
Summary
Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA. Named entity recognition (NER) assigns a named entity tag to a designated word by using rules and heuristics. A named entity, which represents a person, a location, or an organization, should be recognized [1]. NER extracts nominal and numeric information from a document and classifies each word into a category such as person, organization, or date [2]. Bio-NER is very important in language processing of biomedical texts, especially in extracting information about proteins and genes, such as RNA or DNA, from documents. Finding named entities of genes in text is an important and difficult task [3].
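To make the idea of assigning a named entity tag to each word concrete, the following minimal sketch groups per-token tags into entity mentions. It assumes a BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity), which the summary above does not specify. The function name `extract_entities`, the example sentence, and its tags are hypothetical and chosen only for illustration; the entity types are the five listed above.

```python
from typing import List, Optional, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO-tagged tokens.

    Assumption: tags look like "B-protein", "I-protein", or "O".
    An I- tag that does not continue an open entity of the same type
    is treated like "O" (the token is skipped).
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag: close the open entity, if any.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical example sentence with hand-assigned tags.
tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B", "in", "T", "cells"]
tags = ["B-DNA", "I-DNA", "O", "O", "B-protein", "I-protein", "O", "B-cell_type", "I-cell_type"]

print(extract_entities(tokens, tags))
```

In this sketch the tags are given by hand; in a system such as the one summarized here, a sequence model (for example a CRF or an RNN) would predict one tag per token, and a grouping step like this would turn those predictions into entity mentions.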