Abstract
Drug-Named Entity Recognition (DNER) for biomedical literature is a fundamental facilitator of Information Extraction. For this reason, the DDIExtraction2011 (DDI2011) and DDIExtraction2013 (DDI2013) challenges each introduced a task aimed at recognizing drug names. State-of-the-art DNER approaches rely heavily on hand-engineered features and domain-specific knowledge, which are difficult to collect and define. We therefore propose an approach that automatically explores word-level and character-level features: a recurrent neural network using bidirectional long short-term memory (LSTM) with Conditional Random Field (CRF) decoding (LSTM-CRF). Two kinds of word representations are used in this work: word embeddings, which are trained on a large amount of text, and character-based representations, which can capture orthographic features of words. Experimental results on the DDI2011 and DDI2013 datasets demonstrate the effectiveness of the proposed LSTM-CRF method. Our method outperforms the best system in the DDI2013 challenge.
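To make the CRF decoding step concrete, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is an illustration under stated assumptions, not the authors' implementation: the function name viterbi_decode, the three-tag scheme (O, B-DRUG, I-DRUG), and all score values are hypothetical, and the emission scores stand in for what a bidirectional LSTM might produce.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration: 0 = O, 1 = B-DRUG, 2 = I-DRUG.
TAGS = ["O", "B-DRUG", "I-DRUG"]

# Made-up per-token tag scores, standing in for BiLSTM outputs (3 tokens).
EMISSIONS = [
    [2.0, 0.5, 0.1],
    [0.2, 1.0, 1.5],
    [0.3, 0.2, 1.2],
]

# Made-up transition scores; TRANSITIONS[i][j] scores moving from tag i to tag j.
# The large negative value discourages the invalid move O -> I-DRUG.
TRANSITIONS = [
    [0.5, 0.2, -1e4],
    [0.0, -0.5, 0.8],
    [0.1, -0.2, 0.6],
]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the tag index sequence with the highest total score."""
    if not emissions:
        return []
    n_tags = len(emissions[0])
    score = list(emissions[0])          # best score ending in each tag so far
    backpointers: List[List[int]] = []  # best previous tag for each step and tag
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            # Pick the previous tag that maximizes the score of arriving at tag j.
            best_prev = max(range(n_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def greedy_decode(emissions: Sequence[Sequence[float]]) -> List[int]:
    """Pick the best tag per token independently, ignoring transitions."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


if __name__ == "__main__":
    print([TAGS[i] for i in greedy_decode(EMISSIONS)])
    print([TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)])
```

With these made-up scores, choosing each tag independently yields O, I-DRUG, I-DRUG, which starts an entity with an inside tag. Decoding with transition scores yields O, B-DRUG, I-DRUG instead, which illustrates why a CRF layer is commonly placed on top of per-token scores in sequence labeling.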
Highlights
With the rapid development of life science and technology, biomedical literature has increased exponentially
The biomedical literature repository, MEDLINE, collects over 9.6 billion records and grows at 30–50 million records a year. This literature contains vast amounts of potential medical information that could be useful to biomedical research, industrial medicine manufacturing, and so forth
Several similar methods, such as Adadelta [29], Adam [30], and RMSProp [31], have been applied to improve the performance of stochastic gradient descent (SGD)
Summary
With the rapid development of life science and technology, biomedical literature has increased exponentially. The biomedical literature repository, MEDLINE, collects over 9.6 billion records and grows at 30–50 million records a year (https://www.ncbi.nlm.nih.gov/guide/literature/). This literature contains vast amounts of potential medical information that could be useful to biomedical research, industrial medicine manufacturing, and so forth. The first step in automatically extracting this information from the literature is to develop a named entity recognition system. Such a system is a crucial part of some therapeutic relation extraction systems and applications, such as those for drug-drug interactions [1] and adverse drug reactions [2]. Our research interest in DNER mainly comes from two driving reasons. Firstly, new drugs are rapidly and constantly discovered