Abstract

Many previous biological event-extraction systems were based on hand-crafted rules tuned to one particular biological application domain. Constructing and tuning such rules by hand is time-consuming and makes the systems less portable. Supervised machine-learning methods were therefore developed to generate the extraction rules automatically, but the trade-off between precision and recall (high recall comes with low precision, and vice versa) remains a barrier to improving performance. To make matters worse, text in the biological domain is especially complex: a sentence often contains more than two biological events, and an event expressed in a noun chunk can itself serve as an entity of another event. As a result, no system based on supervised machine learning has yet performed well at extracting events in biological domains.

To overcome the limitations of previous systems and the complexity of biological texts, we present the following new ideas. First, we adopted a supervised machine-learning method to reduce the human effort needed to write extraction rules, which yields a highly domain-portable system. Second, we overcame the classical trade-off between precision and recall by using an event component verification method, so machine learning occurs in two phases in our architecture. In the first phase, supervised learning of extraction rules focuses on improving recall in extracting events between biological entities. In the second phase, after the events have been extracted with the automatically learned rules, the system removes incorrect events by verifying the extracted event components with a maximum entropy (ME) classifier. In other words, the system targets high recall in the first phase and uses a classifier to achieve high precision in the second. Finally, we improved the supervised machine-learning algorithm so that it learns rules at two separate levels, one within a noun chunk and one spanning the whole sentence, in order to handle nested biological events.
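The two-phase idea can be illustrated with a small sketch. It is not the system described above: the hand-written pattern in `extract_candidates` merely stands in for the automatically learned rules, the two features in `featurize` are invented for illustration, and the ME classifier is approximated by binary logistic regression (the two-class form of a maximum entropy model). All function names, sentences, and labels below are hypothetical.

```python
import math
from itertools import combinations


def extract_candidates(tokens, entities, triggers):
    """Phase 1 (recall-oriented): propose every (entity, trigger, entity)
    triple in which a trigger word lies between two entity mentions.
    Assumption: this fixed pattern stands in for learned extraction rules."""
    positions = [i for i, tok in enumerate(tokens) if tok in entities]
    return [
        (a, k, b)
        for a, b in combinations(positions, 2)
        for k in range(a + 1, b)
        if tokens[k] in triggers
    ]


def featurize(tokens, entities, candidate):
    """Toy features (an assumption): a bias term and the number of other
    entity mentions lying strictly between the two arguments."""
    a, _, b = candidate
    between = sum(1 for i in range(a + 1, b) if tokens[i] in entities)
    return [1.0, float(between)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_maxent(samples, labels, lr=0.1, epochs=1000):
    """Fit a two-class maximum entropy model (logistic regression) by
    batch gradient ascent on the log-likelihood."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(samples, labels):
            err = y - sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                grad[j] += err * xi
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights


def verify(tokens, entities, candidates, weights, threshold=0.5):
    """Phase 2 (precision-oriented): keep only candidates the classifier accepts."""
    kept = []
    for cand in candidates:
        x = featurize(tokens, entities, cand)
        if sigmoid(sum(w * xi for w, xi in zip(weights, x))) >= threshold:
            kept.append(cand)
    return kept


# Hypothetical training sentence with hand-assigned gold events.
train_tokens = "A activates B and C inhibits D".split()
train_entities = {"A", "B", "C", "D"}
train_triggers = {"activates", "inhibits"}
gold = {(0, 1, 2), (4, 5, 6)}

train_cands = extract_candidates(train_tokens, train_entities, train_triggers)
X = [featurize(train_tokens, train_entities, c) for c in train_cands]
y = [1.0 if c in gold else 0.0 for c in train_cands]
weights = train_maxent(X, y)

print(len(train_cands), "candidates after phase 1")
print(verify(train_tokens, train_entities, train_cands, weights))
```

Phase 1 deliberately over-generates, pairing each trigger with every pair of entities that surrounds it, and phase 2 prunes the spurious pairings. A real system would use much richer features for the event components than the single distance-like feature assumed here.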
