Abstract

The objective of text mining is to automatically identify, extract, manage, integrate and exploit the information in texts (Ananiadou & McNaught, 2006). To understand biological texts, it is not enough to identify the proteins they mention; the interactions between those proteins must be extracted as well. Extracting relations between proteins is therefore an important and more advanced text mining task in the biological domain, and the study of protein-protein interactions (PPI) is one of its most pressing problems. Characterizing protein interaction pairs is crucial to understanding not only the functional role of individual proteins but also the organization of entire biological processes (Krallinger et al., 2007). Several approaches have been applied to PPI pair extraction, including purely statistical co-occurrence approaches (De Bruijn & Martin, 2002; Craven, 1999), pattern-matching approaches (Baumgartner Jr. et al., 2007; Ray & Craven, 2001; Hakenberg et al., 2008) and machine learning approaches such as maximum entropy (Grover et al., 2007) and support vector machines (SVMs) (Airola et al., 2008; Bunescu et al., 2005; Zelenko et al., 2003). In this chapter, we propose a Protein-Protein Interaction Pair Extractor (PPIEor) to extract PPI pairs from the biological literature. PPIEor is essentially an SVM for binary classification that uses a linear kernel and a rich, informative set of features based on linguistic analysis, contextual words, interaction words, interaction patterns and domain-specific information, among others.
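
To make the classification setup concrete, the following is a minimal sketch of how a candidate protein pair in a sentence could be turned into sparse features and scored by a linear classifier trained on hinge loss. It is not PPIEor itself: the interaction-word lexicon, the feature names, the toy training sentences and the Pegasos-style training loop are all assumptions made for illustration, and the features cover only a small part of the set described above (contextual words and interaction words).

```python
from itertools import combinations

# Hypothetical mini-lexicon of interaction words (an assumption, not the chapter's list).
INTERACTION_WORDS = {"binds", "interacts", "activates", "inhibits", "phosphorylates"}


def candidate_pairs(protein_positions):
    """Each unordered pair of tagged protein mentions is one binary instance."""
    return list(combinations(sorted(protein_positions), 2))


def extract_features(tokens, i, j):
    """Sparse binary features for the candidate pair at token positions i < j."""
    feats = {"bias": 1.0}
    between = [t.lower() for t in tokens[i + 1:j]]
    for word in between:
        feats["between=" + word] = 1.0  # contextual words between the two mentions
    if any(word in INTERACTION_WORDS for word in between):
        feats["has_interaction_word"] = 1.0
    feats["distance=" + ("near" if j - i <= 4 else "far")] = 1.0
    return feats


def score(weights, feats):
    """Linear decision function: dot product of weights and features."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_linear_svm(examples, epochs=20, lam=0.01):
    """Pegasos-style subgradient descent on the L2-regularised hinge loss."""
    weights = {}
    t = 0
    for _ in range(epochs):
        for feats, label in examples:
            t += 1
            eta = 1.0 / (lam * t)
            margin = label * score(weights, feats)  # computed before the update
            for name in weights:  # shrinkage from the regulariser
                weights[name] *= 1.0 - eta * lam
            if margin < 1.0:  # hinge loss is active for this example
                for name, value in feats.items():
                    weights[name] = weights.get(name, 0.0) + eta * label * value
    return weights


def classify(weights, tokens, i, j):
    """True if the pair at positions i and j is predicted to interact."""
    return score(weights, extract_features(tokens, i, j)) > 0.0


# Toy training data: (sentence, position of protein 1, position of protein 2, label).
TRAIN = [
    ("ProtA binds ProtB", 0, 2, +1),
    ("ProtA and ProtB were purified", 0, 2, -1),
]
examples = [(extract_features(s.split(), i, j), y) for s, i, j, y in TRAIN]
weights = train_linear_svm(examples)

sentence = "ProtC binds ProtD and ProtE".split()
for i, j in candidate_pairs([0, 2, 4]):
    print(sentence[i], sentence[j], classify(weights, sentence, i, j))
```

A realistic system would mask protein names, add syntactic and pattern-based features, and train on an annotated PPI corpus with an established SVM implementation rather than this hand-written loop.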
