Abstract

Background

A better understanding of the mechanisms behind an enzyme's functionality and stability, together with knowledge of the impact of mutations, is crucial for researchers working with enzymes. Although several enzyme databases are currently available, the scientific literature remains the main up-to-date source for learning the effects of a mutation on an enzyme. However, going through vast numbers of scientific documents to extract information on a mutation of interest is a time-consuming process. In this paper, we therefore describe a unique method, termed EnzyMiner, which automatically identifies the PubMed abstracts that contain information on the impact of a protein-level mutation on the stability and/or the activity of a given enzyme.

Results

We present an automated system that identifies the abstracts containing an amino-acid-level mutation and then classifies them according to the mutation's effect on the enzyme. For mutation identification, MuGeX, an automated mutation-gene extraction system, has an accuracy of 93.1% with an F-measure of 91.5. For impact analysis, document classification is performed to identify the abstracts that describe a change in the enzyme's stability or activity resulting from the mutation. The system was trained on lipases and tested on amylases with an accuracy of 85%.

Conclusion

EnzyMiner identifies the abstracts that contain a protein mutation for a given enzyme and, with the help of information extraction and machine learning techniques, checks whether each abstract is related to a disease. For disease-related abstracts, the mutation list and direct links to the abstracts are retrieved from the system and displayed on the Web. Abstracts that are not disease related are, in addition to being listed with their mutations, categorized into two groups according to whether the mutation affects the enzyme's stability or its functionality, and these groups are also displayed on the Web.
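As a rough illustration of the mutation-identification step, the sketch below finds point-mutation mentions such as D176A or Asp176Ala in an abstract. It is not MuGeX: the patterns, the names POINT_MUTATION and find_mutations, and the sample sentence are all assumptions made for this example, and a plain pattern match like this will also pick up false positives (for instance, protein names that happen to look like a mutation).

```python
import re

# Hypothetical patterns for illustration only; these are NOT the rules used by MuGeX.
AA1 = "ACDEFGHIKLMNPQRSTVWY"  # one-letter amino acid codes
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
       "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val")  # three-letter codes

# wild-type residue + position + mutant residue, in either notation
POINT_MUTATION = re.compile(
    rf"\b(?:[{AA1}][1-9]\d*[{AA1}]"        # e.g. D176A
    rf"|(?:{AA3})[1-9]\d*(?:{AA3}))\b"     # e.g. Asp176Ala
)


def find_mutations(abstract):
    """Return point-mutation mentions in order of first appearance, without duplicates."""
    found = []
    for match in POINT_MUTATION.finditer(abstract):
        mention = match.group(0)
        if mention not in found:
            found.append(mention)
    return found


if __name__ == "__main__":
    # Made-up sentence, not taken from any real abstract.
    text = ("The D176A and Asp176Ala variants of the lipase lost activity, "
            "whereas S82G was more thermostable.")
    print(find_mutations(text))  # ['D176A', 'Asp176Ala', 'S82G']
```

An abstract for which find_mutations returns a non-empty list would then be passed on to the impact-classification step; the classifier itself is not sketched here because the abstract does not describe its features or learning algorithm.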

Highlights

  • EnzyMiner identifies the abstracts that contain a protein mutation for a given enzyme and checks whether the abstract is related to a disease with the help of information extraction and machine learning techniques

  • The mutation list and direct links to the abstracts are retrieved from the system and displayed on the Web

Summary

Introduction

BMC Bioinformatics 2009, 10(Suppl 8):S2 http://www.biomedcentral.com/1471-2105/10/S8/S2

A better understanding of the mechanisms behind an enzyme's functionality and stability, together with knowledge of the impact of mutations, is crucial for researchers working with enzymes. [...] stable conditions, such as optimum temperature and pH. Though many databases are available on the nomenclature of enzymes [3,4] or on their structure and function [5,6,7,8,9,10,11], to our knowledge only BRENDA (BRaunschweig ENzyme Database) [12,13], the largest manually curated enzyme-specific information system, contains information on engineered enzymes and the effect of the engineering on the enzyme's catalytic activity, with direct references to the scientific literature. There is a need for an efficient automatic extraction method that gives rapid access to relevant information at any time.

