Improved mutation tagging with gene identifiers applied to membrane protein stability prediction

Rainer Winnenburg, Conrad Plake, Michael Schroeder

doi:10.1186/1471-2105-10-s8-s3

Abstract

Background: The automated retrieval and integration of information about protein point mutations, in combination with structure, domain and interaction data from literature and databases, promises to be a valuable approach to study structure-function relationships in biomedical data sets.

Results: We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which achieves an F-measure of 87% for the mutation retrieval task on a benchmark dataset. To link mutations to their proteins, we use a named entity recognition algorithm to identify gene names co-occurring in the abstract, and establish links based on sequence checks. Conversely, we show that gene recognition improved from 77% to 91% F-measure when mutation information given in the text is taken into account. To demonstrate practical relevance, we use mutation information from text to evaluate a novel solvation energy based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors we identified 35 relevant single mutations and associated phenotypes, none of which had been annotated in the UniProt or PDB database. In 71% of cases, the reported phenotypes were in compliance with the model predictions, supporting a relation between mutations and stability issues in membrane proteins.

Conclusion: We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.

Highlights

The automated retrieval and integration of information about protein point mutations, in combination with structure, domain and interaction data from literature and databases, promises to be a valuable approach to study structure-function relationships in biomedical data sets.
We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest.
We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.

Summary

Introduction

The automated retrieval and integration of information about protein point mutations, in combination with structure, domain and interaction data from literature and databases, promises to be a valuable approach to study structure-function relationships in biomedical data sets. Malfunctions or alterations in biological pathways can be the cause of many diseases, for instance when the biosynthesis of the proteins involved is repressed or when proteins do not interact the way they should. The latter can be due to structural changes in one of the interacting proteins caused by point mutations, i.e., substitutions of single wild-type amino acids. Despite the availability of numerous biomedical data collections, valuable information about mutation-phenotype associations is still hidden in unstructured text in the biomedical literature. This knowledge can be extracted by text mining, stored in a homogeneous data store, and integrated with data already available from suitable databases. New hypotheses can then be formulated, such as predictions of the phenotypic effects induced by mutations.
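To make the idea of regular expression-based mutation retrieval with a sequence check more concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: the pattern `MUTATION_RE`, the helper names `extract_mutations` and `matches_sequence`, and the toy sequence are all assumptions introduced for illustration. The sketch recognises only two common notations (for example `A3T` and `Gly5Asp`) and treats a mutation as consistent with a protein when the wild-type residue is found at the stated position.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE = "".join(sorted(THREE_TO_ONE.values()))
THREE = "|".join(THREE_TO_ONE)

# Illustrative pattern (an assumption, not the published rule set):
# either one-letter form "A123T" or three-letter form "Ala123Thr" / "Ala-123-Thr".
MUTATION_RE = re.compile(
    rf"\b(?:([{ONE}])(\d+)([{ONE}])|({THREE})-?(\d+)-?({THREE}))\b"
)


def extract_mutations(text):
    """Return (wild_type, position, mutant) tuples in one-letter code."""
    found = []
    for m in MUTATION_RE.finditer(text):
        if m.group(1):  # one-letter notation matched
            wt, pos, mut = m.group(1), m.group(2), m.group(3)
        else:  # three-letter notation matched
            wt = THREE_TO_ONE[m.group(4)]
            pos = m.group(5)
            mut = THREE_TO_ONE[m.group(6)]
        if wt != mut:  # a substitution must change the residue
            found.append((wt, int(pos), mut))
    return found


def matches_sequence(mutation, sequence):
    """Check that the wild-type residue occurs at the 1-based position."""
    wt, pos, _ = mutation
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt


if __name__ == "__main__":
    abstract = "The A3T and Gly5Asp substitutions reduced stability."
    toy_sequence = "MKAVGLL"  # hypothetical sequence, not a real protein
    for mutation in extract_mutations(abstract):
        print(mutation, matches_sequence(mutation, toy_sequence))
```

A sequence check of this kind also helps to discard false positives of the pattern itself: a token that merely looks like a mutation is unlikely to agree with the residue at the stated position in a candidate protein. Real abstracts use many more notations and numbering conventions than this sketch covers.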


Journal: BMC Bioinformatics	Publication Date: Aug 1, 2009
Citations: 59	License type: cc-by
