Abstract

Single nucleotide variants (SNVs) occurring in a protein-coding gene may disrupt its function in multiple ways. Predicting this disruption is recognized as an important problem in bioinformatics research. Many tools, hereafter p-tools, have been designed to perform these predictions, and many of them are now in common use in scientific research and even in clinical applications. This makes it important to understand the semantics of their outputs. To shed light on this issue, two questions are formulated: (i) do p-tools provide similar predictions (inner consistency), and (ii) are these predictions consistent with the literature (outer consistency)? To answer them, six p-tools are evaluated with exhaustive SNV datasets from the BRCA1 gene. Two indices, called Kall and Kstrong, are proposed to quantify the inner consistency of pairs of p-tools, while the outer consistency is quantified by standard information retrieval metrics. The inner consistency analysis reveals that most of the p-tools are not consistent with each other, and the outer consistency analysis reveals that they are characterized by low prediction performance. Although this result highlights the need to improve the prediction performance of individual p-tools, the inner consistency results pave the way to the systematic design of truly diverse ensembles of p-tools that can overcome the limitations of individual members.
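As a rough illustration of how outer consistency could be scored, the sketch below computes precision, recall and F1 for one p-tool against literature-derived labels. The abstract does not say which information retrieval metrics are used, so this choice is an assumption, and the function name `retrieval_metrics` and the boolean encoding (True = deleterious) are hypothetical.

```python
from typing import Sequence


def retrieval_metrics(predicted: Sequence[bool], reference: Sequence[bool]) -> dict:
    """Precision, recall and F1 for binary labels (True = deleterious).

    `predicted` holds one p-tool's calls; `reference` holds the labels
    taken from the literature. Both encodings are assumptions.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)        # called deleterious, is deleterious
    fp = sum(p and not r for p, r in pairs)    # called deleterious, is not
    fn = sum(r and not p for p, r in pairs)    # missed deleterious variant

    # Fall back to 0.0 when a denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels for five variants, not data from the study.
    tool_calls = [True, True, False, False, True]
    literature = [True, False, False, True, True]
    print(retrieval_metrics(tool_calls, literature))
```

Running the same function once per p-tool yields directly comparable scores, provided every tool is evaluated on the same set of variants.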

Highlights

  • To fulfill its biological function under specific environmental conditions, such as the cellular milieu, each protein must be folded into a defined three-dimensional structure, known as its native structure

  • Inner consistency measurements obtained with the Kall and Kstrong indices are shown in Tables 1 and 2, respectively

  • Kstrong achieves larger values than Kall; this is reasonable because Kstrong considers only opposite preference relationships


Summary

Introduction

To fulfill its biological function under specific environmental conditions, such as the cellular milieu, each protein must be folded into a defined three-dimensional structure, known as its native structure. Structural modifications of proteins may result in partial or total loss of function, as in the case of cystic fibrosis [1,2]. A change in an individual nucleotide (known as a single nucleotide variant, or SNV) in a protein-coding gene may lead to an amino acid change. In this case, the SNV involves a non-synonymous substitution, called a missense mutation. An SNV may also produce a premature stop codon leading to protein truncation, in what is known as a nonsense mutation.
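The distinction between synonymous, missense and nonsense substitutions can be made concrete with a short sketch. It assumes the standard genetic code, DNA codons on the coding strand, and a 0-based position within the codon. The function name `classify_snv` and the extra "stop-loss" label are illustrative and do not come from the paper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change inside one codon.

    position is 0-based within the codon; '*' marks a stop codon.
    """
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if ref_codon[position] == alt_base:
        raise ValueError("alt_base must differ from the reference base")

    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if alt_aa == ref_aa:
        return "synonymous"   # same amino acid, no protein change
    if alt_aa == "*":
        return "nonsense"     # premature stop codon, truncated protein
    if ref_aa == "*":
        return "stop-loss"    # original stop codon is lost
    return "missense"         # different amino acid


if __name__ == "__main__":
    print(classify_snv("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
    print(classify_snv("TGG", 2, "A"))  # TGG (Trp) -> TGA (stop)
    print(classify_snv("CTT", 2, "C"))  # CTT (Leu) -> CTC (Leu)
```

The sketch works on a single codon in isolation. A real annotation pipeline would also need the transcript's reading frame and strand to locate the affected codon.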

