Abstract

The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a “best practice” set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.
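For benchmarks that predict energetic effects of mutations, performance against experimental data is commonly summarized with metrics such as the Pearson correlation coefficient and the mean absolute error between predicted and measured ΔΔG values. As an illustrative sketch only (the specific metrics and analysis scripts for each benchmark are defined in its downloadable capture; the data below are invented), such metrics can be computed as:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mean_absolute_error(xs, ys):
    """Mean absolute deviation between predictions and measurements."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical predicted vs. experimental ddG values (kcal/mol),
# for illustration only -- not taken from any benchmark dataset.
predicted = [1.2, -0.5, 2.8, 0.1, 3.5]
experimental = [1.0, -0.8, 2.2, 0.4, 3.0]

r = pearson_r(predicted, experimental)
mae = mean_absolute_error(predicted, experimental)
```

In practice a benchmark capture's bundled analysis scripts would compute the agreed-upon metrics so that different protocols are compared on identical terms.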

Highlights

  • Structure-based modeling and design of biological macromolecules have become rich areas of computational research and method development [1,2,3,4,5]

  • The associated publication of a new method may not contain a description of the dataset or statistical analysis in a format that is readily usable for developers of alternate methods, creating additional obstacles for a direct comparison. Organizations such as CASP [11] and CAPRI [12] create blind prediction tests for problems in protein structure prediction, protein-protein docking, and other applications, but many questions in the field of macromolecular modeling and design could benefit from canonical benchmarks such as those that exist for protein-protein docking [10,13]

  • We have presented our implementation of a benchmarking and protocol capture web resource which currently describes five diverse benchmarks and their expected performance when tested using known best-practice methods from the Rosetta software suite

Introduction

Structure-based modeling and design of biological macromolecules have become rich areas of computational research and method development [1,2,3,4,5]. Further widespread adoption of these methods requires more extensive validation: demonstrated success and careful evaluation of key limitations on multiple, diverse test cases. This general utility can be shown through the use of a suitable benchmark set. However, the publication associated with a new method may not describe the dataset or statistical analysis in a format that is readily usable by developers of alternative methods, creating additional obstacles to direct comparison. Organizations such as CASP [11] and CAPRI [12] create blind prediction tests for problems in protein structure prediction, protein-protein docking, and other applications, but many questions in the field of macromolecular modeling and design could benefit from canonical benchmarks such as those that exist for protein-protein docking [10,13]. For iterative development, it is convenient to make benchmarks available for retrospective testing, although it is essential to pay attention to issues of overfitting to a particular target problem, even for large and diverse datasets.
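One common safeguard against the overfitting risk noted above (offered here as a general illustration, not as a procedure prescribed by this resource) is to partition a benchmark's cases so that parameter tuning is performed on a development subset while a held-out subset is reserved for final evaluation:

```python
import random

def split_benchmark(cases, held_out_fraction=0.2, seed=0):
    """Partition benchmark cases into a development set (used for tuning)
    and a held-out test set (used only for final evaluation).

    A fixed seed makes the split reproducible across runs."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * held_out_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical case identifiers, for illustration only.
cases = [f"case_{i}" for i in range(10)]
dev_set, test_set = split_benchmark(cases)
```

Even with such a split, repeated evaluation against the same held-out cases erodes their value, which is one motivation for periodically adding new, independent benchmark datasets.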
