Abstract

Crystallographic phasing strategies increasingly require the exploration and ranking of many hypotheses about the number, types and positions of atoms, molecules and/or molecular fragments in the unit cell, each with only a small chance of being correct. Accelerating this move have been improvements in phasing methods, which are now able to extract phase information from the placement of very small fragments of structure, from weak experimental phasing signal or from combinations of molecular replacement and experimental phasing information. Describing phasing in terms of a directed acyclic graph allows graph-management software to track and manage the path to structure solution. The crystallographic software supporting the graph data structure must be strictly modular so that nodes in the graph are efficiently generated by the encapsulated functionality. To this end, the development of new software, Phasertng, which uses directed acyclic graphs natively for input/output, has been initiated. In Phasertng, the codebase of Phaser has been rebuilt, with an emphasis on modularity, on scripting, on speed and on continuing algorithm development. As a first application of Phasertng, its advantages are demonstrated in the context of phasertng.xtricorder, a tool to analyse and triage merged data in preparation for molecular replacement or experimental phasing. The description of the phasing strategy with directed acyclic graphs is a generalization that extends beyond the functionality of Phasertng, as it can incorporate results from bioinformatics and other crystallographic tools, and will facilitate multifaceted search strategies, dynamic ranking of alternative search pathways and the exploitation of machine learning to further improve phasing strategies.

Highlights

  • Our Phaser crystallographic software for phasing macromolecular crystal structures based on maximum likelihood and multivariate statistics (Bricogne, 1992, 1997; Read, 2001) has been an asset to the crystallographic community, having solved tens of thousands of macromolecular crystal structures in the Protein Data Bank (PDB; Burley et al., 2019).

  • The focus of our developments has been phasing by molecular replacement (MR; Huber, 1965; Read, 2001) and single-wavelength anomalous dispersion (SAD; Hendrickson & Teeter, 1981; Pannu & Read, 2004) because both methods have the relative ease of requiring only a single data set, because both are amenable to rigorous likelihood treatments and because single-wavelength data collection can require a lower total radiation dose than multiple-wavelength methods.

  • As an example of the functionality of Phasertng, we describe the implementation of phasertng.xtricorder, which is a data-analysis and preparation tool that provides some functionality overlapping with phenix.xtriage (Zwart et al., 2005) and TRUNCATE in CCP4 (Winn et al., 2011).


Summary

Introduction

Our Phaser crystallographic software for phasing macromolecular crystal structures based on maximum likelihood and multivariate statistics (Bricogne, 1992, 1997; Read, 2001) has been an asset to the crystallographic community, having solved tens of thousands of macromolecular crystal structures in the Protein Data Bank (PDB; Burley et al., 2019). CASP13 showed that, when only very poor templates are available, the best homology models are better than the best template or even the best ensemble from PDB entries (Wallner, 2020; Croll et al., 2019). Contributing to these improvements has been the incorporation of evolutionary-covariance information in the modelling process (Simkovic et al., 2016). The tree-search-with-pruning strategy for MR and SAD in Phaser (McCoy et al., 2007) makes effective use of the strength of the maximum-likelihood functions in using prior information in the search for additional components in the asymmetric unit: either MR models or anomalously scattering atoms. Other examples of scenarios where nodes are combined from two parents include the validation of MR model placements with independently determined SAD substructures, or where placed MR components are substituted with homologous components and rescored to find the best components for phasing.
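
The two graph ideas mentioned above, a tree search with pruning and a node that combines two parents, can be illustrated with a minimal Python sketch. This is not the Phasertng or Phaser API: the names `Node`, `expand`, `combine`, `toy_gain` and `toy_score` are hypothetical, and the numeric scores are made-up stand-ins for a likelihood-based score, chosen only to show how hypotheses could be ranked and pruned.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # simple source of unique node identifiers


@dataclass
class Node:
    """One phasing hypothesis (hypothetical structure, not the Phasertng API)."""
    label: str
    score: float                    # stand-in for a likelihood score; higher is better
    parents: tuple[Node, ...] = ()  # zero, one or several parents makes this a DAG
    ident: int = field(default_factory=lambda: next(_ids))

    def ancestors(self) -> set[int]:
        """Identifiers of every node on any path leading to this one."""
        seen: set[int] = set()
        stack = list(self.parents)
        while stack:
            node = stack.pop()
            if node.ident not in seen:
                seen.add(node.ident)
                stack.extend(node.parents)
        return seen


def expand(frontier, candidates, score_fn, keep):
    """One search level: extend each frontier node by each candidate, keep the best."""
    children = [
        Node(f"{parent.label}+{cand}", score_fn(parent, cand), (parent,))
        for parent in frontier
        for cand in candidates
    ]
    children.sort(key=lambda n: n.score, reverse=True)
    return children[:keep]  # pruning: low-scoring hypotheses are dropped


def combine(a, b, score):
    """Create a node with two parents, e.g. an MR placement checked against a substructure."""
    return Node(f"({a.label})&({b.label})", score, (a, b))


# Toy data: invented per-component gains, for illustration only.
toy_gain = {"A": 40.0, "B": 25.0, "C": 5.0}


def toy_score(parent, cand):
    return parent.score + toy_gain[cand]


root = Node("data", 0.0)
level1 = expand([root], "ABC", toy_score, keep=2)
level2 = expand(level1, "ABC", toy_score, keep=2)
merged = combine(level2[0], level1[1], score=level2[0].score + 10.0)

print([n.label for n in level2])
print(sorted(merged.ancestors()))
```

Because each node records its parents rather than a single path, the route to any hypothesis can be recovered by walking `ancestors()`, which is the property that lets graph-management software track the path to a solution. A real implementation would store far more per node and would score nodes with the actual likelihood targets.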

Development of Phasertng
DAG modularity
Scripting
Improved algorithms
Results
Discussion
Funding information
