Abstract

In X-ray crystallography, validation tools assess the quality and reliability of the structural models that crystallographers build and refine. These tools check both the consistency of the physical, chemical and statistical properties of the model with the prior knowledge available in structural databases, and the agreement of the model with the diffraction data. B factors give important information about the spatial disorder of each atom around its rest position in a crystal, allowing one to infer the precision of the atomic coordinates and the dynamical properties of the macromolecule. The first part of the thesis work focuses on the development of a new validation tool for the distribution of isotropic B factors in crystallographic models. By means of a Bayesian approach, the shifted Inverse-Gamma distribution (IGD*) is proposed as a reference distribution, and a validation protocol is designed and developed to test this hypothesis. Starting from an empirical B factor distribution, the protocol returns the parameter estimates of the IGD* that best fits the B factor distribution, together with a p-value that is used to label the distribution as acceptable or suspicious. The protocol is then tested on a large data set of high-resolution protein structures from the PDB. From the distribution of the IGD* parameters it is possible to identify different groups of outliers, each characterized by peculiar features. Moreover, the analysis of the distribution of p-values shows that the majority of the structures analysed have an acceptable B factor distribution and that the agreement with the IGD* follows a hierarchical organization (whole asymmetric unit content, single chains and single domains). B factor distributions that do not satisfy the IGD* assumption usually correspond to models with problems in the deposited coordinates or diffraction data.
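To make the idea of fitting a shifted inverse-gamma distribution to a set of B factors more concrete, the following minimal Python sketch may help. It is not the protocol developed in the thesis: the parameterisation (shape `alpha`, scale `beta`, shift `shift`), all function names, and the use of a method-of-moments estimate in place of the Bayesian fit are assumptions made purely for illustration, and the p-value step is not shown.

```python
import math


def shifted_invgamma_logpdf(b, alpha, beta, shift):
    """Log-density of an inverse-gamma distribution translated by `shift`.

    Assumed parameterisation (illustrative only):
    f(b) = beta**alpha / Gamma(alpha) * (b - shift)**(-alpha - 1)
           * exp(-beta / (b - shift))   for b > shift.
    """
    x = b - shift
    if x <= 0.0:
        return float("-inf")  # no density at or below the shift
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1.0) * math.log(x) - beta / x)


def log_likelihood(b_factors, alpha, beta, shift):
    """Total log-likelihood of a list of B factors under the model."""
    return sum(shifted_invgamma_logpdf(b, alpha, beta, shift)
               for b in b_factors)


def params_from_moments(mean, sd, skew):
    """Invert the mean, standard deviation and skewness of the model.

    Uses the inverse-gamma identities (valid for alpha > 3):
      mean = shift + beta / (alpha - 1)
      sd   = beta / ((alpha - 1) * sqrt(alpha - 2))
      skew = 4 * sqrt(alpha - 2) / (alpha - 3)
    """
    if sd <= 0.0 or skew <= 0.0:
        raise ValueError("need positive spread and positive skewness")
    t = (2.0 + math.sqrt(4.0 + skew * skew)) / skew  # t = sqrt(alpha - 2)
    alpha = t * t + 2.0
    beta = sd * (alpha - 1.0) * t
    shift = mean - sd * t
    return alpha, beta, shift


def moment_estimates(b_factors):
    """Rough starting estimates from sample moments (not a Bayesian fit)."""
    n = len(b_factors)
    mean = sum(b_factors) / n
    m2 = sum((b - mean) ** 2 for b in b_factors) / n
    m3 = sum((b - mean) ** 3 for b in b_factors) / n
    if m2 == 0.0:
        raise ValueError("all B factors are identical")
    sd = math.sqrt(m2)
    return params_from_moments(mean, sd, m3 / sd ** 3)
```

In such a sketch, `moment_estimates` would be called on the list of isotropic B factors of a model, and `log_likelihood` could then be used to compare candidate parameter sets. Moment-based estimates are sensitive to outliers, so they are best regarded as starting values for a more careful fit.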
In light of these results, the developed protocol is proposed as an effective tool for the validation of B factor distributions in macromolecular crystallography. Furthermore, provided that the diffraction data are deposited in the PDB, a standard re-refinement protocol is confirmed to be a valid approach to rescue a B factor distribution from suspicious to acceptable, and to improve the quality of the results of the ensemble analysis performed with the ESCET framework when the starting data set contains models with suspicious B factor distributions. The validation protocol for B factor distributions finds a direct application in the second part of the thesis work, which focuses on the ensemble analysis, with the ESCET framework, of a selected data set of twenty-nine 30S ribosomal subunits from Thermus thermophilus. Thirteen refinement protocols are tested to improve, normalise and de-bias the selected structures, and to rescue models with suspicious B factor distributions. A comparative ensemble analysis is performed between the ribosomal models as deposited in the PDB and those obtained from the refinement protocol that performs best in terms of refinement statistics and distribution of B factors. The cluster analysis is confirmed to be an effective method to automatically rationalise the structural information content of the data set. The observation that some structures moved to a different cluster after re-refinement confirms the existence of structural bias in the originally deposited structures and leads to the discovery of electron density that was not modelled in the deposited structure. Improvements in refinement statistics after re-refinement result in lower coordinate uncertainty estimates, with positive effects on the results of the rigid body analysis. The main rigid bodies found on the 16S rRNA correspond to the domains known in the literature to move during the decoding process.
Final remarks are given about the possible application of the presented validation tool for B factor distributions and about the importance of the availability of experimental data.
