Abstract

Published crystal structures contain numerous errors, and the prevalence of Ramachandran outliers can be a valuable tool for identifying them. However, it is necessary to know which outliers in a PDB file are valid and which result from errors in the X‐ray crystallographic model. To address this problem, Python and the NumPy library were used to compile data on approximately eight thousand high‐quality X‐ray crystal structures of proteins from a public database, in order to provide insight into the probability and validity of Ramachandran outlier assignments. Ramachandran analyses were performed by Phenix through the Phenix/Python application programming interface. The data collected include each residue's identity, Ramachandran outlier evaluation, phi and psi angles, outlier score and type, and information on the location of the residue in the protein. This information, along with metadata about the protein crystal structures, is stored and maintained in an HDF5 file. The results of the data analysis using machine learning and other statistical methods will be presented.

This abstract is from the Experimental Biology 2018 Meeting. There is no full text article associated with this abstract published in The FASEB Journal.
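
The phi and psi angles recorded for each residue are backbone torsion angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). As a minimal sketch of how such an angle can be computed with NumPy, the function below evaluates a signed torsion angle from four atomic positions. It is an illustration only and not the Phenix API used in this work; the function name `dihedral_deg` and the example coordinates are hypothetical.

```python
import numpy as np


def dihedral_deg(p0, p1, p2, p3):
    """Return the signed torsion angle, in degrees, defined by four points.

    Hypothetical helper for illustration; the result lies in [-180, 180].
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))

    b0 = p0 - p1          # vector from the second atom back to the first
    b1 = p2 - p1          # central bond
    b2 = p3 - p2          # vector from the third atom to the fourth
    b1 /= np.linalg.norm(b1)

    # Remove the components along the central bond, leaving the parts of
    # b0 and b2 that lie in the plane perpendicular to it.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1

    # The angle between v and w, signed by their orientation about b1.
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


if __name__ == "__main__":
    # Made-up coordinates chosen so the torsion angle is a right angle.
    angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
    print(f"torsion angle: {angle:.1f} degrees")
```

For phi, the four points would be the coordinates of C(i-1), N(i), CA(i), and C(i); for psi, those of N(i), CA(i), C(i), and N(i+1). Using `arctan2` rather than `arccos` keeps the sign of the angle, which is needed to place a residue on the Ramachandran plot.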

