Abstract

Classical molecular dynamics (MD) techniques offer atomic-level insights into a wide range of physical systems, including biomolecular systems. They rely on empirical, parametrised equations (force fields) to describe the interaction potential between the constituents of a system, and their predictive ability depends critically on the accuracy of these parameters. Highly optimised, well-validated parameters have been developed for many important biomolecules such as proteins, lipids and sugars. However, developing parameters for small molecules such as drugs is very challenging given the scale of chemical space. For instance, the ChEMBL database contains in excess of 1.7 million bioactive compounds. Although stand-alone software (e.g. antechamber [1,2]) and web servers (e.g. the Automated Topology Builder [3-5], atb.uq.edu.au) have been developed to assign parameters to drug-like molecules, they usually rely on fitting to quantum-mechanical (QM) calculations in combination with sets of empirical rules, and their applicability is limited to small molecules (fewer than a few tens of atoms). Automated parametrisation of large, bio-active, and often biologically relevant molecules is still impractical, which greatly limits the predictive power of simulations involving such compounds.

This thesis focuses on the use of graph-based approaches to develop new automated parametrisation paradigms which can exploit large data sets to simultaneously develop and assign force field parameters.

An efficient Linear Programming method for deducing the charge state and bond orders of a molecule, based only on its connectivity and the chemical elements of its atoms, was developed. The algorithm was validated against the MMFF94 dataset containing 761 molecules with manually assigned bond orders. The approach was extended to i) cap molecular fragments (molecules in which some atoms have incomplete valences) and ii) enumerate tautomeric (protomeric) forms of molecules.
These extensions can be used for solving various problems related to drug design and fragment-based empirical force-field parametrisation, and can be applied to molecules containing hundreds of atoms.

A new method for predicting chemically equivalent atoms in a molecule was developed. This method relies on identifying automorphisms of the molecule. False positives (non-chemically equivalent atoms treated as equivalent by the previous approach) were addressed by implementing a series of exceptions for double bonds, non-invertible rings and stereotopic atoms. This method was used to symmetrise force-field terms between chemically equivalent atoms in the Automated Topology Builder (ATB [3-5], atb.uq.edu.au).

Building on these chemical equivalence relationships, a representation-independent, symmetry-corrected distance metric was developed (Blind RMSD). This facilitates the alignment of molecules with identical chemical graphs but different, arbitrarily assigned atom names and indices. This approach was extended to fragments (i.e. subsets of atoms in a molecule) for mapping graph nodes between two structures which may be equivalent in the 2D graph representation but not in a 3D model. How these procedures could be used to consolidate structural databases by identifying similar and differing conformational states is demonstrated.

To address the problem of developing and assigning dihedral parameters in arbitrary molecules, a scheme for classifying dihedral terms around rotatable bonds based on the local substituents and their topologies (dihedral fragment) was developed.
An analysis of the dihedral fragments contained in common biopolymers (DNA, RNA, proteins) and large chemical databases (PDB ligands, ChEMBL) revealed the extent of the diversity necessary to cover most of chemical space using this approach.

A novel fitting technique (Fourier projections) that can fit scalar, one-dimensional periodic functions to arbitrary precision using a series of sine and cosine functions with discrete (integer) frequencies was developed. I demonstrate how this approach can be used for the fitting of dihedral potentials based on quantum-mechanical (QM) data. The utility of the approach was demonstrated on the reparametrisation of the protein side chain dihedrals in the GROMOS 54A7 force field, as well as on a large-scale automated reparametrisation of ∼1,200 dihedral fragments found in common drug-like molecules.

Together, these tools and algorithms allowed the development of the Online tool for Fragment-based Molecule Parametrisation (OFraMP, oframp.atb.uq.edu.au). OFraMP is a web application designed to facilitate the parametrisation of atomic force fields for large (bio)molecules by matching sub-fragments within the target molecule to equivalent sub-fragments within a database of pre-existing parametrised molecules. Using two large and complex molecules (paclitaxel and a 290-atom dendrimer) as examples, I illustrate how OFraMP can be used to semi-automatically assign Density Functional Theory (DFT)-derived point charges in a straightforward and consistent manner.

Many of the tools developed as part of this thesis are open-source and freely available to the scientific community under an MIT license.
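
As an illustration of the idea behind fitting a periodic dihedral profile with integer-frequency sine and cosine terms, the following is a minimal sketch and not the implementation used in the thesis. It assumes the profile is sampled on a uniform grid over a full rotation, and it uses the standard discrete projection onto each basis function. The function names (`fourier_projection`, `evaluate_series`) and the synthetic profile are hypothetical and chosen only for this example.

```python
import math


def fourier_projection(samples, max_freq):
    """Project uniformly sampled periodic data onto cos(k*x) and sin(k*x).

    Assumption: samples[i] = f(x_i) with x_i = 2*pi*i/N, i = 0..N-1.
    Returns (a, b) with f(x) ~ a[0] + sum_k (a[k]*cos(k*x) + b[k]*sin(k*x)).
    """
    n = len(samples)
    if n <= 2 * max_freq:
        # The discrete basis is only orthogonal when N > 2 * max_freq.
        raise ValueError("need more than 2 * max_freq samples")
    angles = [2.0 * math.pi * i / n for i in range(n)]
    a = [sum(samples) / n]  # constant term is the mean of the samples
    b = [0.0]               # no sine term at frequency 0
    for k in range(1, max_freq + 1):
        a.append(2.0 / n * sum(f * math.cos(k * x) for f, x in zip(samples, angles)))
        b.append(2.0 / n * sum(f * math.sin(k * x) for f, x in zip(samples, angles)))
    return a, b


def evaluate_series(a, b, x):
    """Evaluate the truncated Fourier series at angle x (radians)."""
    return a[0] + sum(
        a[k] * math.cos(k * x) + b[k] * math.sin(k * x) for k in range(1, len(a))
    )


def synthetic_profile(x):
    """Hypothetical stand-in for a QM torsion energy profile (arbitrary units)."""
    return 1.5 + 2.0 * math.cos(x) - 0.5 * math.sin(3.0 * x)


# Sample every 10 degrees over a full rotation, then fit up to frequency 3.
grid = [2.0 * math.pi * i / 36 for i in range(36)]
energies = [synthetic_profile(x) for x in grid]
a_coeffs, b_coeffs = fourier_projection(energies, max_freq=3)
max_error = max(
    abs(evaluate_series(a_coeffs, b_coeffs, x) - e) for x, e in zip(grid, energies)
)
```

Because each coefficient is computed independently of the others, raising `max_freq` adds terms without changing the lower-frequency coefficients, which is one reason a projection can be preferable to an iterative least-squares fit in a sketch like this.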
