Conspectus

The desire to study molecular systems much larger than those that current state-of-the-art ab initio or density functional theory methods can handle has naturally led to the development of novel approximate methods, including semiempirical approaches, reduced-scaling methods, and fragmentation methods. The major computational limitation of ab initio methods is the scaling problem: the cost of an ab initio calculation scales as the nth power of the system size or worse. In the past decade, the fragmentation approach based on chemical locality has opened a new door for developing linear-scaling quantum mechanical (QM) methods and applying them to large molecular systems such as biomolecules. The fragmentation approach is highly attractive from a computational standpoint. First, the ab initio calculations of individual fragments can be conducted almost independently, which makes the approach suitable for massively parallel computation. Second, the electronic properties of the fragments, such as density and energy, are typically combined in a linear fashion to reproduce those of the entire molecular system, which makes the overall computation scale linearly with the size of the system. In this Account, two fragmentation methods and their applications to macromolecules are described: the electrostatically embedded generalized molecular fractionation with conjugate caps (EE-GMFCC) method and the automated fragmentation quantum mechanics/molecular mechanics (AF-QM/MM) approach. The EE-GMFCC method is developed from the MFCC approach, which was initially used to obtain accurate protein-ligand QM interaction energies. The main idea of the MFCC approach is that a pair of conjugate caps (concaps) is inserted wherever the system is divided by cutting a chemical bond. Each pair of concaps is then fused to form a molecular species whose contribution is subtracted, so that the overcounting introduced by the added concaps is properly removed. 
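The linear combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mfcc_energy`, its parameters, and all numerical values are hypothetical, and the energies are toy numbers rather than results of any QM calculation. The sketch only shows the bookkeeping: sum the energies of the capped fragments, subtract the energies of the fused concaps, and optionally add further correction terms.

```python
def mfcc_energy(capped_fragment_energies, concap_energies, corrections=()):
    """Combine fragment energies linearly (hypothetical helper).

    capped_fragment_energies: energies of the capped fragments
    concap_energies: energies of the fused concap species, subtracted
        to remove the overcounting introduced by the caps
    corrections: optional extra terms, e.g. pairwise interaction corrections
    """
    return (
        sum(capped_fragment_energies)
        - sum(concap_energies)
        + sum(corrections)
    )


# Toy values in arbitrary units (assumed, for illustration only).
fragments = [-10.0, -12.5, -11.0]  # three capped fragments
concaps = [-4.0, -4.5]             # two fused concaps, one per cut bond
pair_terms = [-0.25]               # one hypothetical pairwise correction

total = mfcc_energy(fragments, concaps, pair_terms)
print(total)
```

Because every term comes from an independent calculation, the individual energies could be computed in parallel and combined afterwards, and the number of terms grows in proportion to the number of cut bonds.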
By introducing the electrostatic embedding field in each fragment calculation and two-body interaction energy correction on top of the MFCC approach, the EE-GMFCC method is capable of accurately reproducing the QM molecular properties (such as the dipole moment, electron density, and electrostatic potential), the total energy, and the electrostatic solvation energy from full system calculations for proteins. On the other hand, the AF-QM/MM method was used for the efficient QM calculation of protein nuclear magnetic resonance (NMR) parameters, including the chemical shift, chemical shift anisotropy tensor, and spin-spin coupling constant. In the AF-QM/MM approach, each amino acid and all the residues in its vicinity are automatically assigned as the QM region through a distance cutoff for each residue-centric QM/MM calculation. Local chemical properties of the central residue can be obtained from individual QM/MM calculations. The AF-QM/MM approach precisely reproduces the NMR chemical shifts of proteins in the gas phase from full system QM calculations. Furthermore, via the incorporation of implicit and explicit solvent models, the protein NMR chemical shifts calculated by the AF-QM/MM method are in excellent agreement with experimental values. The applications of the AF-QM/MM method may also be extended to more general biological systems such as DNA/RNA and protein-ligand complexes.
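The residue-centric partitioning in AF-QM/MM can be sketched as follows. This is a minimal illustration under stated assumptions, not the published procedure: the helper names `min_distance` and `partition`, the toy coordinates, and the 4.0 Å cutoff are all hypothetical, and the distance criterion used here (minimum atom-atom distance) is one plausible choice among several.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest atom-atom distance between two residues (assumed criterion)."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def partition(residues, center, cutoff):
    """Split residues into QM and MM regions around one central residue.

    residues: dict mapping residue name -> list of (x, y, z) coordinates
    center: name of the central residue
    cutoff: distance threshold in the same units as the coordinates
    """
    qm = [
        name
        for name, atoms in residues.items()
        if name == center or min_distance(residues[center], atoms) <= cutoff
    ]
    mm = [name for name in residues if name not in qm]
    return qm, mm


# Toy coordinates in angstroms (assumed, not from a real structure).
residues = {
    "ALA1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "GLY2": [(3.0, 0.0, 0.0)],
    "SER3": [(10.0, 0.0, 0.0)],
}

# One QM/MM setup per residue; each could be run as an independent job.
for center in residues:
    qm, mm = partition(residues, center, cutoff=4.0)
    print(center, "QM:", qm, "MM:", mm)
```

In an actual workflow each QM/MM calculation would be run separately, and only the properties of the central residue would be kept from each one.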