In recent years, a variety of computational methods have been used to determine three-dimensional protein structures in solution from two-dimensional NMR NOE distance data. The distance geometry algorithm (1, 2) has been used extensively (3a-3d), along with minimization in dihedral-angle space using a variable target function (4), simulated annealing (5), restrained molecular dynamics (6), systematic conformational search (7-9), and Monte Carlo simulations (10). However, problems arise in all of these approaches, and efforts to define the limitations and computational characteristics of the algorithms have recently increased. Results on BPTI and on several peptides (11) indicate that some implementations of the distance geometry technique can generate incorrect extended structures and sample only a small region of conformational space. Moreover, recent distance geometry calculations on Neutrophil peptide 5 (12) further reveal the sampling limitations of the algorithms used. We have recently developed a different methodology for protein structure determination, the double-iterated Kalman filter (DIKF) technique (13, 14), which addresses some of these problems by quantifying an upper bound on the set of atomic positions that are compatible with the data. This method yields both a structure and a definitive estimate of its uncertainty, and thus provides significant additional insight into those regions of the protein that correspond to nonunique conformations, e.g., for the lac repressor headpiece (15). In general, it is important that any study of a protein solution structure from NMR NOE data be accompanied by an investigation of the dependence of the final structure on the quality of the input distance data. This consists primarily of a controlled application of the technique to a small molecule of known structure, using data sets of varying quantity and quality.
The variables tested are (1) the abundance of the input data, (2) the precision of the input data, and (3) the computation time, or the number of double-iterated cycles. The precision and accuracy of the results are recorded as a function of these variables. The purpose of this Note is to validate the DIKF algorithm for protein structure determination by applying it to the crystal structure of oxytocin. The approach has already proven successful in the NMR solution structure determination of a larger protein, the lac repressor headpiece (13-15).
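As a rough illustration of how a Kalman-type estimator can return both atomic positions and an uncertainty estimate, the sketch below applies one generic extended Kalman measurement update to a single interatomic distance. It is not the DIKF implementation of refs. 13 and 14 and does not reproduce its double iteration. The function name `distance_update`, the two-atom state layout, and the numerical values are assumptions chosen for illustration only.

```python
import math

def distance_update(x, P, z, r):
    """One generic extended Kalman update for a single distance datum.

    Hypothetical helper, not taken from the DIKF papers.
    x : 6 floats, coordinates of two atoms (x1, y1, z1, x2, y2, z2)
    P : 6x6 symmetric covariance of x, as a list of lists
    z : measured distance between the two atoms
    r : variance of that measurement
    """
    d = [x[i] - x[i + 3] for i in range(3)]
    h = math.sqrt(sum(c * c for c in d))      # distance predicted from x
    if h == 0.0:
        raise ValueError("coincident atoms: the gradient is undefined")
    # Gradient of the predicted distance with respect to the six coordinates.
    H = [c / h for c in d] + [-c / h for c in d]
    PHt = [sum(P[i][j] * H[j] for j in range(6)) for i in range(6)]
    S = sum(H[i] * PHt[i] for i in range(6)) + r   # innovation variance
    K = [v / S for v in PHt]                        # Kalman gain
    x_new = [x[i] + K[i] * (z - h) for i in range(6)]
    # P - K H P, using the symmetry of P so that (H P)[j] equals PHt[j].
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(6)] for i in range(6)]
    return x_new, P_new

# Illustrative values: two atoms 4 units apart, unit variance on every
# coordinate, and one measurement of 3 units with variance 0.01.
x0 = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0]
P0 = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
x1, P1 = distance_update(x0, P0, z=3.0, r=0.01)
print(x1)
print([P1[i][i] for i in range(6)])
```

In this toy case the variance shrinks only along the axis joining the two atoms, while the perpendicular coordinates stay as uncertain as before. This is the sense in which the covariance flags regions that the data leave poorly determined.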