Abstract

Knowledge of RNA three-dimensional structure is essential to understanding and manipulating RNA function. Given the difficulties associated with physical RNA structure elucidation techniques, such as x-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, it is not surprising that predictive methods are gaining popularity. Genome sequencing projects are uncovering many novel RNAs of unknown structure and function, prompting considerable efforts in three-dimensional structure determination by physical methods and in the gathering of low and medium resolution structural data, such as those obtained from enzymatic and chemical probing, chemical modifications, sequence analysis and in vitro selection. RNA three-dimensional structure prediction methods are thus needed. RNA three-dimensional structure predictions are logical consequences of structural knowledge and data. A prediction summarizes and condenses a large amount of experimental and theoretical data into a more comprehensible format. Predictions must be verified experimentally and can therefore serve as good guides for planning laboratory experiments. Particularly interesting are enzymatic activity data, such as those derived from in vitro selection and chemical modification experiments, which allow one to decipher active, as opposed to ground state, conformations. Activity data are usually fuzzier and of lower resolution than x-ray crystallography and NMR spectroscopy data, but when used in combination with appropriate RNA three-dimensional structure determination methods they could prove extremely informative and predictive. RNA three-dimensional structure determination methods employ techniques such as distance geometry, molecular mechanics, simulated annealing, interactive computer graphics and other constraint satisfaction methods.
Several programs based on these techniques are productive and are used in specific application fields to determine RNA three-dimensional structures from x-ray crystallography, NMR spectroscopy or low resolution data. It is expected that the next generation of RNA three-dimensional structure determination programs will allow us to enter descriptions of structural data (declarative), and to produce all associated consistent three-dimensional structures (sound and complete). Intuitively, this work is justified and motivated by the fact that interactive computer modeling successfully suggests ways to explore the conformational space of RNAs efficiently, and is producing high precision structures. The resulting programs will automatically select the appropriate method, or combination of methods, according to the nature of the input data. The implementation of such programs could be simplified if a unified model of RNA structural knowledge and data is established. However, since it is computationally hard to address the theoretical flexibility of RNA three-dimensional structures, appropriate conceptualizations and approximations are necessary, implying that completeness and soundness in the biological sense might never be achieved. The development of RNA three-dimensional structure determination methods requires three essential components. First is a computer representation, or a data structure, of RNA three-dimensional structural knowledge and data. Second is an RNA conformational search space that includes three-dimensional structures consistent with the computer representation.
The implementation of a conformational search space includes the following tasks: i) the creation of a set of operators to manipulate the RNA three-dimensional structures; ii) the definition of a metric to evaluate RNA three-dimensional structures; iii) the design of an efficient method for applying the chosen metric; and iv) the design of an efficient method for generating the next three-dimensional structure to consider. Third is an inference engine which searches the conformational search space for three-dimensional structures that fit input descriptions. This article presents the most recent developments of the MC-SYM research project at the Université de Montréal. MC-SYM is a joint effort between the computer science and biochemistry departments to develop computational methods for RNA three-dimensional structure determination. A first MC-SYM prototype was reported and released in 1991. Since then, the program has been extensively tested and used to determine RNA three-dimensional structures using many different types and sources of structural data. Table 1 lists the main publications in which the use of MC-SYM was important. The program is available by anonymous FTP and on the Web. The main body of this article is divided into four sections. In section 2, a computer representation of RNA structural data based on graph theory is introduced. In section 3, an RNA three-dimensional conformational search space is developed from the creation of operators defined from nitrogen base spatial relations (base pairing and base stacking) and rigid nucleotide conformations. In section 4, an RNA three-dimensional structure inference engine based on a backtracking algorithm is described; the engine is sound and complete over the conformational search space introduced in section 3.
The necessary steps to determine the three-dimensional structure of a GAGA tetraloop from low resolution NMR spectroscopy data are presented to illustrate how one can use the computer technology described in this article. Finally, in section 5, the three-dimensional structure determination of two small RNAs from low resolution NMR spectroscopy data is presented. The first is the loop 785-797 in 16S ribosomal RNA. The second is the 16S ribosomal RNA symmetrical motif of tandem G·U mismatches.
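
The abstract characterizes the inference engine only as a backtracking search that is sound and complete over a discrete conformational search space. As a rough illustration of that idea, and not of MC-SYM's actual implementation, the following minimal Python sketch enumerates every combination of candidate positions that passes a consistency check at each step. The names `backtrack`, `toy_is_consistent` and `toy_candidates`, the point-per-residue model, and the distance thresholds are all assumptions made for this example.

```python
from math import dist


def backtrack(candidates, is_consistent, partial=None):
    """Yield every complete assignment whose prefixes all pass is_consistent.

    candidates[i] is the finite list of options for position i
    (a stand-in for a discrete set of nucleotide conformations).
    """
    partial = [] if partial is None else partial
    if len(partial) == len(candidates):
        yield tuple(partial)  # copy, since `partial` is mutated below
        return
    for option in candidates[len(partial)]:
        partial.append(option)
        # Prune: extend only prefixes that satisfy the constraints so far.
        if is_consistent(partial):
            yield from backtrack(candidates, is_consistent, partial)
        partial.pop()  # undo the choice and try the next option


def toy_is_consistent(partial):
    """Hypothetical constraints on the most recently placed point."""
    newest, earlier = partial[-1], partial[:-1]
    if not earlier:
        return True
    # Assumed "chain" constraint: consecutive points lie 5 to 7 units apart.
    if not 5.0 <= dist(earlier[-1], newest) <= 7.0:
        return False
    # Assumed "steric" constraint: no two points closer than 3 units.
    return all(dist(p, newest) >= 3.0 for p in earlier)


# Toy search space: one point per residue, a few options per residue.
toy_candidates = [
    [(0.0, 0.0, 0.0)],
    [(6.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(12.0, 0.0, 0.0), (6.0, 6.0, 0.0), (0.0, 0.0, 0.0)],
]

solutions = list(backtrack(toy_candidates, toy_is_consistent))
for structure in solutions:
    print(structure)
```

In this sketch, every yielded structure has passed the check at each step (soundness), and no combination is skipped unless one of its prefixes already violates a constraint (completeness over the enumerated options). The second property relies on the constraints being such that a violated prefix can never be repaired by adding further residues.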
