Abstract

DNA mismatches, that is, base pairs other than the canonical AT and CG, are involved in numerous biological processes and can be a problem for technological applications such as PCR amplification. The nearest-neighbour (NN) model is the standard approach for predicting melting temperatures and is used in secondary structure prediction methods and in modelling hybridization kinetics. However, despite the biological and technological importance of mismatches, existing NN parameters that include them are incomplete, and those available were obtained from a limited set of melting temperatures at high sodium concentration. To our knowledge, there is currently no set of NN parameters for up to three mismatches that covers all configurations at low sodium concentrations. Here, we apply the NN model to a large set of 4096 published melting temperatures covering all combinations of single, double and triple mismatches. Handling such a large set of temperatures raises new methodological problems. In particular, optimizing 252 independent parameters required the development of a new method in which we readjust the seed parameters using the definition of the Gibbs free energy. The new parameters predict the training set within 1.1 °C and the validation set within 2.7 °C.
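
For readers unfamiliar with the NN model, the following is a minimal sketch of how a two-state NN calculation is commonly set up. It is not the optimization method described in the abstract. It assumes that the enthalpy and entropy of a duplex are sums over adjacent base-pair steps, that the Gibbs free energy follows from its definition ΔG = ΔH − TΔS, and that the melting temperature of a non-self-complementary duplex is ΔH / (ΔS + R ln(Ct/4)). The function names, the step encoding and every numerical value in `PLACEHOLDER_PARAMS` are hypothetical and chosen for illustration only; they are not the published parameters.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def nn_steps(top, bottom):
    """Split an aligned duplex into nearest-neighbour steps.

    `top` is read 5'->3' and `bottom` 3'->5', so top[i] faces bottom[i].
    A mismatch is simply a position where the two bases are not AT or CG.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must be aligned and of equal length")
    return [(top[i:i + 2], bottom[i:i + 2]) for i in range(len(top) - 1)]


def sum_nn(top, bottom, params, init=(0.0, 0.0)):
    """Sum (dH, dS) over all steps; `init` is an optional initiation term."""
    dh, ds = init
    for step in nn_steps(top, bottom):
        step_dh, step_ds = params[step]  # KeyError if a step has no parameter
        dh += step_dh
        ds += step_ds
    return dh, ds


def gibbs_free_energy(dh, ds, temp_k):
    """Definition of the Gibbs free energy: dG = dH - T * dS."""
    return dh - temp_k * ds


def melting_temperature_c(dh, ds, strand_conc):
    """Two-state Tm in Celsius for a non-self-complementary duplex.

    `strand_conc` is the total strand concentration in mol/L.
    """
    return dh / (ds + R_KCAL * math.log(strand_conc / 4.0)) - 273.15


# Placeholder values only (dH in kcal/mol, dS in kcal/(mol*K)).
# These are NOT fitted or published NN parameters.
PLACEHOLDER_PARAMS = {
    ("AC", "TG"): (-8.0, -0.022),
    ("CG", "GC"): (-10.0, -0.027),
    ("GT", "CA"): (-8.0, -0.022),
    ("CG", "GT"): (-4.0, -0.012),  # step ending in a G.T mismatch
    ("GT", "TA"): (-3.0, -0.010),  # step starting with a G.T mismatch
}

# Perfectly matched duplex versus the same duplex with one G.T mismatch.
dh_match, ds_match = sum_nn("ACGT", "TGCA", PLACEHOLDER_PARAMS)
dh_mis, ds_mis = sum_nn("ACGT", "TGTA", PLACEHOLDER_PARAMS)
tm_match = melting_temperature_c(dh_match, ds_match, 1e-6)
tm_mis = melting_temperature_c(dh_mis, ds_mis, 1e-6)
```

Keying the table by a pair of dinucleotides lets matched and mismatched steps share one lookup, which is one plausible way to hold a parameter set of the size mentioned above. Salt corrections are deliberately left out of this sketch.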
