Abstract
BackgroundRNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithms is one of the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov’s recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov’s RNA folding. Such techniques are within the iteration space slicing framework – the transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem at generating parallel tiled code is defining a proper tile size and tile dimension which impact parallelism degree and code locality.ResultsTo choose the best tile size and tile dimension, we first construct parallel parametric tiled code (parameters are variables defining tile size). With this purpose, we first generate two nonparametric tiled codes with different fixed tile sizes but with the same code structure and then derive a general affine model, which describes all integer factors available in expressions of those codes. Using this model and known integer factors present in the mentioned expressions (they define the left-hand side of the model), we find unknown integers in this model for each integer factor available in the same fixed tiled code position and replace in this code expressions, including integer factors, with those including parameters. Then we use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover in a given search space the best tile size and tile dimension maximizing target code performance.ConclusionsFor a given search space, the presented approach allows us to choose the best tile size and tile dimension in parallel tiled code implementing Nussinov’s RNA folding. Experimental results, received on modern Intel multi-core processors, demonstrate that this code outperforms known closely related implementations when the length of RNA strands is bigger than 2500.
Highlights
RNA folding is an ongoing compute-intensive task of bioinformatics
In paper [4], we presented loop tiling based on the transitive closure of a dependence graph for Nussinov’s algorithm
We present an approach allowing for deriving the best size of original tiles to be used for generation of iteration space slicing (ISS) based tiled code implementing Nussinov’s RNA folding
Summary
RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithms is one of the most relevant areas in computational biology. RNA secondary structure approaches, such as Nussinov’s recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov’s RNA folding. Nussinov’s folding algorithm [1] uses the Nussinov’s algorithm is compute intensive due to a cubic time complexity It involves mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model [2]. Let S be an N × N Nussinov matrix and σ (i, j) be a function which returns 1 if (xi, xj) match and i < j − 1, or 0
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have