Abstract

Background: R is the preferred tool for statistical analysis among many bioinformaticians, due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are generated, for example using high-throughput screening devices, the processing time required to analyze the data is often quite long. One way to reduce the processing time is to use parallel computing technologies. Because R does not support parallel computations, several tools have been developed to enable such technologies. However, these tools require multiple modifications to the way R programs are usually written or run. Although these tools can ultimately speed up the calculations, the time, skills and additional resources required to use them are an obstacle for most bioinformaticians.

Results: We have designed and implemented an R add-on package, R/parallel, that extends R by adding user-friendly parallel computing capabilities. With R/parallel, any bioinformatician can easily automate the parallel execution of loops and benefit from the multicore processor power of today's desktop computers. Using a single, simple function, R/parallel can be integrated directly with other existing R packages. With no need to change the implemented algorithms, the processing time can be reduced approximately N-fold, N being the number of available processor cores.

Conclusion: R/parallel saves bioinformaticians time in their daily tasks of analyzing experimental data. It achieves this objective on two fronts: first, by reducing the development time of parallel programs, since existing methods need not be reimplemented; and second, by reducing processing time, since computations run faster on current desktop computers. Future work focuses on extending the reach of R/parallel by interconnecting and aggregating the power of several computers, both existing office computers and computing clusters.

Highlights

  • R is the preferred tool for statistical analysis of many bioinformaticians due in part to the increasing number of freely available analytical methods

  • The results demonstrate that R/parallel efficiently increases the performance of R when running parallel computations on current multicore desktop computers

  • Bioinformaticians can reduce the processing time of a growing number of analytical methods by approximately N-fold, N being the number of cores present in their computers
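
The N-fold claim can be checked with simple arithmetic. The sketch below is illustrative only: the function names `speedup` and `efficiency` are assumptions introduced here, not part of R/parallel, and the input figures are the approximate timings reported in the Results (about 4 hours serially, about 1 hour on a quad-core processor).

```python
def speedup(serial_time, parallel_time):
    """Ratio of serial to parallel run time (hypothetical helper)."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, n_cores):
    """Speedup divided by core count; 1.0 means ideal N-fold scaling."""
    return speedup(serial_time, parallel_time) / n_cores


# Approximate timings from the Results: ~4 h serial, ~1 h on 4 cores.
observed_speedup = speedup(4.0, 1.0)
observed_efficiency = efficiency(4.0, 1.0, n_cores=4)
```

Because the reported timings are approximate, the resulting speedup and efficiency are approximate as well; real workloads usually fall somewhat short of the ideal because of partitioning and communication overhead.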


Summary

Results

The execution time when processing 37685 traits from 73 individuals is reduced, using a quad-core processor, from approximately 4 hours to 1 hour. Another advantage of R/parallel is that it can be used in batch mode as well as in interactive mode. With R/parallel, partitioning is applied to loops and data, and multi-processing is used to exploit all the available processing units (i.e. the cores in current desktop processors). By giving up a small percentage of the processor, it is possible to keep using the computer for other tasks, while the ongoing calculation takes only slightly longer.
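
The idea of partitioning a loop and its data across cores can be sketched in a language-neutral way. The Python example below is not R/parallel's implementation or API; it is a minimal illustration using the standard `multiprocessing` module. The names `split_indices`, `analyze_trait`, `process_chunk` and `parallel_loop` are hypothetical, the per-trait computation (a mean) is a placeholder for a real analytical method, and the `reserve_cores` parameter is only one assumed way of leaving processor capacity free for other tasks.

```python
import math
import os
from multiprocessing import Pool


def split_indices(n_items, n_chunks):
    """Split range(n_items) into at most n_chunks contiguous (start, stop) pairs."""
    if n_items == 0:
        return []
    n_chunks = max(1, min(n_chunks, n_items))
    size = math.ceil(n_items / n_chunks)
    return [(start, min(start + size, n_items))
            for start in range(0, n_items, size)]


def analyze_trait(values):
    """Placeholder for the per-iteration work: mean of one trait."""
    return sum(values) / len(values)


def process_chunk(chunk):
    """Run the original loop body over one partition of the data."""
    return [analyze_trait(values) for values in chunk]


def parallel_loop(traits, reserve_cores=0):
    """Partition the loop over traits and run each partition in its own process.

    reserve_cores (an assumption of this sketch) leaves some cores idle so
    the machine stays responsive for other work.
    """
    workers = max(1, (os.cpu_count() or 1) - reserve_cores)
    tasks = [traits[a:b] for a, b in split_indices(len(traits), workers)]
    with Pool(processes=workers) as pool:
        partial_results = pool.map(process_chunk, tasks)
    # Concatenate per-partition results in the original iteration order.
    return [r for part in partial_results for r in part]


if __name__ == "__main__":
    # Synthetic data: 1000 traits measured on 73 individuals.
    traits = [[float(t + i) for i in range(73)] for t in range(1000)]
    serial = [analyze_trait(values) for values in traits]
    assert parallel_loop(traits, reserve_cores=1) == serial
```

Each process receives only its own slice of the data, so the loop body itself is unchanged; only the iteration range is divided. Reserving a core trades a slightly longer run time for a machine that remains usable, which mirrors the trade-off described above.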

C KEEP WORKING : Control overloading while running calculations
Trelles O