To run Big Data Analytics properly, small and medium enterprises (SMEs) need to acquire expertise, hardware and software, which often translates into significant upfront investment in activities not directly connected to the company's business. To reduce such investment, the TOREADOR project proposes a Big Data Analytics framework that supports users in devising their own Big Data solutions while keeping the associated costs to a minimum and leveraging pre-existing knowledge and expertise. One objective of the TOREADOR framework is to support developers in parallelizing and deploying their Big Data algorithms, so that they can develop their own analytics solutions. This paper describes the Code-Based approach, adopted within the TOREADOR framework to parallelize users’ algorithms and deploy them on distributed platforms by annotating parallelizable portions of code with parallelization primitives. The approach uses Parallel Patterns to guide the parallelization and Skeletons to automatically build execution and deployment templates. It is realized through a source-to-source Compiler, which is also described in this paper.