This special issue of Concurrency and Computation: Practice and Experience contains revised and extended versions of selected papers presented at Euro-Par 2015. Euro-Par, the European Conference on Parallel Computing, is an annual series of international conferences dedicated to the promotion and advancement of all aspects of parallel and distributed computing. Euro-Par covers a wide spectrum of topics, from algorithms and theory to software technology and hardware-related issues, with application areas ranging from scientific to mobile and cloud computing. The major part of the Euro-Par audience consists of researchers in academic institutions, government laboratories and industrial organisations.

Euro-Par 2015, the 21st conference in the Euro-Par series, was held in Vienna, Austria. It was organised by the Research Group for Parallel Computing of the Vienna University of Technology (TU Wien). Thirteen broad topics were defined and advertised, covering a wide variety of aspects of parallel and distributed computing. The call for papers attracted a total of 190 submissions. Each submission received at least three reviews and, in most cases, four or more (four on average). A total of 51 papers were accepted for publication, yielding an overall acceptance rate of 27%. The authors of the accepted papers came from 21 countries, with the four main contributing countries (the United States, France, Spain and Germany) accounting for slightly more than half of them.

Based on the review results and a majority opinion of the respective topic programme committees, a number of papers were recommended for this special issue. The authors were contacted at the conference and invited to submit revised and extended versions of their papers. These new versions were each reviewed independently by three reviewers: two who had previously reviewed the conference version and one who had not. Eventually, four papers were accepted for publication.
In this special issue, two Euro-Par topics are represented, both covering methods of programming modern computer architectures. Topic 13 on Accelerator Computing is represented with three papers.

The paper Performance optimization of sparse matrix-vector multiplication for multi-component PDE-based applications using GPUs, authored by Ahmad Abdelfattah, Hatem Ltaief, David Keyes and Jack Dongarra [1], describes the implementation of single-GPU and multi-GPU kernels for block-sparse matrix-vector multiplication, a problem that arises in the discretisation of partial differential equations with many dependent variables. The performance of the kernel is measured on a subset of the Florida Sparse Matrix Collection. The reviewers especially noted the uniform interface, which, via tunable parameters, applies to a wide range of problem sizes and performs efficiently on a wide range of GPU architectures running CUDA.

The paper Fast parallel skew and prefix-doubling suffix array construction on the GPU, authored by Leyuan Wang, Sean Baxter and John D. Owens [2], proposes a hybrid GPU implementation of known suffix-array construction algorithms that selects the variant best suited to the given GPU architecture. One highlight pointed out in the reviews is a highly efficient segmented sorting primitive, which is also valuable as an independent result.

The paper Performance and portability of accelerated lattice Boltzmann applications with OpenACC, authored by Enrico Calore, Jiri Kraus, Sebastiano Fabio Schifano and Raffaele Tripiccione [3], reports on a performance study, based on a simple performance model, of an OpenACC-based lattice Boltzmann implementation on three different architectures: an NVIDIA GPU, an AMD GPU and a multi-core CPU. The practical relevance of this work was particularly appreciated.
Topic 09 on Multi- and Many-Core Programming is represented by the paper Continuous skyline queries on multicore architectures, authored by Tiziano De Matteis, Salvatore Di Girolamo and Gabriele Mencagli [4]. A skyline query returns all tuples whose attribute vectors are not dominated (in the Pareto sense) by any other tuple. The authors provide a thorough description and experimental evaluation of a parallelisation of the 'eager' algorithm for skyline computation of points in d-dimensional space over continuous data streams, with expiration times defined by a sliding, constant-sized time window. The reviewers judged it a very nice case study of the use of FastFlow, a framework for high-level, pattern-based parallel programming in C++.

Concluding this preface, we would like to thank Prof. Geoffrey Fox, editor-in-chief of Concurrency and Computation: Practice and Experience, for his support of this special issue. We would also like to thank our peers who assisted us in reviewing the papers and helped strengthen the final versions. Last, but not least, we appreciate the support of Springer, which agreed to the publication of extended versions of the articles that originally appeared in the Lecture Notes in Computer Science.