Abstract

The parallel computing capabilities of GPUs can substantially accelerate computationally intensive iterative tasks, and offloading part or all of a deep learning workload from the CPU to the GPU has become mainstream practice. However, the iterative process of such tasks contains a large number of redundant calculations. We therefore propose a GPU-based distributed incremental iterative computing architecture that makes full use of distributed parallel computing and the GPU memory hierarchy. The architecture supports deep learning and other computationally intensive iterative applications by optimizing data placement and reducing redundant iterative calculations. To support block-based data partitioning and coalesced memory access on GPUs, we propose an abstract data set called GDataSet. A GPU incremental iteration manager, GTracker, is responsible for GDataSet cache management on the GPU. To overcome the limited size of on-chip memory, we propose a variable sliding window mechanism that improves the cache hit rate and data access speed by arranging blocks optimally between on-chip and off-chip memory. In addition, a communication channel based on the incremental iterative model is designed to support data transmission and task communication across the cluster. Finally, we implement the proposed architecture on Spark 2.4.1 and CUDA 10.0. Comparative experiments with widely used computationally intensive iterative applications (K-means, LSTM, etc.) show that the incremental iterative acceleration architecture significantly improves the efficiency of iterative computing.
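
To give a rough feel for the sliding-window idea, the following is a minimal CUDA sketch, not the paper's implementation: the kernel name slidingWindowSum, the fixed TILE size, and the per-block reduction are assumptions made here for brevity. It stages fixed-size windows of a block-partitioned data set into on-chip shared memory with coalesced global loads, sliding the window across each thread block's portion of off-chip memory.

// Hypothetical sketch of staging off-chip data through an on-chip window.
// Names (slidingWindowSum, TILE) are illustrative, not taken from the paper.
#include <cstdio>
#include <cuda_runtime.h>

#define TILE 256  // on-chip window size: elements staged per block per step

__global__ void slidingWindowSum(const float* data, float* blockSums, int n) {
    __shared__ float window[TILE];   // on-chip (shared memory) staging buffer
    float acc = 0.0f;

    // Slide the window across the slice of global memory owned by this block.
    for (int base = blockIdx.x * TILE; base < n; base += gridDim.x * TILE) {
        int idx = base + threadIdx.x;
        window[threadIdx.x] = (idx < n) ? data[idx] : 0.0f;  // coalesced load
        __syncthreads();

        if (threadIdx.x == 0) {      // simple per-window reduction on-chip
            for (int i = 0; i < TILE; ++i) acc += window[i];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) blockSums[blockIdx.x] = acc;
}

int main() {
    const int n = 1 << 20;
    const int blocks = 64;
    float *d_data, *d_sums;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_sums, blocks * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    slidingWindowSum<<<blocks, TILE>>>(d_data, d_sums, n);
    cudaDeviceSynchronize();

    float h_sums[blocks];
    cudaMemcpy(h_sums, d_sums, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    printf("block 0 partial sum: %f\n", h_sums[0]);

    cudaFree(d_data);
    cudaFree(d_sums);
    return 0;
}

The architecture's variable sliding window additionally adapts the window size to find the best block arrangement between on-chip and off-chip memory; the fixed TILE above is a simplification for illustration only.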
