Abstract

Today, most heterogeneous architectures are assembled from IP modules supplied by different hardware vendors, but this model is inefficient. To address this problem, AMD, together with other hardware vendors, proposed the Heterogeneous System Architecture (HSA) specification. On the one hand, HSA helps developers speed up design and programming; on the other, it improves system performance and reduces power consumption. In this paper, we present the implementation of a framework for accelerating the training and classification of arbitrary Convolutional Neural Networks (CNNs) on HSA. Building on this implementation, we present two acceleration methods: updating the weights online and letting the CPU participate in the computation. Experimental results show that CNNs implemented on HSA run 4 to 10 times faster than on the CPU.
