Abstract

Deep learning models require massive amounts of compute and tend to achieve better performance when run on special-purpose accelerators designed to speed up compute-intensive applications. Accelerators such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) are widely used hardware platforms for deep learning and can often outperform CPUs thanks to their massive parallel execution resources and high memory bandwidth. Google Colaboratory, known as Colab, is a cloud service based on Jupyter Notebook that lets users write and execute (mostly Python) code in a browser and provides free access to TPUs and GPUs without any extra configuration, making these widely available cloud hardware platforms. In this paper, we present a thorough comparison of the hardware platforms available on Google Colab, benchmarked with distributed Bidirectional Long Short-Term Memory (dBLSTM) models while varying the number of layers, the number of units per layer, and the numbers of input and output units of the datasets. Human Activity Recognition (HAR) data from the UCI Machine Learning Repository are applied to the proposed distributed bidirectional LSTM model to assess the performance, strengths, and bottlenecks of the TPU, GPU, and CPU platforms with respect to hyperparameters, execution time, and the evaluation metrics of accuracy, precision, recall, and F1 score.
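As a rough illustration of the benchmarking setup described above, the following minimal sketch (not the authors' code) builds a stacked bidirectional LSTM in TensorFlow/Keras on Colab, selecting a TPU strategy when a TPU runtime is attached and falling back to the default (GPU or CPU) strategy otherwise. The window shape (128 timesteps x 9 channels), the 6 activity classes, and the 64 units per layer are assumptions based on the standard UCI HAR dataset, not values reported in the paper.

```python
import tensorflow as tf

# Use a TPU if the Colab runtime provides one; otherwise fall back to
# the default strategy (GPU if available, else CPU). Standard TF 2.x APIs.
try:
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
except (ValueError, tf.errors.NotFoundError):
    strategy = tf.distribute.get_strategy()

# Assumed shapes for the UCI HAR dataset: windows of 128 timesteps over
# 9 inertial-signal channels, classified into 6 activities.
TIMESTEPS, CHANNELS, NUM_CLASSES = 128, 9, 6

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(TIMESTEPS, CHANNELS)),
        # Two stacked bidirectional LSTM layers; the paper treats the
        # number of layers and units per layer as hyperparameters.
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(64, return_sequences=True)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

model.summary()
```

Training this same model under each Colab runtime (CPU, GPU, TPU) and timing model.fit, then computing accuracy, precision, recall, and F1 on the test split, would reproduce the kind of execution-time and metric comparison the paper performs.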
