Deep learning based action recognition has become ubiquitous in video analysis; however, large neural networks require enormous computation to achieve high performance, which hinders their deployment in mobile applications that are tightly constrained by hardware resources. In this work, we introduce ARA, a highly compact and fast neural network based Action Recognition Accelerator for terminal devices. We build an LSTM based spatio-temporal action recognition model that takes time-series features extracted from RGB frames and flow features extracted from optical flow fields. The LSTM based spatio-temporal model is then deeply compressed with tensor decomposition to remove redundant parameters and reduce computation overhead. On the UCF-11, UCF-101, and HMDB51 datasets, our proposed method achieves classification accuracies of 95.87%, 94.08%, and 75.71%, respectively, which is comparable with other state-of-the-art methods. In particular, our proposed method reduces the number of parameters of the LSTM model by 215× on the UCF-101 dataset. The proposed system also runs at 157.7 FPS on a GPU. Furthermore, we validate the proposed system on an ARM-based terminal device; the results show a latency of only 0.017 s and a power consumption of 4.73 W.
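
As a rough illustration of why tensor decomposition can shrink an LSTM weight matrix so sharply, the following Python sketch compares the parameter count of a dense weight matrix with that of a factorized one. The abstract does not state which decomposition is used; the tensor-train format, the mode sizes, and the ranks below are assumptions chosen only for illustration and are not the paper's configuration, so the resulting ratio should not be read as a reproduction of the reported 215× figure.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Parameter count of a dense matrix of shape prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Parameter count of a tensor-train (TT) factorized matrix.

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]),
    with boundary ranks fixed to 1.
    """
    if not (len(in_modes) == len(out_modes) == len(ranks) - 1):
        raise ValueError("need one rank more than there are modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical shapes (assumptions, not taken from the paper):
# a 4096-dimensional input feature mapped to a 256-dimensional hidden state.
in_modes = (8, 8, 8, 8)    # 8 * 8 * 8 * 8 = 4096
out_modes = (4, 4, 4, 4)   # 4 * 4 * 4 * 4 = 256
ranks = (1, 4, 4, 4, 1)    # hypothetical TT ranks

dense = dense_params(in_modes, out_modes)
tt = tt_params(in_modes, out_modes, ranks)
ratio = dense / tt

print(f"dense: {dense} parameters, TT: {tt} parameters, ratio: {ratio:.1f}x")
```

In this sketch the dense cost grows with the product of all mode sizes, while the factorized cost grows with their sum (scaled by the ranks), which is why small ranks give large compression ratios.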