Abstract

The erhu is a bowed string instrument originating from China. Playing it correctly requires following rules on how to position the player's body and how to hold the instrument, so a system is needed that can detect every movement of an erhu player. This study discusses action recognition on video using the 3D-CNN and LSTM methods. The 3D Convolutional Neural Network (3D-CNN) is based on the standard CNN, extended with convolutions over the temporal dimension. To improve the model's ability to capture the information contained in each movement, an LSTM layer is combined with the 3D-CNN model; the LSTM is capable of handling the vanishing-gradient problem faced by plain RNNs. This research uses RGB video as the dataset, and preprocessing and feature extraction cover three main parts: the body, the erhu pole, and the bow. The body segment is preprocessed and its features extracted using body landmarks, while the erhu and bow segments use the Hough Lines algorithm. For the classification process, we propose two algorithms, namely a traditional algorithm and a deep learning algorithm. These two classification algorithms produce an error-message output for every movement of the erhu player.
