Abstract Human emotion recognition has become an important research field. Because of their objectivity, physiological signals have become one of the most robust cues for emotion recognition. In recent years, deep learning methods have made great progress in emotion recognition; recurrent neural networks (RNNs) in particular have shown strong performance on time-series models, and an increasing number of tasks are built on them. However, RNNs suffer from long training times and vanishing or exploding gradients, and their feature inputs are not aligned with the emotion outputs. To avoid these problems, this paper addresses the emotion recognition task with a temporal convolutional network (TCN) model and the connectionist temporal classification (CTC) algorithm. First, a spectrogram representation is generated for the physiological signal in each channel. Second, the TCN learns long-term dynamic features, CTC aligns these dynamic features with their emotion labels, and the learned deep features are fed into a neural network to predict the emotion of each channel. Finally, the best per-channel result is taken as the final emotion representation. Experimental results on the AMIGOS dataset show that the proposed method outperforms existing methods.
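To make the described pipeline concrete, the following is a minimal PyTorch sketch of a TCN with dilated causal convolutions whose per-frame outputs are trained with a CTC loss, roughly mirroring the spectrogram-to-label flow outlined above. All layer sizes, the number of blocks, and the two-class label set are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of a TCN + CTC pipeline for per-channel spectrogram input.
# Hyperparameters (hidden width, levels, kernel size, class count) are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalBlock(nn.Module):
    """One dilated causal convolution block with a residual connection."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation          # left padding keeps the conv causal
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):                                 # x: (batch, channels, time)
        out = F.relu(self.conv(F.pad(x, (self.pad, 0))))
        return out + self.downsample(x)                   # residual connection

class TCNWithCTC(nn.Module):
    """Stacked temporal blocks followed by a per-frame classifier trained with CTC."""
    def __init__(self, n_freq_bins=64, n_classes=2, hidden=64, levels=4):
        super().__init__()
        blocks, in_ch = [], n_freq_bins
        for i in range(levels):
            blocks.append(TemporalBlock(in_ch, hidden, kernel_size=3, dilation=2 ** i))
            in_ch = hidden
        self.tcn = nn.Sequential(*blocks)
        self.classifier = nn.Linear(hidden, n_classes + 1)  # +1 for the CTC blank symbol

    def forward(self, spec):                               # spec: (batch, freq_bins, time)
        feats = self.tcn(spec)                             # (batch, hidden, time)
        logits = self.classifier(feats.transpose(1, 2))    # (batch, time, classes+1)
        return logits.log_softmax(dim=-1)

# Toy usage: a batch of one channel's spectrograms, one emotion label per sequence.
model = TCNWithCTC()
spec = torch.randn(8, 64, 100)                             # 8 spectrograms, 64 bins, 100 frames
log_probs = model(spec).transpose(0, 1)                    # CTCLoss expects (time, batch, classes)
targets = torch.randint(1, 3, (8, 1))                      # class ids 1..2 (0 is the blank)
input_lens = torch.full((8,), 100, dtype=torch.long)
target_lens = torch.ones(8, dtype=torch.long)
loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lens, target_lens)
loss.backward()
```

In this sketch the dilated, left-padded convolutions give the network a growing temporal receptive field without recurrence, and the CTC loss handles the alignment between frame-level features and a single sequence-level emotion label, which is the role the abstract assigns to CTC.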