Abstract

In recent years, attention mechanisms have gained popularity in sequential recommender systems (SRSs) because they capture dynamic user preferences efficiently. However, the over-parameterization of these models often increases the risk of overfitting. To address this challenge, we propose a Transformer model based on tensor train networks. First, we introduce a tensor train layer (TTL) that stores the original weight matrix in factorized form, reducing the space complexity of the mapping layer. Building on the TTL, we reconfigure the multi-head attention module and the position-wise feed-forward network. Finally, a tensor train layer replaces the output layer to complete the overall compression. Experimental results show that the proposed model compresses the parameters of SRSs effectively, achieving compression rates of 76.2%–85.0% while maintaining or improving sequence recommendation performance. To our knowledge, the Tensor Train Transformer is the first model-compression approach for Transformer-based SRSs, and it is broadly applicable.
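The building block the abstract describes, a linear map whose weight matrix is kept in tensor train format rather than as a dense matrix, can be sketched as follows. This is a minimal illustration under assumed conventions (the standard TT-matrix format with cores of shape (r_{k-1}, m_k, n_k, r_k)); the class name `TTLinear`, the mode and rank choices, and the initialization are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
from math import prod

class TTLinear(nn.Module):
    """Linear layer whose weight matrix is stored in tensor-train (TT)
    format instead of as a dense (in_features, out_features) matrix.
    Minimal sketch; names and initialization are assumptions."""

    def __init__(self, in_modes, out_modes, ranks):
        # in_modes / out_modes factorize the layer dimensions,
        # e.g. 512 = 8 * 8 * 8 -> in_modes = (8, 8, 8).
        # ranks are the TT-ranks, with ranks[0] == ranks[-1] == 1.
        super().__init__()
        assert len(in_modes) == len(out_modes) == len(ranks) - 1
        assert ranks[0] == ranks[-1] == 1
        self.in_modes = tuple(in_modes)
        self.cores = nn.ParameterList([
            nn.Parameter(0.1 * torch.randn(ranks[k], in_modes[k],
                                           out_modes[k], ranks[k + 1]))
            for k in range(len(in_modes))
        ])
        self.bias = nn.Parameter(torch.zeros(prod(out_modes)))

    def forward(self, x):
        # x: (batch, prod(in_modes)). Contract one TT-core at a time,
        # never materializing the full weight matrix.
        b = x.shape[0]
        h = x.reshape(b, 1, 1, -1)  # (batch, out_acc, rank, in_rest)
        for k, core in enumerate(self.cores):
            # split the next input mode m_k off the remaining input dims
            h = h.reshape(b, h.shape[1], core.shape[0], self.in_modes[k], -1)
            # contract the current rank r and input mode m with the core
            h = torch.einsum('bormt,rmns->bonst', h, core)
            # merge the new output mode n_k into the output accumulator
            h = h.reshape(b, h.shape[1] * h.shape[2], h.shape[3], -1)
        return h.reshape(b, -1) + self.bias

# Example: compress a 512 -> 2048 projection.
layer = TTLinear(in_modes=(8, 8, 8), out_modes=(8, 16, 16),
                 ranks=(1, 8, 8, 1))
y = layer(torch.randn(32, 512))  # y.shape == (32, 2048)
```

With these illustrative modes and ranks, the TT cores hold roughly 9.7K parameters (512 + 8192 + 1024) in place of the dense matrix's 512 × 2048 ≈ 1.05M, which is the kind of saving that lets the attention projections, feed-forward network, and output layer all be compressed in the same way.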


