CSformer: Enhancing deep learning efficiency for intelligent IoT

Xu Jia, Han Wu, Ruochen Zhang, Min Peng

doi:10.1016/j.comcom.2023.11.007

Abstract

The rapid development of deep learning technology has increased demand for more intelligent, automated, and humanized Internet of Things (IoT) devices. Deep learning models give IoT devices the ability to learn higher-level features, but they also impose heavier computational and storage requirements on the hardware. To address this challenge and make deep learning models practical on IoT devices, we propose CSformer, a novel efficient Transformer that combines intra-layer cluster and inter-layer selection. Intra-layer cluster uses a k-means++ based generation algorithm to improve clustering accuracy. To offset the information loss caused by clustering, we propose cluster center information enhancement and clustering loss calculation modules. The inter-layer selection strategy selects tokens according to their contribution, continually reducing redundancy while prioritizing the retention of crucial information. By continually shortening the sequence, inter-layer selection significantly improves training speed and reduces the model's memory occupation. Experimental results show that in two common intelligent IoT scenarios, text classification and sequence labeling, CSformer significantly outperforms the baseline models. Specifically, in the text classification task, our model achieves on average a 22.66% reduction in memory consumption, a 37.98% reduction in time consumption, and a 9.58% performance improvement over the baseline models across six datasets. Ablation experiments, overall performance comparisons, and visualization further substantiate the efficacy of the intra-layer cluster and inter-layer selection modules.
The intra-layer cluster module improves the performance of existing models through more precise clustering and reduced information loss. The inter-layer selection module improves the efficiency of existing approaches: by retaining only essential tokens, it reduces memory consumption and improves computational efficiency. Together, these modules can facilitate future research on applying advanced deep learning models to intelligent IoT and broaden the range of application scenarios and tasks in the field.
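
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the CSformer implementation: it only shows generic k-means++ seeding followed by standard Lloyd iterations over token vectors, and a top-k token filter driven by a per-token score. The function names (`kmeans_pp_seeds`, `kmeans`, `select_tokens`) and the idea that a contribution score is supplied as a plain list of floats are assumptions made for illustration; the abstract does not specify how contribution is computed, and the cluster center information enhancement and clustering loss modules are not modeled here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """k-means++ seeding: the first center is drawn uniformly; each later
    center is drawn with probability proportional to its squared distance
    from the nearest center chosen so far."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers


def kmeans(points, k, rng, iters=10):
    """Plain Lloyd iterations started from k-means++ seeds.
    Returns (centers, labels); labels are the last assignment made."""
    centers = kmeans_pp_seeds(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def select_tokens(scores, keep):
    """Hypothetical selection step: keep the `keep` highest-scoring token
    positions and return them in their original sequence order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(top[:keep])


if __name__ == "__main__":
    tokens = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    _, labels = kmeans(tokens, k=2, rng=random.Random(0))
    print(labels)
    print(select_tokens([0.1, 0.9, 0.3, 0.7], keep=2))
```

Returning the kept positions in sequence order preserves the relative order of the surviving tokens, which matters for order-sensitive tasks such as sequence labeling.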

Full Text