Abstract

In recent years, the rapid development of deep learning has spurred extensive research into machining condition monitoring technologies. However, the complexity of the multi-modal signals present in actual machining scenarios poses a significant challenge for signal fusion, and most models struggle to balance performance against deployment cost. In this research, a multi-scale hybrid attention aggregation network (MHAAN) is proposed for multi-modal monitoring of laser-induced thermal-crack processing. Based on the feature distributions of the visual and acoustic emission (AE) signals, a multi-spatial-scale aggregation module (MSSAM) and a multi-frequency-domain aggregation module (MFDAM) are designed. A cross-modal dot-product interaction module then performs multilevel interactive fusion of the extracted feature vectors: it transfers features across modalities and produces the multi-modal fusion feature. The practicality of MHAAN is validated on a dataset collected from glass cutting by laser-induced thermal-crack propagation (LITP). Comparisons with commonly used models demonstrate the superiority of MHAAN in multi-modal feature extraction and fusion. Finally, ablation experiments confirm the effectiveness of the prior-knowledge-based module design, improving the interpretability of the model.
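The abstract does not give implementation details for the cross-modal dot-product interaction module. The following is a minimal sketch of one plausible reading, assuming PyTorch: each modality queries the other via scaled dot-product attention, and the two interaction outputs are pooled and concatenated into a single fusion feature. All names (`CrossModalDotProduct`, `FusionHead`, `d_model`) and the bidirectional-plus-mean-pooling design are hypothetical illustrations, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalDotProduct(nn.Module):
    """Scaled dot-product interaction: modality A queries modality B."""
    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)  # queries from modality A
        self.k = nn.Linear(d_model, d_model)  # keys from modality B
        self.v = nn.Linear(d_model, d_model)  # values from modality B

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a: (batch, len_a, d_model), e.g. visual feature vectors
        # feat_b: (batch, len_b, d_model), e.g. AE feature vectors
        q, k, v = self.q(feat_a), self.k(feat_b), self.v(feat_b)
        attn = F.softmax(q @ k.transpose(-2, -1) / k.size(-1) ** 0.5, dim=-1)
        return attn @ v  # modality-B information routed to modality A

class FusionHead(nn.Module):
    """Bidirectional interaction, pooled into one multi-modal fusion feature."""
    def __init__(self, d_model: int):
        super().__init__()
        self.vis_from_ae = CrossModalDotProduct(d_model)
        self.ae_from_vis = CrossModalDotProduct(d_model)

    def forward(self, vis: torch.Tensor, ae: torch.Tensor) -> torch.Tensor:
        fused_vis = self.vis_from_ae(vis, ae).mean(dim=1)  # visual enriched by AE
        fused_ae = self.ae_from_vis(ae, vis).mean(dim=1)   # AE enriched by visual
        return torch.cat([fused_vis, fused_ae], dim=-1)    # (batch, 2 * d_model)
```

Under this reading, the dot-product attention map is what enables "feature transfer across different modalities": each visual feature vector is re-expressed as a weighted combination of AE feature vectors, and vice versa, before the two streams are merged.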
