As consumer electronics become more intelligent, their automation and complexity also increase, making faults harder for users to diagnose. Users who rely solely on their own knowledge often struggle to diagnose such faults promptly and accurately. To address this, we propose a text generation method based on a multimodal knowledge graph. The method first uses deep learning models such as the Residual Network (ResNet) and Bidirectional Encoder Representations from Transformers (BERT) to extract features from user-provided fault information, which can include images, text, audio, and even olfactory data. The features from each modality are then fused into a single comprehensive representation. The fused features are fed into a graph convolutional network (GCN) for fault inference, which identifies potential fault nodes in the device. These fault nodes are then looked up in a pre-constructed knowledge graph to determine the final diagnosis. Finally, a Chinese Pre-trained Transformer (CPT) model enhanced with Bias-term Fine-tuning (BitFit) generates the fault diagnosis text for the user. Experimental results show that the proposed method reaches a fault diagnosis accuracy of 98.4%, a 4.4% improvement over baseline methods. By combining graph convolutional network and knowledge graph technologies, the approach makes effective use of multimodal fault information and addresses the difficulty users face in diagnosing faults.
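The fusion and GCN inference steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fusion operator, the graph layout, or the layer sizes. Concatenation as the fusion operator, the three-node component graph, the weight matrix `w`, and all function names here are assumptions chosen for illustration, and the short vectors stand in for real ResNet and BERT embeddings.

```python
import math


def concat_fusion(*features):
    """Fuse per-modality feature vectors by concatenation (assumed operator)."""
    fused = []
    for vec in features:
        fused.extend(vec)
    return fused


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, h, w):
    """One GCN propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Stand-ins for modality embeddings (hypothetical values).
image_feat = [0.2, 0.7]
text_feat = [0.5]
fused = concat_fusion(image_feat, text_feat)  # [0.2, 0.7, 0.5]

# Hypothetical component graph with three nodes connected in a chain 0-1-2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
adj_norm = normalize_adjacency(adj)

# Assumption: every node starts from the same fused vector.
h = [list(fused) for _ in adj]
w = [[1.0], [1.0], [-1.0]]  # illustrative 3x1 weights, not learned

scores = gcn_layer(adj_norm, h, w)  # one non-negative score per node
```

In a trained model the weights would be learned and the per-node scores would pass through further layers before fault nodes are selected; here the single layer only shows how neighbouring nodes share information through the normalized adjacency matrix.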