Thangka Image Research Articles

Image–text matching is an active research topic in multimodal learning, which integrates image and text processing. To address the difficulty of associating image and text data in the multimodal knowledge graph of Thangka, we propose an image–text matching method based on the Visual Semantic Embedding (VSE) model. The method introduces an adaptive pooling layer to improve the extraction of features that capture semantic associations between Thangka images and texts. We also modify the traditional Transformer architecture by combining bidirectional residual concatenation with a masked attention mechanism, which improves the stability of the matching process and the ability to extract semantic information. In addition, we design a multi-granularity tag alignment module that maps global and local features of images and text into a common coding space, leveraging inter- and intra-modal semantic associations to improve image–text matching accuracy. Comparative experiments on the Thangka dataset show that our method achieves significant improvements over the VSE baseline. Specifically, it improves recall by 9.4% for image-to-text matching and by 10.5% for text-to-image matching. Furthermore, without any large-scale corpus pre-training, our method outperforms all models without pre-training and two of the four pre-trained models on the public Flickr30k dataset. The execution efficiency of our model is also an order of magnitude higher than that of the pre-trained models, which highlights its performance and efficiency in the image–text matching task.
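
The abstract reports recall for both retrieval directions without stating how it is computed. As an illustration only, the sketch below shows one common way to compute Recall@K from an image-by-text similarity matrix. It assumes a one-to-one pairing in which image i matches text i; the function names `recall_at_k` and `transpose` and the toy matrix are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def recall_at_k(sim: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of queries whose matching candidate ranks in the top k.

    Assumption: sim[i][j] is the similarity between query i and
    candidate j, and the correct candidate for query i is candidate i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank candidate indices by descending similarity.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(sim)


def transpose(sim: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap rows and columns so that texts become the queries."""
    return [list(col) for col in zip(*sim)]


# Toy similarity matrix (rows: images, columns: texts), made up for the demo.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.6],
]

image_to_text_r1 = recall_at_k(sim, 1)             # images query texts
text_to_image_r1 = recall_at_k(transpose(sim), 1)  # texts query images
print(image_to_text_r1, text_to_image_r1)
```

Datasets such as Flickr30k pair each image with several captions, so a real evaluation would need a mapping from each query to its set of correct candidates rather than the diagonal assumed here.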

Object detection networks applied to defect detection in thangka images with complex background colors suffer from poor detection of small targets, insufficient feature extraction, a tendency toward false and missed detections, and low detection accuracy. To address these problems, this paper proposes a YOLOv5 defect detection algorithm combining an attention mechanism and the receptive field. First, the Backbone network performs feature extraction and integrates an attention mechanism to represent different features, so that the network can fully extract the texture and semantic features of the defect area; the extracted features are then weighted and fused to reduce information loss. Second, the Neck network passes on the weighted fusion of features of different dimensions, and the combination of FPN and PAN fuses the semantic and texture features of different layers to locate the defect target more accurately. Finally, the GIoU loss function is replaced with CIoU and the receptive field is added to the network, so that the algorithm uses a four-channel detection mechanism to expand the detection range of the receptive fields and fuses semantic information between different network layers, achieving fast localization and finer processing of small targets. The experimental results show that, compared with the original YOLOv5 network, the detection accuracy of the proposed YOLOv5-scSE and YOLOv5-CA networks improves by 9.96 and 12.22 percentage points, respectively, and the validation metrics improve significantly. The method identifies and locates defect areas more quickly and accurately and generalizes better across defect categories, which greatly improves the accuracy of thangka image defect detection.
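
The abstract mentions replacing GIoU with CIoU but does not spell out either measure. The following is a minimal sketch of the standard published definitions, not the authors' implementation. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive width and height; the helper names and the `eps` guard are hypothetical choices made for this example. The corresponding losses are `1 - giou(...)` and `1 - ciou(...)`.

```python
import math

Box = tuple  # (x1, y1, x2, y2), assumed x2 > x1 and y2 > y1


def _iou_parts(a: Box, b: Box):
    """Return IoU, union area, and the smallest box enclosing a and b."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    enclose = (min(a[0], b[0]), min(a[1], b[1]),
               max(a[2], b[2]), max(a[3], b[3]))
    return inter / union, union, enclose


def giou(a: Box, b: Box) -> float:
    """Generalized IoU: IoU minus the share of the enclosing box not covered."""
    iou, union, c = _iou_parts(a, b)
    c_area = (c[2] - c[0]) * (c[3] - c[1])
    return iou - (c_area - union) / c_area


def ciou(a: Box, b: Box, eps: float = 1e-9) -> float:
    """Complete IoU: IoU minus center-distance and aspect-ratio penalties."""
    iou, _, c = _iou_parts(a, b)
    # Squared distance between box centers.
    rho2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
         + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    # Squared diagonal of the enclosing box.
    c2 = (c[2] - c[0]) ** 2 + (c[3] - c[1]) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((b[2] - b[0]) / (b[3] - b[1]))
        - math.atan((a[2] - a[0]) / (a[3] - a[1]))
    ) ** 2
    alpha = v / (1 - iou + v + eps)  # eps avoids 0/0 for identical boxes
    return iou - rho2 / (c2 + eps) - alpha * v


pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
print(giou(pred, target), ciou(pred, target))
```

Unlike GIoU, CIoU penalizes the distance between box centers and aspect-ratio mismatch directly, which is the usual motivation for the swap when localizing small targets.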

Articles published on Thangka Image

Thangka Image—Text Matching Based on Adaptive Pooling Layer and Improved Transformer

Thangka Image Captioning Based on Semantic Concept Prompt and Multimodal Feature Optimization

Few Shot Object Detection for Tangka Seats Based on Deformable Convolution

Application of YOLOv5 Neural Network Based on Improved Attention Mechanism in Recognition of Thangka Image Defects

Thangka Image Segmentation Method Based on Enhanced Receptive Field

Application of YOLOv5 Based on Attention Mechanism and Receptive Field in Identifying Defects of Thangka Images

A Semantic Segmentation Model for Headdresses in Thangka Image Based on Line Drawing Augmentation and Spatial Prior Knowledge

Imbalanced Thangka Image Classification Research Based on the ResNet Network

Few-shot Thangka image classification based on improved DenseNet

Bibliometric Analysis of the Research Status of Tangka Images at Home and Abroad

No-Reference Quality Assessment Method for Inpainted Color Thangka Images Based on Multiple Features

Research on a Thangka Image Classification Method Based on Support Vector Machine

A new method of Thangka image inpainting quality assessment

A new quality assessment for Thangka image inpainting

Algorithm Optimization for the Edge Extraction of Thangka Images

Damaged region filling by improved criminisi image inpainting algorithm for thangka

Thangka Image Retrieval System Based on GLCM

Improved exemplar-based inpainting algorithm for broken Thangka images
