Abstract

Tibetan medicine is widely acclaimed for its distinctive diagnostic and treatment methods. Identifying Tibetan medicinal materials, a vital component of Tibetan medicine, is a key research area in this field. However, conventional deep learning-based vision networks struggle to identify these materials efficiently and accurately because the materials are numerous and morphologically complex, and public visual datasets are scarce. To address this issue, we construct a computer vision dataset covering 300 Tibetan medicinal materials and propose a lightweight and efficient cross-dimensional attention mechanism, the Dual-Kernel Split Attention (DKSA) module, which adaptively shares kernel parameters across both the spatial and channel dimensions. Building on the DKSA module, we efficiently unify convolution and self-attention within a CNN architecture and develop a new lightweight backbone, EDKSANet, that provides enhanced performance on a range of computer vision tasks. Compared with RedNet, EDKSANet improves top-1 accuracy by 1.2% on the ImageNet dataset and gains +1.5 box AP for object detection and +1.3 mask AP for instance segmentation on the MS-COCO dataset. Moreover, EDKSANet achieves excellent classification performance on the Tibetan medicinal materials dataset, with an accuracy of up to 96.85%.

