Tooth-Marked Tongue Recognition Using Gradient-Weighted Class Activation Maps

Yue Sun, Yin Zhang, Songmin Dai, Xiaoqiang Li, Jide Li

doi:10.3390/fi11020045

Open Access

Abstract

The tooth-marked tongue is an important indicator in traditional Chinese medicine diagnosis. However, the clinical competence of tongue diagnosis depends on the experience and knowledge of the practitioner. Because tongues vary widely in color and shape, tooth-marked tongue recognition is challenging. Most existing methods focus on local concave features and use specific threshold values to classify the tooth-marked tongue. They discard the overall tongue information, generalize poorly, and are hard to interpret. In this paper, we address these problems by proposing a visual explanation method that takes the entire tongue image as input and uses a convolutional neural network to extract features, instead of setting a fixed threshold manually. It then classifies the tongue and produces a coarse localization map highlighting tooth-marked regions using Gradient-weighted Class Activation Mapping. Experimental results demonstrate the effectiveness of the proposed method.

Highlights

Inspection of the tongue is one of the most important diagnostic methods in traditional Chinese medicine (TCM)
Figure 1 shows the appearance of the tooth-marked tongue: (a) is a normal tongue image for reference, while (b.1) and (b.2) are tooth-marked tongue images with the tooth-marked regions outlined in blue boxes
The recognition of tooth-marked tongues can be viewed as a fine-grained classification problem, but it is more challenging than distinguishing between subcategories due to some specific difficulties in the field of tongue diagnosis

Summary

Introduction

Inspection of the tongue is one of the most important diagnostic methods in traditional Chinese medicine (TCM). Existing approaches cannot be decomposed into intuitive and understandable components, which makes tongue diagnoses hard to interpret. These limitations lead us to turn to Gradient-weighted Class Activation Mapping (Grad-CAM). Grad-CAM was proposed by Selvaraju et al. [9] to provide visual explanations of convolutional neural networks (CNNs). It uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the regions of the image that are important for predicting the concept. We present a method that accurately classifies tooth-marked tongues and localizes the image regions that are important for predicting the pathology, without requiring bounding boxes. Through the visual interpretation of the tooth-mark problem, we explore the effect of different receptive field sizes on the classification results.
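
The core Grad-CAM computation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name grad_cam and the toy arrays are hypothetical, and it assumes the final convolutional layer's activations and the gradients of the target class score with respect to them have already been extracted from a trained CNN (for example with a framework's hook or autograd facilities).

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Coarse Grad-CAM map (hypothetical helper, for illustration only).

    activations: (K, H, W) feature maps of the final convolutional layer.
    gradients:   (K, H, W) gradients of the target class score with
                 respect to those feature maps.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of the feature maps, followed by ReLU so that only
    # regions with positive influence on the class score remain.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Scale to [0, 1] for display; leave an all-zero map unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (made-up values): channel 0 supports the class, channel 1 opposes it.
acts = np.zeros((2, 3, 3))
acts[0, 0, 0] = 2.0
acts[1, 2, 2] = 1.0
grads = np.zeros((2, 3, 3))
grads[0] = 1.0
grads[1] = -1.0

heatmap = grad_cam(acts, grads)
print(heatmap)
```

In this toy case only the location activated by the supporting channel stays lit, since the ReLU suppresses the opposing channel's contribution. The resulting map has the spatial resolution of the final convolutional layer, which is why it is coarse; it would normally be upsampled to the input image size before being overlaid on the tongue image.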

Methods

Findings

Discussion

Conclusion