This paper proposes an enhanced YOLOv8 model designed for detecting tongue fissures and teeth marks in Traditional Chinese Medicine (TCM) diagnostic images. The original C2f module is replaced with a C2f_DCNv3 module that incorporates deformable convolutions (DCN), enabling the model to adapt to intricate and irregular features such as fine fissures and teeth marks. In addition, a Squeeze-and-Excitation (SE) attention mechanism reweights the feature channels, allowing the model to focus more accurately on key regions of the image, even against complex backgrounds. The proposed model achieves a mean average precision (mAP) of 92.77%, a substantial improvement over the original YOLOv8. It also reduces computational cost by approximately one-third in terms of FLOPs and greatly decreases the number of parameters while maintaining high accuracy, offering a more robust and resource-efficient solution. For tongue fissure detection, the mAP increases to 91.34%, with notable improvements in F1 score, precision, and recall. Teeth mark detection also improves significantly, reaching an mAP of 94.21%. These results demonstrate the model’s strong performance in TCM tongue image analysis and provide a more accurate, efficient, and reliable tool for clinical diagnostic applications.
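To illustrate the channel reweighting that an SE attention mechanism performs, the following is a minimal, dependency-free sketch in plain Python. It is not the authors' implementation: the function name `se_block`, the weight matrices `w_reduce` and `w_expand`, and the toy input are all hypothetical, and bias terms are omitted for brevity. A real model would use learned weights inside a deep learning framework.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_map, w_reduce, w_expand):
    """Apply a Squeeze-and-Excitation step to a C x H x W feature map.

    feature_map: nested lists with shape C x H x W.
    w_reduce:    (C // r) x C weights of the bottleneck layer (assumed, no bias).
    w_expand:    C x (C // r) weights of the expansion layer (assumed, no bias).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce
    ]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy example: two 2x2 channels and a reduction ratio of 2.
toy_map = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
toy_w_reduce = [[0.5, 0.5]]      # 1 x 2
toy_w_expand = [[1.0], [-1.0]]   # 2 x 1
toy_scaled, toy_gates = se_block(toy_map, toy_w_reduce, toy_w_expand)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasized relative to the second. This is the sense in which SE attention reweights features.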