Abstract

Medical images such as facial and tongue images are widely used in intelligence-assisted diagnosis, which can be framed as a multi-label classification task over the disease location (DL) and disease nature (DN) of biomedical images. Compared with complicated convolutional neural networks and Transformers, recent MLP-like architectures are simpler and less computationally expensive, and they also have stronger generalization capabilities. However, MLP-like models require better input features from the image. This study therefore proposes a novel convolution complex transformation MLP-like (CCT-MLP) model for multi-label DL and DN recognition on facial and tongue images. First, a convolutional tokenizer and multiple convolutional layers extract better shallow features from the input biomedical images, compensating for the spatial information that a simple MLP structure loses. Subsequently, a Channel-MLP architecture with complex transformations extracts deep-level contextual features. In this way, multi-channel features are extracted and mixed to perform multi-label classification of the input biomedical images. Experimental results on our constructed multi-label facial and tongue image datasets demonstrate that our method outperforms existing methods in terms of both accuracy (Acc) and mean average precision (mAP).
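
To make the two reported metrics concrete, the following is a minimal sketch of how Acc and mAP are commonly computed for multi-label predictions. The abstract does not give the exact definitions used in the study, so the per-label thresholded accuracy, the 0.5 threshold, and the function names `average_precision`, `mean_average_precision`, and `label_accuracy` are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """AP for one label: mean of precision@k over the ranks k where a positive occurs."""
    # Rank images from highest to lowest predicted score for this label.
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, target) in enumerate(ranked, start=1):
        if target == 1:
            hits += 1
            precision_sum += hits / rank
    # Assumption: a label with no positive examples contributes AP = 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(scores: List[List[float]], targets: List[List[int]]) -> float:
    """Mean of per-label AP; rows are images, columns are labels (e.g. DL/DN classes)."""
    num_labels = len(targets[0])
    per_label = [
        average_precision([row[j] for row in scores], [row[j] for row in targets])
        for j in range(num_labels)
    ]
    return sum(per_label) / num_labels


def label_accuracy(scores: List[List[float]], targets: List[List[int]],
                   threshold: float = 0.5) -> float:
    """Fraction of (image, label) decisions that are correct after thresholding.

    Assumption: accuracy is counted per label decision, not per exact label set.
    """
    correct = 0
    total = 0
    for score_row, target_row in zip(scores, targets):
        for score, target in zip(score_row, target_row):
            correct += int((score >= threshold) == (target == 1))
            total += 1
    return correct / total


if __name__ == "__main__":
    # Toy data: 3 images, 2 labels; values are made up for illustration only.
    toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
    toy_targets = [[1, 0], [0, 1], [1, 1]]
    print(mean_average_precision(toy_scores, toy_targets))
    print(label_accuracy(toy_scores, toy_targets))
```

Because each label is scored independently, an image can carry several DL and DN labels at once, which is what distinguishes this setting from single-label classification.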
