Image Division Research Articles

Vision Transformer (ViT) is widely used in computer vision. ViT involves four main steps, the “four secrets”: patch division, token selection, position encoding addition, and attention calculation. Existing research on transformers in computer vision focuses mainly on these four steps, so the questions “how to divide patches?”, “how to select tokens?”, “how to add position encoding?”, and “how to calculate attention?” are crucial to improving ViT performance. So far, however, most review literature is organized around applications, and no review comprehensively summarizes these four steps from a technical perspective, which to some degree restricts the further development of ViT. To address this gap, this paper summarizes the four major mechanisms and five applications of ViT. The main contributions are as follows. First, the basic principle and model structure of ViT are elaborated. Second, for “how to divide patches?”, five key techniques of the patch division mechanism are summarized: from single-size to multi-size division, from fixed-number to adaptive-number division, from non-overlapping to overlapping division, from semantic segmentation division to semantic aggregation division, and from original-image division to feature-map division. Third, for “how to select tokens?”, three key techniques of the token selection mechanism are summarized: score-based selection, merge-based selection, and selection based on convolution and pooling. Fourth, for “how to add position encoding?”, five key techniques of the position encoding mechanism are summarized: absolute, relative, conditional, locally-enhanced, and zero-padding position encoding. Fifth, for “how to calculate attention?”, 18 attention mechanisms are summarized in chronological order. Sixth, models that combine the Transformer with U-Net, GAN, YOLO, ResNet, and DenseNet are discussed in the context of medical image processing. Finally, around the four questions posed in this paper, we discuss future development directions for frontier technologies such as the patch division, token selection, position encoding, and attention mechanisms, which play an important role in the further development of ViT.
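
As a rough illustration of the patch division step described in this abstract, the sketch below splits an image into square patches with NumPy. It is not taken from the paper: the function name `divide_into_patches`, the 224 x 224 x 3 input, and the patch size of 16 are assumptions chosen for illustration. Setting the stride equal to the patch size gives non-overlapping division, and a smaller stride gives overlapping division.

```python
import numpy as np


def divide_into_patches(image, patch, stride=None):
    """Split an H x W x C image into square patches of side `patch`.

    stride == patch -> non-overlapping division
    stride <  patch -> overlapping division
    (Hypothetical helper for illustration, not the paper's code.)
    """
    if stride is None:
        stride = patch
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            patches.append(image[top:top + patch, left:left + patch])
    return np.stack(patches)


# Assumed input: a blank 224 x 224 RGB image.
image = np.zeros((224, 224, 3), dtype=np.float32)

non_overlapping = divide_into_patches(image, patch=16)        # 14 x 14 grid
overlapping = divide_into_patches(image, patch=16, stride=8)  # 27 x 27 grid

# Flatten each patch into one vector, the usual input to a linear
# patch-embedding layer.
tokens = non_overlapping.reshape(len(non_overlapping), -1)
```

With these assumed sizes, non-overlapping division yields 196 patches of 768 values each, while a stride of 8 yields 729 overlapping patches.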

Rotator cuff tear (RCT) and biceps tendinosis (BT) are the two most common shoulder disorders worldwide. These disorders can be diagnosed using magnetic resonance imaging (MRI), but expert interpretation is manual, time-consuming, and subject to human error. Therefore, a feature extraction model based on fixed-size patches was created to perform objective and accurate automated binary classification of RCT vs. normal and BT vs. normal on MRI images. We have developed an exemplar deep feature extraction model to diagnose RCT and BT disorders. The model was tested on a new MR image dataset comprising transverse, sagittal, and coronal MRI images of the shoulder, organized into three cases. BT was studied on transverse MRI images (Case 1), while RCT was studied on sagittal (Case 2) and coronal MRI images (Case 3). Our model comprised deep feature generation using a pre-trained VGG19, feature selection using iterative neighborhood component analysis (INCA), and classification using standard shallow classifiers: k-nearest neighbors (KNN), support vector machine (SVM), and artificial neural network (ANN). In the feature extraction phase, two fully connected layers were used to extract deep features from the original image and from sixteen fixed-size patches obtained by dividing it. This model was named Vision VGG19 (ViVGG19), by analogy with vision transformers (ViT). One feature vector is extracted from the whole image and one from each of the 16 fixed-size patches. The resulting seventeen feature vectors per image, obtained from the fc6 and fc7 layers of the pre-trained VGG19, are merged into the final feature vector. INCA was used to choose the top features from the generated features, and the chosen features were classified using the shallow classifiers. We defined three cases to evaluate the proposed ViVGG19 for diagnosing RCT and BT disorders. The proposed ViVGG19 model achieved more than 99% accuracy using the KNN classifier. ViVGG19 is a very effective model for detecting RCT and BT disorders on shoulder MRI images. The developed automated system is ready to be tested on a larger, more diverse database obtained from different medical centers.
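
The exemplar feature generation step in this abstract (one whole image plus sixteen patches, with fc6 and fc7 features merged) can be sketched as follows. This is only a structural illustration under stated assumptions, not the authors' implementation: `placeholder_fc_features` is a hypothetical stand-in for a pre-trained VGG19, the 4 x 4 grid layout of the sixteen patches is assumed, and the INCA selection and classification stages are omitted. The 4096-value size matches the fc6 and fc7 outputs of the standard VGG19 architecture.

```python
import numpy as np

FEATURE_DIM = 4096  # output size of fc6 and fc7 in the standard VGG19


def placeholder_fc_features(region):
    """Hypothetical stand-in for a pre-trained VGG19.

    Returns two fake vectors in place of the fc6 and fc7 activations.
    A real pipeline would resize the region to the network's input size
    and run a forward pass instead.
    """
    flat = region.astype(np.float32).ravel()
    fc6 = np.resize(flat, FEATURE_DIM)        # repeat or truncate to fit
    fc7 = np.resize(flat[::-1], FEATURE_DIM)
    return fc6, fc7


def exemplar_regions(image, grid=4):
    """Return the whole image followed by grid x grid fixed-size patches.

    The 4 x 4 layout is an assumption; the abstract only says sixteen
    fixed-size patches.
    """
    height, width = image.shape[:2]
    ph, pw = height // grid, width // grid
    patches = [
        image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
        for r in range(grid)
        for c in range(grid)
    ]
    return [image] + patches


def exemplar_feature_vector(image):
    """Merge the fc6 and fc7 features of all 17 regions into one vector."""
    parts = []
    for region in exemplar_regions(image):
        fc6, fc7 = placeholder_fc_features(region)
        parts.extend([fc6, fc7])
    return np.concatenate(parts)


# Assumed input: a random 224 x 224 grayscale image standing in for an MRI slice.
image = np.random.default_rng(0).random((224, 224))
regions = exemplar_regions(image)
features = exemplar_feature_vector(image)
```

Under these assumptions each image yields 17 regions and a merged vector of 17 x 2 x 4096 = 139,264 values, which is the kind of high-dimensional input a feature selector such as INCA would then reduce before classification.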

Articles published on Image Division

An improved infrared image post-processing method for metals and composites

Image Division Using Threshold Schemes with Privileges

Leukemia detection using Artificial Neural Networks in Images of Human Blood Sample

Vision transformer: To discover the “four secrets” of image patches

Enhancement of Old Historical Document by Image Processing from Gray scale to RGB Scale Conversion

A Double Clustering Approach for Color Image Segmentation

Linguistic Methods of Image Division for Visual Data Security

Event-based imaging polarimeter simulation with a single DoFP image.

Prediction of Uncertainty Estimation and Confidence Calibration Using Fully Convolutional Neural Network

ViVGG19: Novel exemplar deep feature extraction-based shoulder rotator cuff tear and biceps tendinosis detection using magnetic resonance images.

Flexible patch moving modes for pixel-value-ordering based reversible data hiding methods

SegNet-based left ventricular MRI segmentation for the diagnosis of cardiac hypertrophy and myocardial infarction

Towards Efficient Detection for Small Objects via Attention-Guided Detection Network and Data Augmentation.

Perspectives of medical students on future work-life balance in Japan: A qualitative study using postlecture comments.

Design of an optimized fuzzy system for edge detection in images

Polarized Intensity Ratio Constraint Demosaicing for the Division of a Focal-Plane Polarimetric Image

A Spatial–Spectral Combination Method for Hyperspectral Band Selection

Performance of Machine Learning and Image Processing in Plant Leaf Disease Detection

Joint uneven channel information network with blend metric loss for person re-identification

Design and Implementation of Image Edge Detection Algorithm on FPGA
