Reading is Believing: Revisiting Language Bottleneck Models for Image Classification
We revisit language bottleneck models as an approach to ensuring the explainability of deep learning models for image classification. Because of the inevitable information loss incurred when converting images into language, the accuracy of language bottleneck models has been considered inferior to that of standard black-box models. Recent image captioners based on large-scale vision-and-language foundation models, however, can describe images in verbal detail to a degree previously believed unrealistic. On a disaster image classification task, we experimentally show that a language bottleneck model combining a modern image captioner with a pre-trained language model can exceed the image classification accuracy of black-box models. We also demonstrate that a language bottleneck model and a black-box model can be thought of as extracting different features from images, and that fusing the two creates a synergistic effect, resulting in even higher classification accuracy.
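The two-stage pipeline the abstract describes (caption the image, then classify the caption) can be sketched as follows. This is a toy illustration only: both the captioner and the text classifier are hypothetical stand-ins (the canned captions, keyword rules, and labels are this sketch's assumptions, not the paper's).

```python
def caption_image(image_id: str) -> str:
    # Stand-in for a foundation-model image captioner; returns a canned
    # description so the sketch is runnable without any model weights.
    captions = {
        "img_001": "a street flooded with muddy water after heavy rain",
        "img_002": "a collapsed building with rubble and smoke",
    }
    return captions[image_id]

def classify_caption(caption: str) -> str:
    # Stand-in for a pre-trained language model fine-tuned on captions;
    # here just a trivial keyword rule over hypothetical disaster labels.
    if "flood" in caption or "water" in caption:
        return "flood"
    if "collapsed" in caption or "rubble" in caption:
        return "building_damage"
    return "other"

def language_bottleneck_predict(image_id: str) -> str:
    # The image reaches the classifier only through its textual description,
    # which is what makes the prediction explainable.
    return classify_caption(caption_image(image_id))
```

The key property is that the caption is the model's entire evidence: a human can read it and audit why the label was assigned.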
- Book Chapter
1
- 10.1007/978-981-19-7184-6_25
- Jan 1, 2023
Amid social and economic development and scientific and technological innovation, the image processing modes and classification models available to network technology platforms are becoming increasingly varied. In essence, however, effective image recognition still requires obtaining characteristic information and choosing high-quality network algorithms and processing techniques to complete image processing and classification. Building on current research trends in computer image processing and image classification modeling, this paper discusses in depth image processing methods and image classification model training centered on artificial intelligence, and takes a transfer learning-based image classification model as a practical case study. The final results show that the AI-based image processing method and image classification model have strong performance advantages in practical applications. Keywords: Artificial intelligence; Image processing; Image classification; Transfer learning
- Conference Article
16
- 10.1109/picc51425.2020.9362375
- Dec 17, 2020
Image classification is the task of assigning an input image a label from a fixed set of labels. It is one of the main problems in computer vision and has many practical applications. For any classification problem, the main aim is to achieve better classification accuracy; low accuracy means misclassification, which leads to various downstream problems. Many classification models consider only the existing class instances: when an instance of a new class arrives, the model does not detect it properly and misclassifies it into an existing class. The proposed method therefore provides a more accurate classification and new-class detection model for images; if needed, the new class can then be added to the model so that it is classified correctly in the future. Recent studies show that convolutional neural networks (CNNs) can be used effectively for image classification tasks, so the proposed classification and new-class detection model is built on a CNN. A new class is detected by examining the trend of the softmax prediction scores over the class labels. In this work, the model is built for the CIFAR-10 image dataset. Since this is a complex dataset, a model built for it can be considered a base and extended to classification and new-class detection for images in other applications.
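The softmax-score-based novelty check described above can be sketched with a simple thresholded decision rule. The threshold value and the "flag when the top score is low" rule are assumptions of this sketch; the paper's actual criterion looks at the trend of the scores.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw class scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_novelty(logits, threshold=0.5):
    # Hypothetical decision rule: if even the top softmax score is low,
    # the input resembles no known class, so flag it as a new class
    # instead of forcing a label.
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    if probs[top] < threshold:
        return "new_class"
    return top  # index of the predicted known class
```

A confident prediction (one dominant logit) yields a class index, while a flat score distribution, typical of unseen classes, triggers the new-class path.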
- Research Article
15
- 10.1016/j.dibe.2023.100144
- Mar 17, 2023
- Developments in the Built Environment
Fused deep neural networks for sustainable and computational management of heat-transfer pipeline diagnosis
- Conference Article
2
- 10.1109/aidas47888.2019.8970757
- Sep 1, 2019
Image classification (IC) is among the most prominent artificial intelligence (AI) domains, contributing to the development of society across application areas such as finance, marketing, health, industrial automation, education, and safety and security. Typically, an IC model takes image data as input, tunes itself to the required application task, and classifies accordingly. Among the various categories of images, color images are preferable because they capture more of the details that are essential for classification. However, the modern world demands real-time or online image classification, which involves imagery streams. Imagery streams carry a high likelihood of uncertainty due to non-stationary environments: for example, features or class boundaries that are valid at one time step may be inadequate at another. These uncertainties have deleterious effects on IC models, degrading accuracy or rendering the models unusable. To overcome these issues, IC models need to adapt to the changes caused by uncertainties in imagery streams. This paper focuses on understanding the possible scenarios of such uncertainties in color imagery streams, investigates the deleterious effects of changes in color imagery streams, and proposes possible mitigation approaches. The contribution of this research is a first step toward an adaptive model that mitigates the deleterious effects of uncertainty in color imagery streams; such a model would benefit many application areas and contribute directly to daily life.
- Research Article
2
- 10.1360/n092016-00405
- Sep 1, 2017
- SCIENTIA SINICA Technologica
Most popular image classification methods focus mainly on classification ability rather than on recognizing new things. Humans, however, emphasize cognition first and classification second, a process closely related to the human memory system. Although many memory models have been proposed, they have been studied on word lists, and reports on natural images remain limited. This paper proposes a memory model for image recognition and classification based on a convolutional neural network and Bayesian decision theory. First, image features are extracted by a convolutional neural network and stored in binary form. The representation, storage, and retrieval processes of visual images are then modeled: the test image's feature vector is matched in parallel against the studied image vectors, and likelihood values are calculated. Finally, the odds that the test image belongs to a new class are computed from all the likelihood values. If the odds exceed a certain threshold, the test image is regarded as new; otherwise, the Bayesian decision rule for image classification is applied. Experimental results on the Caltech-101 and Caltech-256 datasets show that the proposed method performs well in image recognition and classification tasks: its hit probability is higher than that of two typical existing methods, SRC and ELM, while its false alarm rate is far lower.
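The retrieval step above, matching a binary test vector against stored memory vectors and thresholding on how unlikely even the best match is, can be sketched as follows. The bit-match likelihood model, the threshold value, and the field layout of the memory are assumptions of this sketch, not the paper's exact formulation.

```python
def hamming_likelihood(x, stored, p_match=0.9):
    # Likelihood of binary test vector x under one stored memory vector,
    # assuming (in this sketch) each bit agrees with probability p_match.
    matches = sum(a == b for a, b in zip(x, stored))
    n = len(x)
    return (p_match ** matches) * ((1 - p_match) ** (n - matches))

def classify_or_new(x, memory, novelty_threshold=1e-3):
    # memory maps class label -> list of stored binary feature vectors.
    best_label, best_like = None, 0.0
    for label, vecs in memory.items():
        for v in vecs:
            like = hamming_likelihood(x, v)
            if like > best_like:
                best_label, best_like = label, like
    # If even the best-matching memory is unlikely to have produced x,
    # regard the test image as belonging to a new class.
    if best_like < novelty_threshold:
        return "new"
    return best_label
```

With a 4-bit toy memory, an exact match has likelihood 0.9^4 ≈ 0.66 and is classified, while a fully mismatched vector has likelihood 0.1^4 = 0.0001 and falls below the novelty threshold.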
- Research Article
3
- 10.1016/j.ijpharm.2025.125690
- Jun 1, 2025
- International journal of pharmaceutics
Deep learning-based image classification and quantification models for tablet sticking.
- Research Article
16
- 10.1109/tdsc.2022.3202544
- Jul 1, 2023
- IEEE Transactions on Dependable and Secure Computing
Mitigating adversarial deep learning attacks remains challenging, partly because of the ease and low cost in carrying out such attacks. Therefore, in this paper, we focus on the understanding of universal adversarial example attack on image classification models. Specifically, we seek to understand the difference(s) between adversarial examples in two adversarial datasets (DAmageNet and PGD dataset) and clean examples in ImageNet learned by the classification model, and whether we can use such findings to resist adversarial example attacks. We also seek to determine if we can retrain a discriminator to discriminate whether the input image is an adversarial example, using adversarial training. We then design a number of experiments (e.g., class activation map (CAM) analysis, feature map analysis, feature maps/filters changing, adversarial training, and binary classification model) to help us determine whether the universal adversarial dataset can be successfully used to attack the classification model. This, in turn, contributes to a better understanding of adversarial defenses over pretrained classification model from an interpretation perspective. To the best of our knowledge, this work is one of the earliest works to systematically investigate the interpretation of universal adversarial example attack on image classification models, both visually and quantitatively.
- Book Chapter
2
- 10.1007/978-981-19-7402-1_40
- Jan 1, 2023
Recently, the number of medical X-ray images being generated has increased rapidly due to advancements in radiological equipment at medical centres. Medical X-ray image classification techniques are needed for effective decision making in the healthcare sector. Since traditional image classification models cannot achieve maximum X-ray image classification performance, deep learning (DL) models have emerged. In this study, an Arithmetic Optimization Algorithm with Deep Learning-based Medical X-ray Image Classification (AOADL-MXIC) model has been developed. The proposed AOADL-MXIC model investigates the available X-ray images to identify diseases. Initially, the AOADL-MXIC model applies a pre-processing step using the Gabor filtering (GF) technique to eliminate noise. Next, the Capsule Network (CapsNet) model is utilized to derive feature vectors from the input X-ray images. Furthermore, the AOA is exploited to optimize the hyperparameters of the CapsNet approach. Finally, the bidirectional gated recurrent unit (BiGRU) model is employed to classify the medical X-ray images. Experimental analysis of the AOADL-MXIC technique on a set of medical images demonstrated promising performance over the other models. Keywords: X-ray images; Arithmetic optimization algorithm; Deep learning; Feature extraction; Hyperparameter tuning
- Conference Article
16
- 10.1063/5.0068797
- Jan 1, 2021
- AIP conference proceedings
We introduce the Plant Disease Detection Platform (PDDP), which allows users to send photos of diseased plant leaves, or textual descriptions of their appearance, to obtain information about the infection affecting the vegetation along with treatment tips. On the deep learning side, the platform's backend includes image classification and text similarity models. The image classification model has two parts: a feature extractor and a classifier. The feature extractor was trained with the triplet loss function using transfer learning, with the network weights initialized from MobileNetV2 pretrained on the ImageNet dataset; the classifier is a simple multilayer perceptron. A test on 100 random plant images from the Internet shows 98% classification accuracy. We applied post-training static quantization to reduce the overall model size and increase inference performance: the final model is 7 MB and runs 5 times faster than the initial model without significant loss of accuracy. The text similarity model is a BERT-based transformer that produces vector representations of input texts for similarity calculation between user requests and the disease descriptions on the PDDP.
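The text-similarity step, comparing an embedded user request against embedded disease descriptions, typically reduces to cosine similarity between vectors, which can be sketched as below. The toy 2-dimensional embeddings and disease names are made up for illustration; a real system would use the BERT encoder's high-dimensional vectors.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_match(query_vec, disease_vecs):
    # disease_vecs maps disease name -> embedding of its description;
    # return the description most similar to the user's request.
    return max(disease_vecs, key=lambda d: cosine(query_vec, disease_vecs[d]))
```

In production the embeddings would be precomputed for all disease descriptions, so each request costs one encoder pass plus a handful of dot products.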
- Research Article
64
- 10.1155/2022/3351256
- Jul 19, 2022
- Advances in Multimedia
Traditional artificial neural networks and machine learning methods not only struggled to meet the processing needs of massive image collections during feature extraction and model training, but also suffered from low efficiency and low classification accuracy when applied to image classification. This paper therefore proposed a deep learning model for image classification, aiming to provide a foundation and support for the classification and recognition of large image datasets. Firstly, based on an analysis of the basic theory of neural networks, the paper described the different types of convolutional neural networks and the basic process of applying them to image classification. Secondly, building on existing convolutional neural network models, noise reduction and parameter adjustment were carried out in the feature extraction process, and a deep learning model for image classification was proposed based on the improved convolutional neural network structure. Finally, the structure of the deep learning model was optimized to improve its classification efficiency and accuracy. To verify the effectiveness of the proposed deep learning model for image classification, experiments compared the relationship between classification accuracy and the number of iterations for several common network models. The results showed that the proposed model outperformed the others in classification accuracy. The classification accuracy of the deep learning model before and after optimization was also compared on the training and test sets; the results showed that image classification accuracy improved considerably after the proposed model was optimized.
- Research Article
32
- 10.1016/j.compeleceng.2022.108176
- Jun 24, 2022
- Computers and Electrical Engineering
Mayfly optimization with deep learning enabled retinal fundus image classification model
- Research Article
1
- 10.1088/1755-1315/792/1/012037
- Jun 1, 2021
- IOP Conference Series: Earth and Environmental Science
Pest management is an essential part of crop production. Accurately identifying pest species at an early stage helps formulate targeted prevention and control measures that reduce pests' impact on grain production. To identify pests in the larval stage as early as possible, this paper compares conventional and fine-grained classification models and constructs a fine-grained image classification model for classifying the larvae of crop pests, improving the ability to identify pests in the larval stage. Experiments show that our optimized fine-grained classification model surpasses general convolutional image classification models on the fine-grained agricultural pest dataset AgrFIP20.
- Research Article
24
- 10.1007/s10664-021-09985-1
- Jun 18, 2021
- Empirical Software Engineering
Deep neural network (DNN) models are widely used for image classification. While they offer high accuracy, researchers are concerned about whether these models inappropriately make inferences using features irrelevant to the target object in a given image. To address this concern, we propose a metamorphic testing approach that assesses whether a given inference is based on irrelevant features. Specifically, we propose two metamorphic relations (MRs) to detect such unreliable inferences. These relations expect (a) classification results with different labels, or the same labels but lower certainty, after corrupting the relevant features of images, and (b) classification results with the same labels after corrupting irrelevant features. Inferences that violate the metamorphic relations are regarded as unreliable. Our evaluation demonstrated that our approach can effectively identify unreliable inferences for single-label classification models, with average precisions of 64.1% and 96.4% for the two MRs, respectively; for multi-label classification models, the corresponding precisions for MR-1 and MR-2 are 78.2% and 86.5%. Further, we conducted an empirical study to understand the problem of unreliable inferences in practice. Specifically, we applied our approach to 18 pre-trained single-label image classification models and 3 multi-label classification models, and examined their inferences on the ImageNet and COCO datasets. We found that unreliable inferences are pervasive: for each model, thousands of correct classifications are actually made using irrelevant features. We then investigated the effect of these pervasive unreliable inferences and found that they can significantly degrade a model's overall accuracy; once the unreliable inferences in the test set are accounted for, the model's accuracy changes significantly.
Therefore, we recommend that developers pay more attention to these unreliable inferences during model evaluation. We also explored the correlation between model accuracy and the number of unreliable inferences, and found that inferences on inputs with smaller objects are more likely to be unreliable. Lastly, we found that current model training methodologies can guide models to learn object-relevant features to a certain extent, but do not necessarily prevent models from making unreliable inferences. We encourage the community to propose more effective training methodologies to address this issue.
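The two metamorphic relations described in the abstract can be sketched as a simple check over a model's predictions on an image and its two corrupted variants. The dictionary field names here are this sketch's own, not the paper's, and the corruption step itself is assumed to happen upstream.

```python
def is_unreliable(pred):
    # pred holds the model's outputs on the original image and on two
    # corrupted variants of it (relevant vs. irrelevant features corrupted).
    # MR-1: corrupting *relevant* features should change the label,
    # or at least lower the model's confidence in the same label.
    mr1_ok = (pred["relevant_corrupt_label"] != pred["label"]
              or pred["relevant_corrupt_conf"] < pred["conf"])
    # MR-2: corrupting *irrelevant* features should leave the label unchanged.
    mr2_ok = pred["irrelevant_corrupt_label"] == pred["label"]
    # An inference violating either relation is flagged as unreliable,
    # i.e. possibly based on features irrelevant to the target object.
    return not (mr1_ok and mr2_ok)
```

A prediction that survives relevant-feature corruption unchanged, at full confidence, is the telltale case: the model likely never used the object itself.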
- Research Article
182
- 10.1016/j.compag.2021.106081
- Mar 13, 2021
- Computers and Electronics in Agriculture
Performance of deep learning models for classifying and detecting common weeds in corn and soybean production systems
- Research Article
5
- 10.4314/dujopas.v9i3b.30
- Nov 1, 2023
- Dutse Journal of Pure and Applied Sciences
Breast cancer is a global health issue that necessitates precise classification for early detection and effective treatment. In recent years, pre-trained models have shown great potential in medical image classification, including breast cancer classification. These models have been trained on extensive datasets and can capture intricate features and patterns within medical images, facilitating accurate classification. However, some of these models are non-generic: they can be sensitive to dataset biases, leading to overfitting on specific patterns present in the training data, and they struggle to handle data from different distributions. In this work, we proposed a generic hybrid model for image classification. Features were extracted from two datasets, the Mammographic Image Analysis Society (MIAS) dataset and the INbreast dataset, using the pre-trained EfficientNetB2 architecture. Three classifiers were then used to classify the extracted features: MGSVM, cubic SVM, and XGBoost. Eight evaluation metrics were selected to assess the performance of the proposed models: accuracy, precision, F1-score, AUC, sensitivity, false negative rate (FNR), Kappa score, and time complexity. Experimental results show that the hybrid of EfficientNetB2 and the MGSVM classifier is the most generic and efficient for breast cancer diagnosis and classification. It performs strongly when classifying mammography breast images from both datasets, achieving an overall accuracy of 99.47%, a sensitivity of 99.31%, precision of 99.44%, F1-score of 99.44%, AUC of 99.44%, a low FNR of 0.007, a Kappa score of 0.98, and a manageable time complexity of 231.44 seconds on the MIAS dataset.
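The hybrid design above, a frozen pre-trained backbone producing features that a separate classical classifier consumes, can be sketched generically. Both stages here are toy stand-ins: the "backbone" is a trivial summary-statistics function and a nearest-centroid rule replaces the SVM/XGBoost classifiers; the labels and centroid values are invented for illustration.

```python
def extract_features(image):
    # Stand-in for a pre-trained CNN backbone such as EfficientNetB2:
    # here just the mean and maximum of the (flattened) pixel values.
    return [sum(image) / len(image), max(image)]

def nearest_centroid(features, centroids):
    # Stand-in for the downstream classifier: assign the label whose
    # centroid is closest to the feature vector in squared distance.
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))

def classify_image(image, centroids):
    return nearest_centroid(extract_features(image), centroids)
```

The point of the split is reusability: the same frozen feature extractor can feed any of several lightweight classifiers, which is what lets the paper compare MGSVM, cubic SVM, and XGBoost on identical features.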