Performance of deep learning models for classifying and detecting common weeds in corn and soybean production systems
- Research Article
4
- 10.30574/wjarr.2024.21.1.0006
- Jan 30, 2024
- World Journal of Advanced Research and Reviews
Training models for image classification is very time-consuming. Training a model has long been a challenge for researchers and practitioners, partly because of the large dataset required, which is complex and sometimes almost impossible to source. This has recently led to the use of pre-trained models for image classification. Pre-trained models have gained popularity because they initialize the model with appropriate weights and significantly reduce the training time and dataset required. There are many image classification pre-trained models in use today, and this paper investigates the performance of ten top models (ConvNeXt, DenseNet, EfficientNet, InceptionResNet, Inception, MobileNet, ResNet, VGG, Xception, NASNet) using the Caltech-101 dataset containing 101 object classes and the Caltech-256 dataset containing 256 object classes. The models are all trained on the ImageNet-1k dataset. TensorFlow and Keras were used as the frameworks for developing the experiments. Accuracy, precision, recall, and F1-score were used as metrics for model performance evaluation. The findings and analysis underscore the significance of training time, number of epochs, and choice of model in image classification.
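The four metrics named in this abstract can be illustrated with a minimal sketch. The abstract does not say how the per-class scores were averaged, so macro averaging is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (unweighted mean over classes) is an assumption;
    the abstract does not state which averaging the authors used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with two hypothetical classes.
scores = macro_metrics(["cat", "cat", "dog", "dog"],
                       ["cat", "dog", "dog", "dog"])
print(scores)
```

On an imbalanced dataset, macro averaging weights every class equally, so it can differ noticeably from plain accuracy.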
- Book Chapter
1
- 10.1007/978-981-19-7184-6_25
- Jan 1, 2023
With ongoing social, economic, and technological development, the image processing modes and classification models chosen by network technology platforms are becoming increasingly varied. In essence, however, effective image recognition still requires obtaining characteristic information and choosing high-quality network algorithms and processing technology to complete image processing and image classification. Therefore, based on an understanding of current research trends in computer image processing and image classification model methods, this paper discusses in depth image processing methods and image classification model training design with artificial intelligence at the core, and takes a transfer learning image classification model as an example for practical exploration. The final results show that the image processing method and image classification model based on artificial intelligence have strong performance advantages in practical application. Keywords: Artificial intelligence, Image processing, Image classification, The migration study
- Research Article
5
- 10.3390/electronics12132956
- Jul 5, 2023
- Electronics
This paper presents a supervised learning scheme that employs key-frame extraction to enhance the performance of pre-trained deep learning models for object detection in surveillance videos. Developing supervised deep learning models requires a significant amount of annotated video frames as training data, which demands substantial human effort for preparation. Key frames, which encompass frames containing false negative or false positive objects, can introduce diversity into the training data and contribute to model improvements. Our proposed approach focuses on detecting false negatives by leveraging the motion information within video frames that contain the detected object region. Key-frame extraction significantly reduces the human effort involved in video frame extraction. We employ interactive labeling to annotate false negative video frames with accurate bounding boxes and labels. These annotated frames are then integrated with the existing training data to create a comprehensive training dataset for subsequent training cycles. Repeating the training cycles gradually improves the object detection performance of deep learning models to monitor a new environment. Experiment results demonstrate that the proposed learning approach improves the performance of the object detection model in a new operating environment, increasing the mean average precision (mAP@0.5) from 54% to 98%. Manual annotation of key frames is reduced by 81% through the proposed key-frame extraction method.
- Research Article
4
- 10.1166/jmihi.2022.3936
- Feb 1, 2022
- Journal of Medical Imaging and Health Informatics
The integration of various algorithms in the medical field to diagnose brain disorders is significant. Generally, Computed Tomography and Magnetic Resonance Imaging techniques have been used to diagnose brain images. However, segmentation and classification of brain disease remain an exigent task in medical image processing. This paper presents an extended model for brain image classification based on a modified pre-trained convolutional neural network model with extensive data augmentation. The proposed system has been efficiently trained using substantial data augmentation in the pre-processing stage. In the first phase, the pre-trained models AlexNet, VGGNet-19, and ResNet-50 are employed to classify the brain disease. In the second phase, the existing pre-trained model is integrated with a multiclass linear support vector machine: the SoftMax layer of the pre-trained models is replaced with a multiclass linear support vector machine classifier. The proposed modified pre-trained models are employed to classify brain images as normal, inflammatory, degenerative, neoplastic, and cerebrovascular diseases. The training loss, mean square error, and classification accuracy have been improved through the concept of a cyclic learning rate. The appropriateness of transfer learning has been demonstrated by applying three convolutional neural network models, namely AlexNet, VGGNet-19, and ResNet-50. It has been observed that the modified pre-trained models achieved a higher classification accuracy of 93.45%, compared with 89.65% for a fine-tuned pre-trained model. The best classification accuracies of 92.11%, 92.83%, and 93.45% have been attained with the proposed modified pre-trained models. A comparison of the proposed model with other pre-trained models is also presented.
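The abstract mentions a cyclic learning rate without giving its form. As a rough sketch, the commonly used triangular policy is shown below; that the authors used this particular policy is an assumption, and the function name `triangular_clr` and all parameter values are hypothetical.

```python
import math


def triangular_clr(step, base_lr, max_lr, step_size):
    """Triangular cyclic learning rate (assumed policy, not confirmed by the abstract).

    The rate rises linearly from base_lr to max_lr over step_size steps,
    then falls back to base_lr over the next step_size steps, and repeats.
    """
    cycle = math.floor(1 + step / (2 * step_size))    # 1-based cycle index
    x = abs(step / step_size - 2 * cycle + 1)         # distance from the peak, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical settings: one full cycle spans 8 steps.
for step in range(9):
    print(step, triangular_clr(step, base_lr=0.001, max_lr=0.006, step_size=4))
```

The schedule starts at `base_lr`, peaks at `max_lr` after `step_size` steps, and returns to `base_lr` at the end of each cycle.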
- Research Article
1
- 10.1109/tpami.2025.3622234
- Jan 1, 2025
- IEEE transactions on pattern analysis and machine intelligence
Deep neural networks (DNNs) have proven to be successful in various computer vision applications such that models even infer in safety-critical situations. Therefore, vision models have to behave in a robust way to disturbances such as noise or blur. While seminal benchmarks exist to evaluate model robustness to diverse corruptions, blur is often approximated in an overly simplistic way to model defocus, while ignoring the different blur kernel shapes that result from optical systems. To study model robustness against realistic optical blur effects, this paper proposes two datasets of blur corruptions, which we denote OpticsBench and LensCorruptions. OpticsBench examines primary aberrations such as coma, defocus, and astigmatism, i.e. aberrations that can be represented by varying a single parameter of Zernike polynomials. To go beyond the principled but synthetic setting of primary aberrations, LensCorruptions samples linear combinations in the vector space spanned by Zernike polynomials, corresponding to 100 real lenses. Evaluations for image classification and object detection on ImageNet and MSCOCO show that for a variety of different pre-trained models, the performance on OpticsBench and LensCorruptions varies significantly, indicating the need to consider realistic image corruptions to evaluate a model's robustness against blur.
- Research Article
16
- 10.1007/s00170-023-10973-6
- Feb 18, 2023
- The International Journal of Advanced Manufacturing Technology
Especially in manufacturing systems with small batches or customized products, as well as in remanufacturing and recycling facilities, there is a wide variety of part types that may be previously unseen. It is crucial to accurately identify these parts based on their type for traceability or sorting purposes. One approach that has shown promising results for this task is deep learning–based image classification, which can classify a part based on its visual appearance in camera images. However, this approach relies on large labeled datasets of real-world images, which can be challenging to obtain, especially for parts manufactured for the first time or whose appearance is unknown. To overcome this challenge, we propose generating highly realistic synthetic images based on photo-realistically rendered computer-aided design (CAD) data. Using this commonly available source, we aim to reduce the manual effort required for data generation and preparation and improve the classification performance of deep learning models using transfer learning. In this approach, we demonstrate the creation of a parametric rendering pipeline and show how it can be used to train models for a 30-class classification problem with typical engineering parts in an industrial use case. We also demonstrate how our method’s entropy gain improves the classification performance in various deep image classification models.
- Research Article
- 10.32628/cseit228140
- Jan 1, 2022
- International Journal of Scientific Research in Computer Science, Engineering and Information Technology
Machine learning is a vast field that finds application in almost every domain. Image classification is one of the important applications of supervised machine learning algorithms. Image classification is concerned with identifying the objects in images. The complexity of this task depends on the image features and the type of images. In this research work, hyperspectral images are considered for deep learning based image classification. Object detection in hyperspectral images has applications in various areas, including defense, precision agriculture, atmospheric analysis, environmental analysis, anomaly detection, fraud detection, etc. The work first presents a broad survey of image classification methods using machine learning (ML) and deep learning (DL). It then presents object detection methods in ML and DL. The later part presents an in-depth review of research articles on hyperspectral image classification using machine learning and deep learning algorithms. Many challenges remain in solving object detection problems in hyperspectral images. The final section of this work surveys object detection based on hyperspectral images in detail, highlighting the major developments.
- Research Article
18
- 10.3390/cancers15164144
- Aug 17, 2023
- Cancers
Simple Summary: This research study investigates the impact of stain normalization on deep learning models for cancer image classification by evaluating model performance, complexity, and trade-offs. The primary objective is to assess the improvement in accuracy, performance, and resource optimization of deep learning models through the standardization of visual appearance in histopathology images using stain normalization techniques, alongside batch size and image size optimization. The findings provide valuable insights for selecting appropriate deep learning models in achieving precise cancer classification, considering the effects of H&E stain normalization and computational resource availability. This study contributes to the existing knowledge on the performance, complexity, and trade-offs associated with applying deep learning models to cancer image classification tasks.
Abstract: Accurate classification of cancer images plays a crucial role in diagnosis and treatment planning. Deep learning (DL) models have shown promise in achieving high accuracy, but their performance can be influenced by variations in Hematoxylin and Eosin (H&E) staining techniques. In this study, we investigate the impact of H&E stain normalization on the performance of DL models in cancer image classification. We evaluate the performance of VGG19, VGG16, ResNet50, MobileNet, Xception, and InceptionV3 on a dataset of H&E-stained cancer images. Our findings reveal that while VGG16 exhibits strong performance, VGG19 and ResNet50 demonstrate limitations in this context. Notably, stain normalization techniques significantly improve the performance of less complex models such as MobileNet and Xception. These models emerge as competitive alternatives with lower computational complexity and resource requirements and high computational efficiency. The results highlight the importance of optimizing less complex models through stain normalization to achieve accurate and reliable cancer image classification. This research holds tremendous potential for advancing the development of computationally efficient cancer classification systems, ultimately benefiting cancer diagnosis and treatment.
- Conference Article
12
- 10.1109/raeeucci57140.2023.10134069
- Apr 19, 2023
In the realm of computer vision, image classification is a critical issue with many applications, including multimedia content analysis, security and surveillance, and medical imaging. The accuracy of image classification algorithms has considerably increased with the development of deep learning. The discipline of image classification has dramatically benefited from the usage of pre-trained Convolutional Neural Network (CNN) models. In this study, we perform image classification on the Wang dataset composed of 1000 images separated into ten categories, using six different pre-trained CNN models. The research made use of pre-trained versions of VGG16, Densenet, Mobilenet, Inception V3, Resnet50, and Xception models. We assessed the model performances in terms of training and testing accuracy. Testing accuracies for ten unique batches of data were calculated and averaged out. Densenet outperformed state-of-the-art models like Xception by a small margin, mainly due to low data availability. The findings of this study show how pre-trained models have advanced the field of image classification and shed light on how well these models perform in diverse image classification tasks. This study can serve as a valuable reference for researchers and practitioners in the field of computer vision and deep learning and help inform their choices when selecting pre-trained models for image classification, depending on their needs, while also considering data availability and computational constraints.
- Research Article
- 10.13031/ja.14895
- Jan 1, 2022
- Journal of the ASABE
Highlights: An approach using deep learning was proposed for identifying diseased regions in UAS imagery of corn fields with 97.23% testing accuracy using the VGG16 model. Disease types were identified within the diseased regions with a testing accuracy of 98.85% using the VGG16 model. On the diseased leaves, severity was estimated with a testing accuracy of 94.20% using the VGG16 model. Deep Learning models have the potential to bring efficiency and accuracy to field scouting. Abstract: Accurately locating diseased regions, identifying disease types, and estimating disease severity in corn fields are all connected steps for developing an effective disease management system. Traditional disease management that relied on a manual scouting approach was inefficient. Therefore, the research community is working on developing advanced disease management systems using deep learning. However, most of the past studies used public datasets consisting of images with uniform backgrounds acquired under lab conditions to train deep learning models, thus limiting their use under field conditions. In addition, limited studies have been conducted for in-field corn disease analysis using Unmanned Aerial System (UAS) imagery. Therefore, UAS and handheld imagery sensors were used in this study to acquire corn disease images from fields located at Purdue University’s Agronomy Center for Research and Education (ACRE) in the summer of 2020. A total of 55 UAS flights were conducted over three different corn fields from June 20 through September 29, resulting in a collection of approximately 59,000 images. A novel three-stage approach was proposed by independently training a total of nine image classification models using three neural network architectures, namely: VGG16, ResNet50, and InceptionV3, for locating diseased regions, identifying disease types, and estimating disease severity under field conditions. 
Diseased regions were first identified accurately in UAS-acquired corn field imagery by a sliding window and deep learning-based image classification, with testing accuracies of up to 97.23%. Diseased region identification was followed by accurately identifying three common corn diseases, namely Northern Leaf Blight (NLB), Gray Leaf Spot (GLS), and Northern Leaf Spot (NLS), within the diseased regions with testing accuracies of up to 98.85%. Finally, the severity of the NLS disease on leaves was estimated with a testing accuracy of up to 94.20%. The VGG16 model achieved the highest testing accuracies for identifying diseased regions in corn fields, identifying corn disease types, and estimating NLS's severity. This study presents promising results for three main elements of a disease management system and could advance traditional scouting by integrating deep learning with UAS imagery. Keywords: Corn Diseases, Datasets, Deep Learning, Disease Identification, Disease Region Location, Image Classification, Severity Estimation, UAS Imagery.
- Research Article
5
- 10.4314/dujopas.v9i3b.30
- Nov 1, 2023
- Dutse Journal of Pure and Applied Sciences
Breast cancer is a global health issue that necessitates precise classification for early detection and effective treatment. In recent years, pre-trained models have shown great potential in the field of medical image classification, including breast cancer classification. These models have been trained on extensive datasets, and they possess the ability to capture intricate features and patterns within medical images, facilitating accurate classification. However, some of the models are non-generic. They can be sensitive to dataset biases, leading to overfitting on specific patterns present in the training data, and they also struggle to handle data from different distributions. In this work, we propose a generic hybrid model for image classification. The features were extracted from two datasets, the Mammographic Image Analysis Society (MIAS) dataset and the INbreast dataset, through the pre-trained EfficientNetB2 architecture. Three classifiers were then used to classify the extracted features: MGSVM, Cubic SVM, and XGBoost. Eight evaluation metrics were selected to assess the performance of the proposed models: accuracy, precision, F1-score, AUC, sensitivity, false negative rate (FNR), Kappa score, and time complexity. Experimental results show that the hybrid of EfficientNetB2 and the MGSVM classifier is more generic and efficient for breast cancer diagnosis and classification. It exhibits strong performance when classifying mammography breast images from both datasets, achieving an overall accuracy of 99.47%, a sensitivity of 99.31%, precision of 99.44%, F1-score of 99.44%, AUC of 99.44%, a low FNR of 0.007, a Kappa score of 0.98, and a manageable time complexity of 231.44 seconds on the MIAS dataset.
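Two of the reported figures are linked by a standard identity: the false negative rate is the complement of sensitivity, FNR = FN / (FN + TP) = 1 - sensitivity. The short sketch below checks that the reported sensitivity of 99.31% is consistent with the reported FNR of 0.007; the function names and the count-based example are hypothetical.

```python
def false_negative_rate(sensitivity):
    """FNR as the complement of sensitivity (both given as fractions in [0, 1])."""
    return 1.0 - sensitivity


def fnr_from_counts(true_positives, false_negatives):
    """FNR computed directly from confusion-matrix counts."""
    return false_negatives / (false_negatives + true_positives)


# Reported sensitivity of 99.31% implies an FNR of about 0.0069,
# which rounds to the reported 0.007.
print(round(false_negative_rate(0.9931), 3))

# Hypothetical counts, for illustration only (not taken from the paper).
print(fnr_from_counts(true_positives=99, false_negatives=1))
```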
- Conference Article
2
- 10.1109/aidas47888.2019.8970757
- Sep 1, 2019
Image Classification (IC) is among the most prominent Artificial Intelligence (AI) domains. IC contributes to the development of society in a variety of application areas such as finance, marketing, health, industrial automation, education, and safety and security. Typically, an IC model takes image input data, tunes itself to the required application task, and classifies accordingly. Among the various categories of images, color images are preferable because they capture more detail, which is essential for classification. However, the modern world demands real-time or online image classification, which involves Imagery Streams. Imagery Streams are highly likely to be uncertain because of the non-stationary environment; for example, certain features or class boundaries that are valid at one time step are not adequate at another. These uncertainties have deleterious effects on IC models, degrading their accuracy or rendering them unusable. Therefore, to overcome these issues, IC models need to adapt to changes caused by uncertainties in Imagery Streams. This paper focuses on understanding the possible scenarios of such uncertainties in Color Imagery Streams, investigates the deleterious effects of changes in Color Imagery Streams, and provides a possible mitigation approach to overcome the issues in IC models. The contribution of this research is a first step towards developing an adaptive model to mitigate the deleterious effects of uncertainty in Color Imagery Streams. This model will benefit many application areas and will directly contribute to the daily life of a society.
- Research Article
1
- 10.1007/s10278-025-01506-6
- Apr 25, 2025
- Journal of imaging informatics in medicine
The main aim of this study is to introduce a new hybrid deep learning model for biomedical image classification. We propose a novel convolutional neural network (CNN), named HybridNeXt, for detecting pulmonary embolism (PE) from computed tomography (CT) images. To evaluate the HybridNeXt model, we created a new dataset consisting of two classes: (1) PE and (2) control. The HybridNeXt architecture combines different advanced CNN blocks, including MobileNet, ResNet, ConvNeXt, and Swin Transformer. We specifically designed this model to combine the strengths of these well-known CNNs. The architecture also includes stem, downsampling, and output stages. By adjusting the parameters, we developed a lightweight version of HybridNeXt, suitable for clinical use. To further improve the classification performance and demonstrate transfer learning capability, we proposed a deep feature engineering (DFE) method using a multilevel discrete wavelet transform (MDWT). This DFE model has three main phases: (i) feature extraction from raw images and wavelet bands, (ii) feature selection using iterative neighborhood component analysis (INCA), and (iii) classification using a k-nearest neighbors (kNN) classifier. We first trained HybridNeXt on the training images, creating a pretrained HybridNeXt model. Then, using this pretrained model, we extracted features and applied the proposed DFE method for classification. The HybridNeXt model achieved a test accuracy of 90.14%, while our DFE model improved accuracy to 96.35%. Overall, the results confirm that our HybridNeXt architecture is highly accurate and effective for biomedical image classification. The presented HybridNeXt and HybridNeXt-based DFE methods can potentially be applied to other image classification tasks.
- Research Article
- 10.1109/tpami.2026.3654115
- May 1, 2026
- IEEE transactions on pattern analysis and machine intelligence
For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners, and these requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed from a pre-trained model while maintaining the rest. We define this problem as continual forgetting and identify three key challenges. (i) For unwanted knowledge, efficient and effective deleting is crucial. (ii) For remaining knowledge, the impact brought by the forgetting procedure should be minimal. (iii) In real-world scenarios, the training samples may be scarce or partially missing during the process of forgetting. To address them, we first propose Group Sparse LoRA (GS-LoRA). Specifically, towards (i), we introduce Low-Rank Adaptation (LoRA) modules to fine-tune the Feed-Forward Network (FFN) layers in Transformer blocks for each forgetting task independently, and towards (ii), a simple group sparse regularization is adopted, enabling automatic selection of specific LoRA groups and zeroing out the others. To further extend GS-LoRA to more practical scenarios, we incorporate prototype information as additional supervision and introduce a more practical approach, GS-LoRA++. For each forgotten class, we move the logits away from its original prototype. For the remaining classes, we pull the logits closer to their respective prototypes. We conduct extensive experiments on face recognition, object detection and image classification and demonstrate that our method manages to forget specific classes with minimal impact on other classes.
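The "group sparse regularization" mentioned in this abstract can be illustrated with a group-lasso style penalty, which sums the L2 norms of parameter groups so that whole groups are driven to zero together. The abstract does not give the exact formula or optimizer, so both the penalty form and the proximal (group soft-threshold) step below are assumptions, and every name here is hypothetical. Each group stands in for the flattened weights of one LoRA module.

```python
import math


def group_norm(group):
    """L2 norm of one flattened parameter group."""
    return math.sqrt(sum(w * w for w in group))


def group_sparse_penalty(groups):
    """Group-lasso style penalty: sum of L2 norms over groups (assumed form)."""
    return sum(group_norm(g) for g in groups)


def group_soft_threshold(group, threshold):
    """Proximal step for the penalty above.

    Shrinks the whole group toward zero and sets it exactly to zero when
    its norm does not exceed the threshold. Illustrative only; the paper's
    actual optimization procedure is not described in the abstract.
    """
    norm = group_norm(group)
    if norm <= threshold:
        return [0.0 for _ in group]
    scale = 1.0 - threshold / norm
    return [w * scale for w in group]


# Two hypothetical LoRA groups: one with large weights, one with small weights.
groups = [[3.0, 4.0], [0.3, 0.4]]
print(group_sparse_penalty(groups))
print([group_soft_threshold(g, 1.0) for g in groups])
```

In this toy run the small group is zeroed out entirely while the large group is only shrunk, which is the selection behavior the abstract attributes to the regularizer.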
- Research Article
21
- 10.1109/access.2022.3225107
- Jan 1, 2022
- IEEE Access
Data scarcity is a common and challenging issue when working with Artificial Intelligence solutions, especially those including Deep Learning (DL) models for tasks such as image classification. This is particularly relevant in healthcare scenarios, in which data collection requires a long-lasting process, involving specific control protocols. The performance of DL models is usually quantified by different classification metrics, which may provide biased results, due to the lack of sufficient data. In this paper, an innovative approach is proposed to evaluate the performance of DL models when labeled data is scarce. This approach, which aims to detect the poor performance provided by DL models, in spite of traditional assessing metrics indicating otherwise, is based on information theoretic concepts and motivated by the Information Bottleneck framework. This methodology has been evaluated by implementing several experimental configurations to classify samples from a plantar thermogram dataset, focused on early stage detection of diabetic foot ulcers, as a case study. The proposed network architectures exhibited high results in terms of classification metrics. However, as our approach shows, only two of those models are indeed consistent to generalize the data properly. In conclusion, a new methodology was introduced and tested to identify promising DL models for image classification over small datasets without relying exclusively on the widely employed classification metrics.