Articles published on Models For Image Classification
1531 Search results
- New
- Research Article
- 10.1061/jmcee7.mteng-20711
- May 1, 2026
- Journal of Materials in Civil Engineering
- Yue Liu + 7 more
The primary load-bearing structures of suspension bridges are the main cables, which are constructed from high-tensile-strength steel wires. Throughout the service life of a suspension bridge, the main cables endure not only cyclic loading from various sources but also severe environmental conditions. These long-term conditions may cause significant deterioration of material characteristics and potentially cable failure, compromising both the longevity and the safety of the bridge. Analyzing the corrosion patterns of the main cables' high-tensile-strength steel wires and evaluating their corrosion intensity are therefore critically important for civil engineers. This paper uses a copper-accelerated salt spray test to rapidly generate steel wire samples at four distinct corrosion stages. The semantic segmentation model Deeplabv3+ is used to locate the corroded regions. Three image classification models, ResNet50, ShuffleNet, and DFL (Discriminative Filter Bank Learning), are then used to classify the corrosion stage of each sample, and the results are analyzed as a reference for engineering applications.
- New
- Research Article
- 10.55041/ijsrem61314
- Apr 27, 2026
- INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
- Rahul J Teradal + 1 more
Abstract: Adversarial attacks expose severe weaknesses in deep learning image classifiers, which can be made to misclassify by a small, precisely designed input perturbation. This work experimentally tests adversarial robustness and the extent to which adversarial vulnerability is model-dependent or systemic. There are two controlled experiments. First, the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and Carlini-Wagner (CW) attacks are applied to a pretrained ResNet-18 model to examine the impact of attack sophistication. Second, FGSM is applied to eight pretrained image classification models to compare cross-model robustness. The findings indicate that optimization-based attacks are much more effective than single-step attacks, that more complex architectures are more robust, and that lightweight models are more sensitive. These results suggest that the degree of adversarial vulnerability depends on the architecture, while the vulnerability itself exists irrespective of the model. The paper also discusses defense and recovery processes and points out how the principles of Responsible AI can be used to construct high-quality and trustworthy adversarial defenses. Index Terms: Adversarial Attacks; Deep Learning; Image Classification; Model Robustness; Responsible AI
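To make the single-step attack named in this abstract concrete, the following is a minimal sketch of FGSM on a toy logistic-regression classifier rather than on ResNet-18. The weights, input, and epsilon are hypothetical values chosen for illustration; none of them come from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression classifier."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(weights, bias, x, label, epsilon):
    """One-step FGSM: x_adv = x + epsilon * sign(dLoss/dx).

    For binary cross-entropy on a logistic model, dLoss/dx = (p - label) * w.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (illustration only).
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.3)

print(predict(weights, bias, x))      # confidence in class 1 on the clean input
print(predict(weights, bias, x_adv))  # confidence after the perturbation
```

Each coordinate moves by at most epsilon, yet the predicted class flips for this toy input. PGD repeats a step of this kind several times with projection back into the epsilon-ball, which is one reason iterative attacks tend to be stronger.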
- New
- Research Article
- 10.3390/jimaging12040168
- Apr 14, 2026
- Journal of imaging
- Tomiris M Zhaksylyk + 5 more
Breast cancer histopathology classification remains a fundamental challenge in computational pathology due to variations in tissue morphology across magnification levels. Convolutional neural networks (CNNs) have long been the standard for image-based diagnosis, yet recent advances in vision-language models (VLMs) suggest they may provide strong and transferable representations for complex medical images. In this study, we present a systematic comparison between CNN baselines and large VLMs (Qwen2 and SmolVLM) fine-tuned with Low-Rank Adaptation (LoRA; r = 16, α = 32, dropout = 0.05) on the BreakHis dataset. Models were evaluated at 40×, 100×, 200×, and 400× magnifications using accuracy, precision, recall, F1-score, and area under the ROC curve (AUC). While Qwen2 achieved moderate performance across magnifications (e.g., 0.8736 accuracy and 0.9552 AUC at 200×), SmolVLM consistently outperformed Qwen2 and substantially reduced the gap with CNN baselines, reaching up to 0.9453 accuracy and 0.9572 F1-score at 200×, approaching the performance of AlexNet (0.9543 accuracy) at the same magnification. CNN baselines, particularly ResNet34, remained the strongest models overall, achieving the highest performance across all magnifications (e.g., 0.9879 accuracy and 0.9984 AUC at 40×). These findings demonstrate that LoRA fine-tuned VLMs, despite requiring gradient accumulation and memory-efficient optimizers and operating with a significantly smaller number of trainable parameters, can achieve competitive performance relative to traditional CNNs. However, CNN-based architectures still provide the highest accuracy and robustness for histopathology classification. Our results highlight the potential of VLMs as parameter-efficient alternatives for digital pathology tasks, particularly in resource-constrained settings.
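The parameter savings that LoRA offers can be illustrated with a short calculation. This sketch uses the rank and alpha quoted in the abstract (r = 16, α = 32); the layer width of 1024 is a hypothetical value, since the abstract does not state the dimensions of the adapted layers.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def lora_scale(alpha, rank):
    """Scaling applied to the low-rank update: W' = W + (alpha / rank) * B @ A."""
    return alpha / rank

rank, alpha = 16, 32        # values quoted in the abstract
d_in = d_out = 1024         # hypothetical layer width, not from the paper

full = d_in * d_out                                # weights in the frozen dense layer
lora = lora_trainable_params(d_in, d_out, rank)    # weights actually trained

print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
print(f"update scale alpha/r: {lora_scale(alpha, rank)}")
```

Under these assumptions only a few percent of the layer's weights are trained, which is the sense in which the abstract calls the fine-tuned VLMs parameter-efficient.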
- Research Article
- 10.1016/j.surg.2025.110079
- Apr 1, 2026
- Surgery
- Armin Alipour + 6 more
Integration of spatiotemporal features into machine learning assessment of open surgical skills.
- Research Article
- 10.1016/j.neucom.2026.133661
- Apr 1, 2026
- Neurocomputing
- Garas Gendy + 1 more
GMambaHSI: Group-based visual state space model for hyperspectral image classification
- Research Article
- 10.17116/rosrino20263401119
- Mar 27, 2026
- Russian Rhinology
- E.O Bryanskaya + 6 more
Diagnosis of pathological changes in the maxillary sinus (MS) is currently an urgent task. Digital diaphanoscopy, which visualizes sinus tissues using probing optical radiation in the red and near-infrared ranges, seems promising for the early diagnosis of this pathology. However, there is a need to improve the accuracy of this method, reduce the examination time, and simplify the classification of the recorded images (diaphanograms) by creating a medical decision-making support system (MDMSS). Objective. To develop an MDMSS for classification of digital diaphanoscopy diaphanograms on the basis of a convolutional neural network (CNN). Patients and methods. The study involved 80 healthy volunteers and 76 patients with MS pathology. Diaphanograms were recorded using a digital diaphanoscopy device at two probing wavelengths (650 and 850 nm). Analysis of diaphanograms (160 diaphanograms of conditionally healthy volunteers, 78 diaphanograms of patients with sinusitis, and 32 diaphanograms of patients with MS cyst) was carried out using a developed image classification model based on the ResNet-50 CNN. Results. High accuracy values (sensitivity of 0.95 and specificity of 0.88), which exceeded all previously proposed developments based on linear discriminant analysis, were obtained. The problem of differentiating MS pathology into sinusitis and cystic fluid classes was solved by means of the developed MDMSS. Conclusion. The developed classification model can be applied in digital diaphanoscopy for early detection of MS pathological changes in telemedicine and automated ENT consultations using the MDMSS. Analysis of the results showed the need to expand the database for further training of the classification model.
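Sensitivity and specificity, the two figures reported in this abstract, follow directly from confusion-matrix counts. The sketch below shows the definitions; the counts are hypothetical and were chosen only to give round ratios, so they should not be read as the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly pathological cases that the model flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of truly healthy cases that the model clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for illustration only (not taken from the study).
tp, fn = 19, 1
tn, fp = 22, 3

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```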
- Research Article
- 10.1038/s41598-026-41153-7
- Mar 21, 2026
- Scientific reports
- N M Saravana Kumar + 3 more
Biomedical image analysis is crucial in contemporary healthcare because it supports correct diagnosis of diseases, treatment planning, and clinical decision-making. As imaging methods such as MRI, histopathology, and chest X-ray have become widespread, the need has grown for automated systems that can work effectively with extensive image data of varying complexity. The main difficulty, however, is to strike the right balance between the complexity of biomedical images and computational efficiency, because these images are usually large-scale and high-resolution and exhibit large tissue variation, noise, and inter- and intra-class differences. Conventional deep learning models such as CNNs and RNNs cannot handle these issues while also achieving high diagnostic accuracy and computational efficiency, particularly in real-time clinical scenarios. This paper introduces ImTranNet-TriCore, a new deep learning model developed for the classification of biomedical images. The proposed model combines three main innovations: a Learnable Multi-Scale Adaptive Filtering module (LM-AdaFilter), a Dual-Path Attentive Residual SRU (DP-AtRes-SRU), and a Multi-Head Hybrid Transformer (MHHT). Together, these components address noise reduction, spatial-temporal feature learning, and global-contextual reasoning in biomedical images. LM-AdaFilter dynamically adjusts its filtering parameters to retain diagnostically important features, whereas DP-AtRes-SRU captures both spatial and temporal relationships. The MHHT reconciles local features and global context to improve feature fusion and boost classification accuracy. The main goal of the paper is to propose a computationally efficient and interpretable biomedical image classification model.
ImTranNet-TriCore was tested on conventional biomedical datasets and achieved 95.92% accuracy. Precision (97.83% on Brain MRI, 97.67% on Chest X-ray), recall (93.75% on Brain MRI, 87.50% on Chest X-ray), and F1-score indicated strong performance in distinguishing between positive and negative cases and in reducing the number of false positives. The findings indicate that ImTranNet-TriCore outperforms conventional models such as CNNs, RNNs, and standalone transformers on complex and noisy biomedical data and is therefore applicable in real-life clinical contexts.
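The abstract quotes precision and recall per dataset but not the resulting F1 values. As a sketch of how the three relate, the F1-score is the harmonic mean of precision and recall; the inputs below are the figures quoted above, and the computed F1 values are derived here for illustration, not reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as quoted in the abstract.
brain_mri_f1 = f1_score(0.9783, 0.9375)
chest_xray_f1 = f1_score(0.9767, 0.8750)

print(f"Brain MRI F1:   {brain_mri_f1:.4f}")
print(f"Chest X-ray F1: {chest_xray_f1:.4f}")
```

Because the harmonic mean is pulled toward the smaller input, the lower recall on Chest X-ray reduces its F1 more than the nearly identical precision values would suggest.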
- Research Article
- 10.1007/s00414-026-03763-8
- Mar 20, 2026
- International journal of legal medicine
- Kubra Yildirim + 9 more
In deaths due to injury, photographs of changes on deceased bodies are routinely taken during the forensic examination; the task of differentiating the types of fatal injury can be posed as an image classification problem. We aimed to develop a machine learning model for automated classification of the cause of injury-induced deaths based on postmortem images of external body regions. We collected a dataset comprising 4254 autopsy images of various body parts divided into six classes according to the cause of death: (i) crush (1808), (ii) choking (327), (iii) stabbing (977), (iv) gunshot (765), (v) burns (254), and (vi) drowning (127). Our model, DRDarkNet, comprised four phases: feature extraction; feature selection; classification; and information fusion. DenseNet201, ResNet50, and DarkNet53 pre-trained on the ImageNet-1K dataset were deployed to generate six feature vectors of different lengths using the fully connected and global average pooling layers of the individual networks. Neighborhood component analysis (NCA), Chi2, and ReliefF functions were used to create 18 (= 6 × 3) selected feature vectors of identical length (512) with reduced dimensionality that contained the most discriminative features. These selected feature vectors were then fed to a support vector machine classifier to generate 18 classifier-wise outputs. Novel pruning-based iterative majority voting (PIMV) was used to aggregate the classifier-wise outputs, from which voted outputs were generated. From both classifier-wise and voted outputs, the most accurate output was automatically chosen, rendering the model self-organized. DRDarkNet outputs both classifier-wise results and voted results, attaining an excellent 96.47% overall multiclass classification accuracy.
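The aggregation step described above combines 18 classifier-wise outputs by voting. The abstract does not specify the pruning and iteration rules of PIMV, so the sketch below shows only plain majority voting over a handful of classifiers; the classifier outputs are hypothetical, and only the class names are taken from the abstract.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per sample.

    predictions: list of equal-length label lists, one list per classifier.
    Ties go to the label that appears first among the classifiers.
    """
    voted = []
    for sample_labels in zip(*predictions):
        voted.append(Counter(sample_labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of three classifiers on four images.
clf_a = ["crush", "stabbing", "burns", "gunshot"]
clf_b = ["crush", "gunshot", "burns", "gunshot"]
clf_c = ["choking", "stabbing", "drowning", "gunshot"]

voted = majority_vote([clf_a, clf_b, clf_c])
print(voted)
```

A pruning-based variant would presumably drop weaker classifiers and re-vote, but that behavior is an assumption and is not implemented here.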
- Research Article
- 10.3390/sym18030527
- Mar 19, 2026
- Symmetry
- Longyan Qin + 2 more
PCB defect images suffer from tiny defects, subtle morphological differences and complex background wiring, making traditional single-feature classification unstable. This paper proposes a dual-branch image classification method combining a Transformer and CNN, which jointly models local anomalies and global semantic relationships. The model uses a convolutional branch and a Transformer branch to extract local defect features and global wiring dependencies, respectively. A cross-layer semantic interaction mechanism is adopted for multi-level information fusion, and a discriminative feature enhancement module is applied to highlight key defect regions and suppress background interference. Experiments show that the model improves overall accuracy by over 2%, with an F1-score of 0.930 and defect identification coverage of 0.927. It performs stably across different defect types and background complexities without obvious bias, providing new insights for hybrid deep model design in industrial defect image classification.
- Research Article
- 10.1038/s41598-026-43009-6
- Mar 16, 2026
- Scientific reports
- Shinya Matsumoto + 5 more
Sister chromatid cohesion (SCC) is mediated by a protein complex called cohesin and by regulatory proteins that control cohesin function. A commonly used approach to evaluate the involvement of cohesin regulatory proteins is to classify the shape of the chromosomes after depletion of the target protein and analyze their distribution. Currently, shape classification is often performed manually by researchers, which is not only time-consuming but also subject to individual interpretation. Therefore, our research group developed image classification models for automating chromosome shape classification. However, in this method, unclassifiable chromosomes that arise when cropping single chromosomes must be removed manually, creating a significant barrier to the fully automated detection of SCC-defective chromosomes. In this study, we propose a method that utilizes an object detection model to detect chromosomes with SCC defects without the need to crop single chromosomes. Several pretrained object detection models were selected and fine-tuned, and their performances were compared. Among the models, the one based on You Only Look Once v8 (YOLOv8) achieved a maximum concordance rate of 89.40% with manual analysis and successfully identified differences in the distribution of wild-type (WT) and DDX11−/− cells. These results indicate that the YOLOv8-based model enables fully automated analysis of SCC-defective chromosomes.
- Research Article
- 10.3390/s26061833
- Mar 14, 2026
- Sensors (Basel, Switzerland)
- Ci Liu + 2 more
Concept Bottleneck Models facilitate interpretable image classification by predicting human-understandable concepts prior to class labels. However, when constructed upon CLIP, they exhibit unreliable concept scores stemming from CLIP's global representation bias and insufficient region-level sensitivity, which severely constrain their effectiveness in sensor-driven applications like remote sensing and medical imaging where localized visual evidence is critical. To mitigate this, we propose the Local-Global Aware Concept Bottleneck Model (LGA-CBM), which improves concept prediction through a training-free refinement pipeline. Building on initial CLIP-derived concept scores, LGA-CBM incorporates three key components: a Dual Masking Guided Concept Score Refinement (DMCSR) module that exploits attention weights to strengthen region-concept alignment; a Local-to-Global Concept Reidentification (L2GCR) strategy to harmonize local and global activations; and a Similar Concepts Correction Mechanism (SCCM) integrating Grounding DINO for fine-grained disambiguation. A sparse linear layer then maps the refined concepts to class labels, enabling highly interpretable classification with minimal concept usage. Experiments across six benchmark datasets demonstrate that LGA-CBM consistently achieves state-of-the-art performance in both accuracy and interpretability, producing explanations that align closely with human cognition.
- Research Article
- 10.1007/s10278-026-01886-3
- Mar 11, 2026
- Journal of imaging informatics in medicine
- Xiaolong Yu + 4 more
With the rapid advancement of deep learning models in disease detection and medical image analysis, concerns regarding their security have become increasingly prominent. Under the threat of data poisoning attacks in particular, malicious actors may tamper with data or model parameters, significantly reduce model performance, and cause incorrect diagnoses or decisions, thereby posing a serious threat to patients' health and lives. To address this problem, we propose a novel defense scheme named Dweighted that integrates dual weighting with clustering analysis. The scheme comprehensively considers the size of each client's dataset, model parameter differences, and similarity analysis to dynamically adjust the i-th client's weight. Furthermore, it employs principal component analysis (PCA) and K-means clustering to accurately identify and eliminate malicious clients. Experimental results demonstrate that Dweighted significantly enhances the global model's security and robustness against data poisoning attacks while maintaining high classification accuracy. Compared to other baselines, Dweighted achieves an overall accuracy (All Acc) of 94.89% and reduces the attack success rate to 2.43% in the IID setting.
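One of the signals this scheme uses is the size of each client's dataset. As a rough sketch of that single ingredient, the following shows dataset-size-weighted averaging of client parameters. It deliberately omits the parameter-difference and similarity terms and the PCA/K-means filtering, whose exact form the abstract does not give; the parameter vectors and dataset sizes are hypothetical.

```python
def weighted_average(client_params, client_sizes):
    """Average client parameter vectors, weighting each by its dataset size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]

# Hypothetical two-client example (illustration only).
client_params = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [100, 300]

global_params = weighted_average(client_params, client_sizes)
print(global_params)
```

A defense along the lines described would additionally shrink or zero the weights of clients flagged as malicious before this averaging step.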
- Research Article
- 10.65770/qcnf4155
- Mar 9, 2026
- World Scientific News
- Olumba Confidence Chigozirim + 5 more
ABSTRACT With the increasing global demand for renewable energy, solar photovoltaic systems have been widely deployed. However, the operating efficiency and lifespan of solar panels are threatened by surface defects. Defects such as dust and bird droppings greatly affect performance. Traditional inspection methods such as manual visual assessment are time-consuming and expensive. With the development of deep learning, image classification techniques show bright prospects in defect detection. Unfortunately, many existing models suffer from high computational complexity and limited interpretability, which restrict their practical application. This project proposes a lightweight deep learning-based image classification model for solar panel surface condition classification that aims to address these limitations. The proposed model integrates depthwise separable convolutions, residual Inception-style branch structures, and a custom attention mechanism to achieve efficient feature extraction at low computational cost. In addition, three complementary explainable artificial intelligence (XAI) techniques (Grad-CAM, LIME, and Occlusion Analysis) are applied to enhance model transparency and interpretability. Experimental results on a four-category solar panel image dataset show that the proposed model achieves a test accuracy of 75.62%, an F1-score of 75.97%, and a ROC-AUC score of 0.9305, showing strong discrimination capability and good generalization performance.
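The low computational cost attributed to depthwise separable convolutions can be seen from a parameter count. This sketch compares a standard convolution with its depthwise separable counterpart; the kernel size and channel counts are hypothetical, as the abstract does not give the layer dimensions, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer shape (illustration only).
kernel, c_in, c_out = 3, 32, 64

standard = standard_conv_params(kernel, c_in, c_out)
separable = depthwise_separable_params(kernel, c_in, c_out)
print(f"standard: {standard}, separable: {separable}, ratio: {separable / standard:.3f}")
```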
- Research Article
- 10.1080/00949655.2026.2636782
- Mar 6, 2026
- Journal of Statistical Computation and Simulation
- Sridevi Gadde + 4 more
In this study, a chest X-ray (CXR) image classification model is developed to detect disease automatically at an early stage. In the first phase, the necessary CXR images are collected from online resources. The collected images are then passed to the second phase for segmentation. In this phase, a novel structure called Adaptive RefineNet (ARNet) is proposed for segmenting the collected CXR images. ARNet-based segmentation helps analyze abnormalities of different sizes in the CXR images. The segmentation performance of ARNet is enhanced by tuning its parameters with Augmented Language Education Optimization (ALEO). The segmented images from ARNet are passed to the final phase, where classification is performed using the Residual EfficientNet with LSTM layer (RE-LSTM). Finally, simulation analysis is conducted on the developed method to demonstrate the model's efficiency in CXR image classification.
- Research Article
- 10.1080/01431161.2026.2631698
- Mar 5, 2026
- International Journal of Remote Sensing
- Guoqing Zhou + 5 more
ABSTRACT Existing models based on quantum pixels can mine spectral information and entanglement features, but they are time-consuming in practice because entanglement is computed at the pixel level. For this reason, an innovative image classification model, named the ‘Quantum Superpixel Entanglement Classification (QSEC) model’, is proposed in this study. By treating superpixels in remote sensing images (RSIs) as quantum particles, we extend the paradigm of image classification from the pixel level to the superpixel level. This design fully leverages the spectral information, textural features, and inter-superpixel quantum entanglement characteristics inherent in RSIs, enabling the effective capture and processing of complex image information via quantum entanglement-driven computational frameworks. Furthermore, the framework quantizes RSI superpixels to map their greyscale and texture features onto corresponding quantum states, constructs a quantum superpixel entanglement pattern, extracts the entanglement relationships between superpixels using entanglement coefficients, and ultimately develops a classification method applicable to multispectral images. The results from the four groups of experiments demonstrated that: (1) the QSEC model achieves an average classification accuracy of 92.02% and an average Kappa coefficient (KC) of 0.86, with an average running time of 29.25 seconds; (2) the running time of the QSEC model decreased by up to 400 seconds and the classification speed improved by 75.29% compared with the neural network classification (NNC), support vector machine (SVM), and deep learning (DL) models; and (3) the QSEC model improves classification accuracy by 8.76% and increases KC by 0.12 compared with unsupervised classification models such as ISODATA and K-means.
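The Kappa coefficient reported alongside accuracy corrects observed agreement for the agreement expected by chance. A minimal sketch of the standard Cohen's kappa computation from a confusion matrix follows; the 2 × 2 matrix is hypothetical and unrelated to the paper's experiments.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: truth, columns: prediction)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical two-class confusion matrix (illustration only).
confusion = [
    [45, 5],
    [10, 40],
]

kappa = cohens_kappa(confusion)
print(f"kappa = {kappa:.2f}")
```

In this toy case the overall accuracy is 0.85, but kappa is lower because half of that agreement would be expected by chance.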
- Research Article
- 10.54254/2755-2721/2026.ch31978
- Mar 2, 2026
- Applied and Computational Engineering
- Nan Jiang
The performance of image classification models depends greatly on the architectural decisions made. Fashion-MNIST, a mainstream benchmark adopted by researchers for model performance analysis, provides an avenue for the systematic comparison of different model architectures. In this paper, we comparatively study the performance of Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), Random Forests, and Residual Networks (ResNet) on this dataset, and find that one reason for the excellent performance of convolutional networks may lie in the inherent ability of convolutional layers to extract spatial features. Although a deeper ResNet-34 performs well (91.15%), its large number of parameters makes it less efficient for general tasks. To improve efficiency, we find that increasing the number of channels in the first convolutional layer from 32 to 64 yields an accuracy (92.44%) that is the highest obtained, which verifies the effectiveness of width optimization. In summary, for applications such as Fashion-MNIST, a width-optimized convolutional network architecture achieves the best balance between accuracy and efficiency. We show empirically that, for image classification tasks, model selection and lightweight design are significantly influenced by adopting appropriate architectural optimizations.
- Research Article
- 10.1016/j.inffus.2025.103811
- Mar 1, 2026
- Information Fusion
- Yichu Xu + 4 more
MambaMoE: Mixture-of-spectral-spatial-experts state space model for hyperspectral image classification
- Research Article
- 10.1016/j.inffus.2025.103737
- Mar 1, 2026
- Information Fusion
- Anabia Sohail + 5 more
• ConVLM addresses the limitation of coarse alignment in existing VLMs by introducing context-guided token learning and enhancement, enabling fine-level image-text interaction that captures subtle morphological details in histology images.
• The model selectively removes irrelevant visual tokens and enhances relevant ones through integrated modules across encoder layers, resulting in richer and more discriminative visual representations for downstream tasks.
• ConVLM is trained using a novel context-guided token learning loss, which guides the model to focus on contextually important tissue structures, improving generalization and interoperability.
• Evaluated on both ROI-level and WSI-level classification tasks, ConVLM outperforms SOTA models, demonstrating robust generalization across diverse histopathology datasets and tasks such as cancer subtype prediction and survival analysis.
Vision-Language Models (VLMs) have recently demonstrated exceptional results across various Computational Pathology (CPath) tasks, such as Whole Slide Image (WSI) classification and survival prediction. These models utilize large-scale datasets to align images and text by incorporating language priors during pre-training. However, the separate training of text and vision encoders in current VLMs leads to only coarse-level alignment, failing to capture the fine-level dependencies between image-text pairs. This limitation restricts their generalization in many downstream CPath tasks. In this paper, we propose a novel approach that enhances the capture of finer-level context through language priors, which better represent the fine-grained tissue morphological structures in histology images. We propose a Context-guided Vision-Language Model (ConVLM) that generates contextually relevant visual embeddings from histology images. ConVLM achieves this by employing context-guided token learning and token enhancement modules to identify and eliminate contextually irrelevant visual tokens, refining the visual representation. These two modules are integrated into various layers of the ConVLM encoders to progressively learn context-guided visual embeddings, enhancing visual-language interactions. The model is trained end-to-end using a context-guided token learning-based loss function. We conducted extensive experiments on 20 histopathology datasets, evaluating both Region of Interest (ROI)-level and cancer subtype WSI-level classification tasks. The results indicate that ConVLM significantly outperforms existing State-of-the-Art (SOTA) vision-language and foundational models. Our source code and pre-trained model are publicly available at: https://github.com/BasitAlawode/ConVLM
- Research Article
- 10.1016/j.eswa.2025.130064
- Mar 1, 2026
- Expert Systems with Applications
- Zhiwen Wang + 7 more
DSA mamba: A model for advanced medical image classification
- Research Article
- 10.1016/j.rineng.2025.108947
- Mar 1, 2026
- Results in Engineering
- Wenqiang Hua + 2 more
Knowledge and data co-driven deep learning model for PolSAR image classification