Dynamic Activation Function Based on the Branching Process and its Application in Image Classification
The choice of activation function in deep learning is crucial to neural network performance. The activation function used in conventional deep learning remains unchanged across neural networks of different depths, leading to performance degradation as model depth increases. In this paper, we propose a $\operatorname{sigmoid}_{n}$ dynamic activation function that changes with the depth of the neural network. We first introduce the dual relationship between the activation function and the probability generating function (PGF) from the perspective of the branching process, and explain why the performance of models with different activation functions degrades as the neural network deepens. Then, we use the law of large numbers for the supercritical branching process to optimize the PGF and derive the $\operatorname{sigmoid}_{n}$ dynamic activation function through the duality between the PGF and the activation function. Finally, to better extract the spatial context information of the image, we add a convolution channel on top of the $\operatorname{sigmoid}_{n}$ dynamic activation function and propose a two-dimensional $\operatorname{Fsigmoid}_{n}$ dynamic activation function. Experiments on the CIFAR-10 and CIFAR-100 datasets verify the superiority of the proposed $\operatorname{sigmoid}_{n}$ activation function.
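The exact form of $\operatorname{sigmoid}_{n}$ is derived in the paper from the branching-process duality and is not reproduced here. Purely as an illustrative sketch of a depth-indexed activation family, one could rescale the logistic slope as a function of the layer depth $n$; the $\sqrt{n}$ rule below is an assumption for illustration only, not the paper's formula:

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_n(x, n, alpha=1.0):
    """Hypothetical depth-indexed sigmoid: the slope is rescaled with the
    layer depth n, so the nonlinearity changes as the network deepens.
    The sqrt(n) scaling is an illustrative assumption, NOT the
    branching-process-derived formula from the paper."""
    return sigmoid(alpha * np.sqrt(n) * x)
```

The point of any such family is only that the nonlinearity applied at layer $n$ depends on $n$, instead of being one fixed function reused at every depth.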
- Research Article
1
- 10.54254/2755-2721/80/2024ch0068
- Nov 26, 2024
- Applied and Computational Engineering
As one of the important research directions in computer vision, image classification has a wide range of applications and is an important foundation for pattern recognition, machine learning, and artificial intelligence. Image classification generally includes three steps: region-of-interest selection, feature extraction, and classifier modeling. Feature extraction is the foundation for the other tasks: in most pattern recognition scenarios, an appropriate feature representation directly affects the performance of the entire classification system. The most representative approach is the deep learning method, which learns complex feature representations directly from massive data. This article first reviews the research background and development history of image classification, then analyzes its applications in different fields and lists case studies. It also outlines traditional image classification methods and mainstream deep learning techniques, including convolutional neural networks, model pre-training, and data augmentation. Finally, the main challenges currently faced by image classification are analyzed, and future directions for improvement and complementary technologies are discussed. This review aims to provide researchers in image classification with promising research directions.
- Research Article
2
- 10.1108/ijwis-05-2024-0135
- Jul 12, 2024
- International Journal of Web Information Systems
Purpose: The purpose of this study is to explore the potential of trainable activation functions to enhance the performance of deep neural networks, specifically ResNet architectures, in the task of image classification. By introducing activation functions that adapt during training, the authors aim to determine whether such flexibility can lead to improved learning outcomes and generalization capabilities compared to static activation functions like ReLU. This research seeks to provide insights into how dynamic nonlinearities might influence deep learning models' efficiency and accuracy in handling complex image data sets.
Design/methodology/approach: This research integrates three novel trainable activation functions - CosLU, DELU and ReLUN - into various ResNet-n architectures, where "n" denotes the number of convolutional layers. Using the CIFAR-10 and CIFAR-100 data sets, the authors conducted a comparative study to assess the impact of these functions on image classification accuracy. The approach included modifying the traditional ResNet models by replacing their static activation functions with the trainable variants, allowing for dynamic adaptation during training. Performance was evaluated based on accuracy metrics and loss profiles across different network depths.
Findings: The findings indicate that trainable activation functions, particularly CosLU, can significantly enhance the performance of deep learning models, outperforming the traditional ReLU in deeper network configurations on the CIFAR-10 data set. CosLU showed the highest improvement in accuracy, whereas DELU and ReLUN offered varying levels of performance enhancement. These functions also demonstrated potential in reducing overfitting and improving model generalization on more complex data sets like CIFAR-100, suggesting that the adaptability of activation functions plays a crucial role in the training dynamics of deep neural networks.
Originality/value: This study contributes to the field of deep learning by introducing and evaluating the impact of three novel trainable activation functions within widely used ResNet architectures. Unlike previous works that primarily focused on static activation functions, this research demonstrates that incorporating trainable nonlinearities can lead to significant improvements in model performance and adaptability. The introduction of CosLU, DELU and ReLUN provides a new pathway for enhancing the flexibility and efficiency of neural networks, potentially setting a new standard for future deep learning applications in image classification and beyond.
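For a concrete picture of what a trainable activation looks like, here is a minimal NumPy sketch of a ReLUN-style function, assuming the common definition $\mathrm{ReLUN}(x)=\min(\max(0,x),n)$ with a learnable cap $n$; in a real framework (e.g. PyTorch) the cap would be a registered parameter updated by backpropagation, and only the forward pass and the gradient with respect to $n$ are shown here:

```python
import numpy as np

class ReLUN:
    """Sketch of a ReLU with a trainable upper cap n, assuming the
    definition ReLUN(x) = min(max(0, x), n). Only the forward pass and
    the gradient w.r.t. n are shown; a deep learning framework would
    handle the parameter update during training."""

    def __init__(self, n=6.0):
        self.n = n  # trainable cap (a learnable parameter in practice)

    def forward(self, x):
        return np.minimum(np.maximum(0.0, x), self.n)

    def grad_n(self, x):
        # n receives gradient only where the cap is active (x > n)
        return (x > self.n).astype(float)
```

Because $n$ appears in the output wherever the cap is hit, gradient descent can widen or narrow the saturation region as training progresses, which is exactly the kind of adaptation the study evaluates.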
- Research Article
78
- 10.1109/access.2022.3208131
- Jan 1, 2022
- IEEE Access
The popularity of adopting deep neural networks (DNNs) to solve hard problems has increased substantially. Specifically, in the field of computer vision, DNNs are becoming a core element in developing many image and video classification and recognition applications. However, DNNs are vulnerable to adversarial attacks, in which, given a well-trained image classification model, a malicious input can be crafted by adding mere perturbations to misclassify the image. This phenomenon raises many security concerns about utilizing DNNs in critical life applications and has attracted the attention of academic and industry researchers. As a result, multiple studies have proposed novel attacks that can compromise the integrity of state-of-the-art image classification neural networks. The rise of these attacks urges the research community to explore countermeasures to mitigate them and increase the reliability of adopting DNNs in major applications. Hence, various defense strategies have been proposed to protect DNNs against adversarial attacks. In this paper, we thoroughly review the most recent state-of-the-art adversarial attack methods, providing an in-depth analysis and explanation of how these attacks work. In our review, we focus on explaining the mathematical concepts and terminology of adversarial attacks, which provides a comprehensive and solid survey for the research community. Additionally, we provide a comprehensive review of the most recent defense mechanisms and discuss their effectiveness in defending DNNs against adversarial attacks. Finally, we highlight the current challenges and open issues in this field as well as future research directions.
- Conference Article
3
- 10.1109/acssc.2014.7094400
- Nov 1, 2014
Recent studies have indicated the efficacy of selecting and combining salient features from a pool of feature types in image retrieval and classification applications. In contrast to previous work, we approach this problem as the selection and combination of the salient feature type(s) from a pool of feature types rather than the selection of an individual feature. Our approach utilizes multiple kernels within a dictionary-learning framework in which a combination of dictionary atoms represents individual categories. The category-specific feature combination parameters, i.e., the weights for kernel combination, are determined by mutual information techniques. The method is compared to a meta-algorithm for feature nomination. The multi-kernel dictionary learning method yields, on average, a 10% increase in classification accuracy with respect to the meta-algorithm in our preliminary experiments.
- Research Article
144
- 10.1016/j.jag.2021.102603
- Nov 6, 2021
- International Journal of Applied Earth Observation and Geoinformation
Hyperspectral image classification on insufficient-sample and feature learning using deep neural networks: A review
- Research Article
3
- 10.51662/jiae.v2i2.80
- Sep 24, 2022
- Journal of Integrated and Advanced Engineering (JIAE)
With the advancement of remote sensing technologies, hyperspectral imagery has garnered significant interest in the remote sensing community. These developments have inspired improvements in various hyperspectral image (HSI) classification applications, such as land cover mapping, among other earth observation applications. Deep neural networks have revolutionized image classification tasks in computer vision. However, in the domain of hyperspectral images, insufficient training samples have been identified as a significant bottleneck for supervised HSI classification. Moreover, acquiring HSI from satellites and other remote sensors is expensive. Thus, researchers have turned to generative models, particularly generative adversarial networks (GANs), to leverage existing data and increase the number of training samples. This paper explores the use of a vanilla GAN to generate synthetic data. The network employed in this paper was built using the deep learning Python package PyTorch and tested on the popular Indian Pines HSI dataset. The network achieved an overall accuracy of 64%. While promising, there is still room for improvement.
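The vanilla GAN objective referenced above can be written down compactly. The sketch below computes the discriminator loss and the commonly used non-saturating generator loss from discriminator output probabilities; it illustrates the training signal only and is independent of the paper's actual PyTorch network:

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Vanilla GAN losses given the discriminator's probabilities on
    real samples (d_real) and generated samples (d_fake).
    Discriminator: maximize log D(x) + log(1 - D(G(z))).
    Generator (non-saturating form): maximize log D(G(z))."""
    eps = 1e-12  # numerical guard against log(0)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

In a synthetic-data pipeline, the generator trained against this objective is then sampled to augment the scarce labeled HSI pixels.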
- Research Article
22
- 10.1109/tgrs.2021.3054037
- Mar 6, 2021
- IEEE Transactions on Geoscience and Remote Sensing
The deep learning-based method has shown promising competence in image classification. Its success can be attributed to the ability to learn discriminative feature representations given plenty of labeled data. However, in real hyperspectral image (HSI) classification applications, since pixel labeling is difficult and costly, the labels we can obtain within an HSI are always limited and noisy (i.e., inaccurate), which consequently causes overfitting of the deep learning-based method. To address this problem, we propose a novel unified deep learning network that employs both labeled and unlabeled data for training, with which the unsupervised structure knowledge, e.g., intracluster similarity and intercluster dissimilarity, inherently contained in the unlabeled data can be exploited to boost conventional supervised classification. Specifically, we first explore the unsupervised structure knowledge in unlabeled data via a clustering method and formulate a supervised clustering task on those data with the obtained cluster labels. Then, we propose a multitask network to jointly address both the conventional classification task and the formulated supervised clustering task. With a shared feature extraction module and a high-level feature fusion module, the unsupervised structure knowledge contained in unlabeled data can be effectively introduced into the classification task, which is beneficial for learning a more discriminative feature representation and thus well mitigates the overfitting problem and improves the classification results. Experimental results on three data sets demonstrate that the proposed method can effectively label the unlabeled data within an HSI, especially when the training labels are limited and noisy.
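The "supervised clustering task on unlabeled data" can be illustrated with a plain k-means pseudo-labeling step. This is a minimal sketch of the idea only, not the paper's pipeline, which clusters deep features produced by a shared extraction module rather than raw vectors:

```python
import numpy as np

def kmeans_pseudo_labels(X, k, iters=50, seed=0):
    """Assign k-means cluster pseudo-labels to unlabeled samples X of
    shape (n, d). The returned labels can serve as targets for an
    auxiliary 'supervised clustering' head in a multitask network."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # squared distance of every sample to every center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):  # keep old center if cluster empties
                centers[j] = X[labels == j].mean(0)
    return labels
```

Jointly training the classification head on the scarce true labels and a clustering head on these pseudo-labels is what lets the unlabeled pixels regularize the shared representation.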
- Research Article
64
- 10.1155/2022/3351256
- Jul 19, 2022
- Advances in Multimedia
Traditional artificial neural networks and machine learning methods not only struggle to meet the processing needs of massive images in feature extraction and model training, but also show low efficiency and low classification accuracy when applied to image classification. Therefore, this paper proposed a deep learning model for image classification, aiming to provide a foundation and support for image classification and recognition on large datasets. Firstly, based on an analysis of the basic theory of neural networks, this paper expounded the different types of convolutional neural networks and the basic process of their application in image classification. Secondly, based on the existing convolutional neural network model, noise reduction and parameter adjustment were carried out in the feature extraction process, and a deep learning model for image classification was proposed based on the improved convolutional neural network structure. Finally, the structure of the deep learning model was optimized to improve the classification efficiency and accuracy of the model. To verify the effectiveness of the proposed deep learning model in image classification, the relationship between the accuracy of several common network models and the number of iterations was compared through experiments. The results showed that the proposed model outperformed the other models in classification accuracy. In addition, the classification accuracy of the deep learning model before and after optimization was compared and analyzed using the training and test sets. The results showed that the accuracy of image classification was greatly improved after the proposed model was optimized.
- Conference Article
2
- 10.1109/iasp.2011.6109054
- Oct 1, 2011
Various types of orthogonal moments have been widely used for object recognition and classification. This paper presents an effective way of extracting texture features, Bessel-Fourier moments, for image retrieval and classification applications. The Bessel-Fourier moments are calculated for rotation invariance and represent global features better than orthogonal Fourier-Mellin and Zernike moments. To evaluate Bessel-Fourier moments in CBIR and image classification experiments, we conduct experiments on four databases: the first is a small 2D texture database of 400 images, and the rest are 2D color image databases of 1200 images each, with different relevant images, drawn from the Amsterdam Library of Object Images (ALOI). The experiments show that feature descriptors extracted with the proposed algorithm perform better for image retrieval and classification than conventional descriptors of the same order, as measured by retrieval accuracy.
- Conference Article
2
- 10.1109/cei52496.2021.9574493
- Sep 24, 2021
In recent years, with the rapid development of computer software and hardware, using deep learning to process images has become increasingly popular. Therefore, many scholars also focus on deep neural networks for hyperspectral image (HSI) classification. This article mainly introduces deep neural networks in HSI processing, including stacked auto-encoders, deep belief networks, and convolutional neural networks (CNNs). Because of the significant advantages of CNNs for HSI processing, this article also summarizes the methods by which scholars have used CNNs for image classification over the years, and reviews various classification networks related to the CNN architecture. After that, this article compares the advantages, disadvantages, and characteristics of the different networks. Finally, in light of existing problems, some future directions are proposed for HSI classification.
- Conference Article
1
- 10.1117/12.955699
- Dec 8, 1977
- Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
For many pattern classification and pattern recognition applications, the multispectral data is first used to obtain a classified image (map). This image is then used for different image data extraction and classification applications. It is important that a particular bandwidth compression method not result in significant changes in the resulting classification map. In this article, the performance of a hybrid encoder (Hadamard/DPCM) in retaining the classification accuracy of the classified image is evaluated. It is shown that, using a Bayes supervised classifier, the classification accuracy of the bandwidth-compressed picture is actually higher than that of the original picture.
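The Hadamard half of the hybrid encoder can be sketched in a few lines. The illustrative transform below uses the standard Sylvester construction (power-of-two block sizes); the DPCM stage of the hybrid is omitted, and this is not the 1977 implementation:

```python
import numpy as np

def hadamard(n):
    """Build the n x n Hadamard matrix (n a power of two) by the
    Sylvester construction: H_{2m} = [[H_m, H_m], [H_m, -H_m]]."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def hadamard_2d(block):
    """2-D Hadamard transform of a square image block. Since
    H @ H.T = n I, applying the transform twice recovers the block."""
    n = block.shape[0]
    H = hadamard(n)
    return H @ block @ H.T / n
```

In a hybrid encoder, the Hadamard transform decorrelates each block and a DPCM stage then codes the transform coefficients differentially across blocks.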
- Research Article
- 10.61356/j.nois.2025.8646
- Dec 24, 2025
- Neutrosophic Optimization and Intelligent Systems
Image classification is a fundamental problem in computer vision that involves assigning a label to an image based on its content. In this paper, we survey both traditional machine learning-based and deep learning-based image classification methods. Specifically, we review two deep learning-based approaches: a convolutional neural network (CNN) method and a pre-trained CNN-based transfer learning method. We first briefly review traditional machine learning-based methods and then deep learning-based methods, discussing both the feature extraction and the classification components of the latter. We discuss several deep neural network architectures for image classification, including LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet. We then discuss applications of image classification, compare the different methods both qualitatively and quantitatively across various factors, and present our experimental results, while also summarizing methods of data augmentation, batch normalization, and regularization by dropout. We introduce techniques for the practical training of deep networks; discuss fine-tuning, pruning, and model quantization for efficient inference with trained neural networks; and finally introduce the runtime engines used to deploy trained neural networks for efficient inference.
- Conference Article
47
- 10.1109/itca52113.2020.00043
- Dec 1, 2020
Recently, deep learning has emerged as a powerful tool and has become a leading machine learning approach in computer vision and image analysis. In this survey paper, we provide a snapshot of this fast-growing field, focusing specifically on image classification. We briefly introduce several popular neural networks and summarize their applications in image classification. In addition, we discuss the challenges of deep learning in image classification.
- Research Article
249
- 10.1117/1.oe.58.4.040901
- Apr 11, 2019
- Optical Engineering
In recent years, convolutional neural networks (CNNs) have been widely used in various computer visual recognition tasks and have achieved good results compared with traditional methods. Image classification is one of the basic and important tasks of visual recognition, and the CNN architectures applied to other visual recognition tasks (such as object detection, object localization, and semantic segmentation) are generally derived from the network architectures used in image classification. We first summarize the development history of CNNs and then analyze the architectures of various deep CNNs for image classification. Furthermore, not only has innovation in network architecture benefited image classification results, but improvements in network optimization and training methods have as well. We also analyze the effect of each of these methods, and the experimental results of the various methods are compared. Finally, the future development of CNNs is discussed.
- Research Article
1
- 10.54097/hset.v39i.6570
- Apr 1, 2023
- Highlights in Science, Engineering and Technology
Image classification technology processes and analyzes image data to extract valuable feature information that distinguishes different types of images, thereby completing the process of machine cognition and understanding of image data. As the cornerstone of the image application field, image classification technology spans a wide range of application domains. Class-imbalanced distributions are ubiquitous in image classification applications and constitute one of the main problems in image classification research. This study summarizes the recent literature on class-imbalanced image classification methods and analyzes them at both the data level and the algorithm level. Among the data-level methods, oversampling, undersampling, and mixed sampling are introduced, and the performance of the algorithms reported in the literature is summarized and analyzed. The algorithm-level methods are introduced and analyzed in terms of classifier optimization and ensemble learning. All image classification methods are analyzed in detail in terms of advantages, disadvantages, and datasets.
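The data-level oversampling idea surveyed here can be sketched as a random-oversampling routine that duplicates minority-class samples until every class matches the majority count; this is a minimal illustration of the general technique, not any specific method from the surveyed literature:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Balance a class-imbalanced dataset (X: samples, y: labels) by
    duplicating randomly chosen minority-class samples until every
    class reaches the majority-class count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, cnt in zip(classes, counts):
        if cnt < target:
            idx = np.flatnonzero(y == c)
            extra = rng.choice(idx, size=target - cnt, replace=True)
            Xs.append(X[extra])
            ys.append(y[extra])
    return np.concatenate(Xs), np.concatenate(ys)
```

Undersampling is the mirror image (randomly discarding majority-class samples down to the minority count), and mixed sampling combines the two.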