Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review.
Convolutional neural networks (CNNs) have been applied to visual tasks since the late 1980s. However, despite a few scattered applications, they were dormant until the mid-2000s when developments in computing power and the advent of large amounts of labeled data, supplemented by improved algorithms, contributed to their advancement and brought them to the forefront of a neural network renaissance that has seen rapid progression since 2012. In this review, which focuses on the application of CNNs to image classification tasks, we cover their development, from their predecessors up to recent state-of-the-art deep learning systems. Along the way, we analyze (1) their early successes, (2) their role in the deep learning renaissance, (3) selected symbolic works that have contributed to their recent popularity, and (4) several improvement attempts by reviewing contributions and challenges of over 300 publications. We also introduce some of their current trends and remaining challenges.
- Research Article
29
- 10.1016/j.jid.2020.07.034
- Sep 12, 2020
- Journal of Investigative Dermatology
Clinically Relevant Vulnerabilities of Deep Machine Learning Systems for Skin Cancer Diagnosis
- Research Article
5
- 10.21271/zjpas.34.2.3
- Apr 12, 2022
- ZANCO JOURNAL OF PURE AND APPLIED SCIENCES
Comprehensive Study for Breast Cancer Using Deep Learning and Traditional Machine Learning
- Research Article
19
- 10.13052/2245-1439.825
- Jan 17, 2018
- Journal of Cyber Security and Mobility
The impressive gain in performance obtained using deep neural networks (DNNs) for various tasks encouraged us to apply DNNs to the image classification task. We have used a variant of DNN called the deep convolutional neural network (DCNN) for feature extraction and image classification. Neural networks can be used for classification as well as for feature extraction. Our work is best seen as two different tasks. In the first task, the DCNN is used for both feature extraction and classification. In the second task, features are extracted using the DCNN and then an SVM, a shallow classifier, is used to classify the extracted features. The performance of these tasks is compared. Various configurations of DCNN are used in our experimental studies. Among the different architectures we considered, the architecture with 3 levels of convolutional and pooling layers, followed by a fully connected output layer, is used for feature extraction. In task 1, the DCNN-extracted features are fed to a neural network with 2 hidden layers for classification. In task 2, an SVM is used to classify the features extracted by the DCNN. Experimental studies show that the performance of ν-SVM classification on DCNN features is slightly better than that of neural network classification on DCNN-extracted features.
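The alternating convolution/pooling feature extractor described above can be sketched in a few lines of NumPy. This is a minimal single-channel illustration, not the paper's actual configuration: the 32×32 input, 3×3 random kernels, and single filter per level are placeholder assumptions.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size  # trim to a multiple of the pool size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def extract_features(image, kernels, levels=3):
    """Alternate convolution (with ReLU) and pooling, then flatten."""
    fmap = image
    for k in kernels[:levels]:
        fmap = np.maximum(conv2d(fmap, k), 0.0)  # ReLU nonlinearity
        fmap = max_pool(fmap)
    return fmap.ravel()

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
features = extract_features(image, kernels)  # 3 conv/pool levels -> 2x2 map
```

The flattened vector produced this way is the kind of representation that would be handed either to a small fully connected network (task 1) or to a shallow ν-SVM classifier such as scikit-learn's `NuSVC` (task 2).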
- Research Article
169
- 10.1142/s0219530518500124
- Nov 1, 2018
- Analysis and Applications
Deep learning based on structured deep neural networks has provided powerful applications in various fields. The structures imposed on the deep neural networks are crucial, which makes deep learning essentially different from classical schemes based on fully connected neural networks. One of the commonly used deep neural network structures is generated by convolutions; the resulting deep learning algorithms form the family of deep convolutional neural networks. Despite their power in some practical domains, little is known about the mathematical foundations of deep convolutional neural networks, such as universality of approximation. In this paper, we propose a family of new structured deep neural networks: deep distributed convolutional neural networks. We show that these deep neural networks have the same order of computational complexity as deep convolutional neural networks, and we prove their universality of approximation. Some ideas in our analysis come from ridge approximation, wavelets, and learning theory.
- Supplementary Content
4
- 10.17635/lancaster/thesis/407
- Jan 1, 2018
- University of Lancaster
Machine learning, as a subarea of artificial intelligence, is widely believed to reshape the human world in the coming decades. This thesis is focused on both unsupervised and supervised self-organising transparent machine learning techniques. One particularly interesting aspect is transparent self-organising deep learning systems. Traditional data analysis approaches and most machine learning algorithms are built upon the basis of probability theory and statistics. The solid mathematical foundation of probability theory and statistics guarantees the good properties of these learning algorithms when the amount of data tends to infinity and all the data comes from the same distribution. However, the prior assumptions of randomness and identical distribution imposed on the data generation model are often too strong and impractical in real applications. Moreover, traditional machine learning algorithms also require a number of free parameters to be predefined; without any prior knowledge of the problem, which is often the case in real situations, their performance can be largely influenced by an improper choice. Deep learning-based approaches are currently the state-of-the-art techniques in the fields of machine learning and computer vision. However, they also suffer from a number of deficiencies, including the computational burden of training on huge amounts of data, lack of transparency and interpretability, ad hoc decisions about the internal structure, no proven convergence for adaptive versions that rely on reinforcement learning, limited parallelisation, offline training, etc. These shortcomings hinder the wider application of deep learning in real situations.
The novel approaches presented in this thesis are developed within the Empirical Data Analytics framework, an alternative and more advanced computational methodology than the traditional approaches, based on the ensemble properties and mutual distribution of the empirical discrete observations. The novel self-organising transparent machine learning algorithms presented in this work for clustering, regression, classification and anomaly detection are autonomous, self-organising, data-driven and free from user- and problem-specific parameters. They do not impose any data generation model on the data a priori, but are driven by the empirically observed data and are able to produce objective results without prior knowledge of the problem. In addition, they are highly efficient and suitable for large-scale static/streaming data processing. The newly proposed self-organising transparent deep learning systems are able to achieve human-level performance comparable to or even better than deep convolutional neural networks on image classification problems, with the merits of being fully transparent, self-evolving, highly efficient, parallelisable and human-interpretable. More importantly, the proposed deep learning systems are able to start classification from the very first image of each class, in the same way humans do. Numerical examples based on numerous challenging benchmark problems, together with comparisons against state-of-the-art approaches, demonstrate the validity and effectiveness of the proposed machine learning algorithms and deep learning systems and show their potential for real applications.
- Addendum
10
- 10.1016/j.matpr.2021.02.186
- Mar 1, 2021
- Materials Today: Proceedings
WITHDRAWN: Role of convolutional neural networks for any real time image classification, recognition and analysis
- Research Article
- 10.18282/jnt.v2i2.886
- Aug 6, 2020
- Journal of Networking and Telecommunications
As an important research achievement in the field of brain-like computing, the deep convolutional neural network has been widely used in many fields such as computer vision, natural language processing, information retrieval, speech recognition, and semantic understanding. It has set off a wave of neural network research in industry and academia and promoted the development of artificial intelligence. At present, deep convolutional neural networks mainly simulate the complex hierarchical cognitive laws of the human brain by increasing the number of network layers, using larger training data sets, and improving the network structure or training algorithms of existing networks, so as to narrow the gap with the human visual system and enable machines to acquire the capability of "abstract concepts". Deep convolutional neural networks have achieved great success in many computer vision tasks such as image classification, target detection, face recognition, and pedestrian recognition. This paper first reviews the development history of convolutional neural networks and then analyzes the working principle of the deep convolutional neural network in detail. It then introduces representative achievements of convolutional neural networks from two aspects, showing through examples the effect of various technical methods on image classification accuracy. On the aspect of adding network layers, the structures of classical convolutional neural networks such as AlexNet, ZF-Net, VGG, GoogLeNet, and ResNet are discussed and analyzed. On the aspect of increasing the size of the data set, the difficulties of manually adding labeled samples and the effect of data augmentation techniques on improving network performance are introduced. The paper focuses on the latest research progress of convolutional neural networks in image classification and face recognition. Finally, the problems and challenges to be solved in future brain-like intelligence research based on deep convolutional neural networks are proposed.
- Discussion
8
- 10.1016/j.ejmp.2021.05.008
- Mar 1, 2021
- Physica Medica
Focus issue: Artificial intelligence in medical physics.
- Research Article
26
- 10.1109/tevc.2022.3225591
- Aug 1, 2024
- IEEE Transactions on Evolutionary Computation
Deep convolutional neural networks have become a dominant solution for numerous image classification tasks. However, a main criticism is their poor explainability due to their black-box characteristic, which hinders the extensive use of deep convolutional neural networks. To address this issue, this paper proposes a new evolutionary multi-objective based method, which aims to explain the behaviours of deep convolutional neural networks by evolving local explanations on specific images. To the best of our knowledge, this is the first evolutionary multi-objective method to evolve local explanations. The proposed method is model-agnostic, i.e. it is applicable to explaining any deep convolutional neural network. ImageNet is used to examine the effectiveness of the proposed method. Three well-known deep convolutional neural networks, VGGNet, ResNet, and MobileNet, are chosen to demonstrate the model-agnostic characteristic. Based on the experimental results, the local explanations are understandable to end-users, who need to check the sensibility of the evolved explanations to decide whether to trust the predictions made by the deep convolutional neural networks. Furthermore, the local explanations evolved by the proposed method improve the confidence with which deep convolutional neural networks make their predictions. Lastly, the Pareto front and convergence analyses indicate that the proposed method can form a good set of non-dominated solutions.
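The closing point about the Pareto front can be illustrated generically. The sketch below is not the paper's evolutionary method; it only shows the non-dominated filtering step that any multi-objective approach ends with, using two hypothetical objectives (say, explanation sparsity versus fidelity loss, both to be minimised).

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated subset of a set of objective vectors,
    assuming every objective is to be minimised."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        # p is dominated if some other point is <= in every objective
        # and strictly < in at least one
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return points[keep]

# Hypothetical (sparsity, fidelity-loss) scores for four candidate explanations
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)  # (3.0, 4.0) is dominated by (2.0, 3.0)
```

An end-user would then pick one trade-off point from the surviving front rather than trust a single score.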
- Research Article
11
- 10.1007/s11042-021-10916-x
- Apr 17, 2021
- Multimedia Tools and Applications
Accurate food image classification is often critical to accurately monitoring dietary assessment and reducing the risk of heart-related diseases, obesity, diabetes, and other related health conditions. The accuracy and efficiency of image classification using traditional deep learning methods were less than optimal. This research aimed at enhancing the classification and prediction accuracy of food images and reducing the processing time by using the Deep Convolutional Neural Network (DCN) algorithm. The solution starts with a modified loss function: the images are fed into the DCN for feature extraction by alternating between convolutional layers and pooling layers, followed by a fully connected layer; finally, the softmax function is used to classify the images. Results were compared during the classification phase in the DCN. The proposed solution enhanced classification accuracy by using the regularized loss function and lowered the processing time by decreasing the weights of the neurons in the neural network. The probability score is used as the evaluation metric for accuracy, and total execution time as the evaluation metric for the speed of the algorithm. The combination of a deep neural network with a regularized cross-entropy cost function improved fast-food image classification, achieving a better processing time by 40 to 50 s and better accuracy by 5% on average.
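The abstract does not give the exact form of its regularized loss. A common construction consistent with the description, cross-entropy over softmax outputs plus an L2 penalty on the network weights, might look like the following sketch; the penalty strength `lam` and the toy inputs are assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def regularized_cross_entropy(logits, labels, weights, lam=1e-3):
    """Mean cross-entropy over softmax probabilities plus an L2 weight penalty."""
    probs = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    l2 = lam * sum(np.sum(w ** 2) for w in weights)
    return ce + l2

# Toy check: uniform logits over 3 classes give cross-entropy ln(3),
# and a single 2x2 all-ones weight matrix adds lam * 4 of penalty
loss = regularized_cross_entropy(np.zeros((4, 3)),
                                 np.array([0, 1, 2, 0]),
                                 [np.ones((2, 2))])
```

Shrinking the weights through the penalty term is one plausible reading of how the method "decreases the weights of the neurons" to cut processing time.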
- Research Article
- 10.30534/ijatcse/2022/011122022
- Nov 2, 2022
- International Journal of Advanced Trends in Computer Science and Engineering
With the growth of big data and successes in computer vision problems, additional hidden layers give CNNs a deeper, more complicated structure and more powerful representational capability. Convolutional Neural Networks (CNNs) provide an opportunity for automatically learning domain-specific features. The convolutional neural network is modelled and trained by means of deep learning, and the algorithm has made great achievements in computer vision since its creation. This paper first explains the rise and structure of deep learning and the convolutional neural network (CNN), and summarizes the structure of the CNN and its different operations, such as convolution, feature extraction, and pooling. The development of deep-learning-based convolutional neural network models for image classification is reviewed through an extensive literature survey of CNNs, the most widely used framework of deep learning. With AlexNet, the ImageNet-winning base model for CNN image classification, as a starting point, we review the variants that have emerged over the years to suit various applications, with a short discussion of the structure and working of CNNs.
- Research Article
12
- 10.1007/s44163-022-00035-3
- Oct 3, 2022
- Discover Artificial Intelligence
The practice of using deep learning methods in safety-critical vision systems such as autonomous driving has come a long way. As vision systems supported by deep learning methods become ubiquitous, the possible security threats faced by these systems have come into greater focus. As with any artificial intelligence system, these deep neural vision networks are first trained on a data set of interest; once they start performing well, they are deployed to a real-world environment. In the training stage, deep learning systems are susceptible to data poisoning attacks. While deep neural networks have proved versatile in solving a host of challenges, these systems have complex data ecosystems, especially in computer vision. In practice, the security threats when training these systems are often ignored when deploying the models in the real world. However, these threats pose significant risks to the overall reliability of the system. In this paper, we present the fundamentals of data poisoning attacks when training deep learning vision systems and discuss countermeasures against these types of attacks. In addition, we simulate the risk posed by a real-world data poisoning attack on a deep learning vision system and present a novel algorithm, MOVCE (Model Verification with Convolutional Neural Network and Word Embeddings), which provides an effective countermeasure for maintaining the reliability of the system. The countermeasure described in this paper can be used in a wide variety of use cases where the risks posed by poisoning the training data are similar.
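The mechanics of a training-time poisoning attack can be shown on a deliberately simple learner. The sketch below is not the paper's experiment or the MOVCE algorithm; it uses a toy nearest-centroid classifier and hand-made 2-D "features" to show how injecting mislabeled points into the training set degrades accuracy on clean data.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Classify each test point by distance to each class's training centroid."""
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

# Toy stand-in for image features: a 3x3 grid of points around each class centre
offsets = np.array([(dx, dy) for dx in (-1.0, 0.0, 1.0) for dy in (-1.0, 0.0, 1.0)])
X = np.vstack([offsets + (-2.0, -2.0), offsets + (2.0, 2.0)])
y = np.array([0] * 9 + [1] * 9)

clean_acc = (nearest_centroid_predict(X, y, X) == y).mean()

# Poisoning: inject far-away points mislabeled as class 0,
# dragging class 0's centroid into class 1's region
X_poison = np.vstack([X, np.full((18, 2), 6.0)])
y_poison = np.concatenate([y, np.zeros(18, dtype=int)])

poisoned_acc = (nearest_centroid_predict(X_poison, y_poison, X) == y).mean()
```

The poisoned centroid for class 0 moves past class 1's centroid, so most clean class-0 points are misclassified; a real CNN pipeline fails in subtler but analogous ways, which is what the verification countermeasure is meant to catch.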
- Book Chapter
28
- 10.1007/978-3-030-11821-1_12
- Jan 1, 2019
Deep learning (DL) methods have gained considerable attention since 2014. In this chapter we briefly review the state of the art in DL and then give several examples of applications from diverse areas of application. We will focus on convolutional neural networks (CNNs), which have since the seminal work of Krizhevsky et al. (ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25, pp. 1097–1105, 2012) revolutionized image classification and even started surpassing human performance on some benchmark data sets (Ciresan et al., Multi-column deep neural network for traffic sign classification, 2012a; He et al., Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. CoRR, Vol. 1502.01852, 2015a). While deep neural networks have become popular primarily for image classification tasks, they can also be successfully applied to other areas and problems with some local structure in the data. We will first present a classical application of CNNs on image-like data, in particular, phenotype classification of cells based on their morphology, and then extend the task to clustering voices based on their spectrograms. Next, we will describe DL applications to semantic segmentation of newspaper pages into their corresponding articles based on clues in the pixels, and outlier detection in a predictive maintenance setting. We conclude by giving advice on how to work with DL having limited resources (e.g., training data).
- Research Article
1
- 10.54254/2755-2721/81/20241009
- Nov 8, 2024
- Applied and Computational Engineering
This paper reviews the application and improvement of convolutional neural networks (CNNs) in image classification. First, a shallow CNN for interstitial lung disease image classification is presented; this model suppresses overfitting through a unique network architecture and optimisation algorithm. Next, the improved VGG16 architecture and the MIDNet18 model are discussed and their superior performance in brain tumour image classification is demonstrated. A CNN-CapsNet model for cervical cancer image classification and its improvement are then presented, and the customised model is compared with the conventional VGG-16 CNN architecture. The application of sparse convolutional kernels and hybrid sparse convolutional kernels (HDCs) to reducing computational resource consumption is presented next. Subsequently, methods for addressing limited training data through transfer learning and network data augmentation techniques are discussed, as well as GAN-generated datasets for mitigating overfitting. Finally, the effect of degraded images on the classification effectiveness of CNNs is explored. The results show that the improved CNN architectures and algorithms are effective in addressing overfitting and computational resource consumption and can significantly improve the accuracy and efficiency of image classification, and that degraded images do adversely affect the accuracy of CNN image classification.
- Research Article
2
- 10.3233/jcm-180871
- May 5, 2019
- Journal of Computational Methods in Sciences and Engineering
Image classification is an important research direction of computer vision. The convolutional neural network is a deep feedforward neural network model. It builds on the deep learning idea and shows good performance in many fields such as speech recognition, face recognition, motion analysis, and medical diagnosis, as well as in image classification. However, a single-structure convolutional neural network is prone to overfitting. The main reason for the overfitting problem is that the learning model overfits the training set, resulting in a lack of generalization performance that affects feature extraction and judgment on the test set. This paper presents a structural model for Multi-Column Heterogeneous Convolutional Neural Networks, which are then applied to image classification. We construct several convolutional neural networks with different structures by setting different convolution kernel sizes and different numbers of feature maps, so that image features are learned from multiple perspectives. Each convolutional neural network model is trained on the training set, and the different network models are fitted to the training set. Finally, through a sliding window, the outputs of the networks are fused to obtain a better prediction result. Experiments show that Multi-Column Heterogeneous Convolutional Neural Networks reduce the overfitting problem to a certain extent, and the accuracy of object recognition is improved compared to a single-structure convolutional neural network.
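The multi-column fusion step can be illustrated with a simple probability-averaging scheme. Note the paper fuses column outputs through a sliding window, so the plain averaging below is a simplified stand-in, and the logits for the three hypothetical columns are made up for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_columns(column_logits):
    """Average the class probabilities of several differently-structured
    networks (columns) and predict the argmax of the fused distribution."""
    probs = np.stack([softmax(l) for l in column_logits])  # (cols, samples, classes)
    fused = probs.mean(axis=0)
    return fused.argmax(axis=1)

# Hypothetical logits from three columns, for two samples over three classes
col_a = np.array([[2.0, 0.5, 0.1], [0.2, 0.1, 1.5]])
col_b = np.array([[1.5, 1.0, 0.0], [0.0, 2.0, 1.8]])
col_c = np.array([[0.5, 0.2, 0.1], [0.1, 0.3, 2.5]])

preds = fuse_columns([col_a, col_b, col_c])
```

Because each column overfits differently, averaging their distributions tends to cancel individual errors: in the second sample above, one column favours class 1, but the fused vote still lands on class 2.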