Enhancing Image Classification Performance Using a Multi-CNN Feature Fusion Method
This research aims to overcome general challenges in image pattern recognition using convolutional neural networks (CNNs), which still face the complexity and limitations of image data. Achieving high accuracy is essential because it significantly influences the effectiveness and success of numerous application areas. Although deep learning technology, especially CNNs, offers the potential to improve accuracy, results often remain limited to the 70–80% range, short of the expected level. In this research, a fusion method was developed that combines pre-trained models using concatenation techniques to increase accuracy. By utilizing pre-trained models such as ResNet50, VGG16, and MobileNetV2, adapted to various datasets with cross-validation techniques, the researchers achieved significant improvements in accuracy. The results of this study show an improvement in the accuracy of the Fusion Multi-CNN model across datasets: on Fashion-MNIST it achieved an accuracy of 0.87840, while on CIFAR-10 and Oxford-102 the accuracies were 0.81260 and 0.84004, respectively.
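The core of the described fusion method is concatenating feature vectors from several pre-trained backbones before classification. A minimal sketch of that concatenation step, using random arrays in place of real backbone outputs (the dimensions shown are the usual pooled feature sizes of ResNet50, VGG16, and MobileNetV2, assumed here for illustration):

```python
import numpy as np

# Stand-in feature vectors for a small batch of images; in practice these
# would come from the pooled penultimate layers of the pre-trained models.
rng = np.random.default_rng(0)
n_images = 4
f_resnet = rng.standard_normal((n_images, 2048))   # ResNet50 pooled features
f_vgg    = rng.standard_normal((n_images, 512))    # VGG16 pooled features
f_mobile = rng.standard_normal((n_images, 1280))   # MobileNetV2 pooled features

# Fusion by concatenation along the feature axis.
fused = np.concatenate([f_resnet, f_vgg, f_mobile], axis=1)
print(fused.shape)  # (4, 3840)
```

In practice the fused vector would feed a classification head trained on the target dataset.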
- Book Chapter
2
- 10.1007/978-981-13-8311-3_13
- Jul 17, 2019
Deep Convolutional Neural Networks (CNNs) as well as transfer learning using their pre-trained models often find applications in image classification tasks. In this paper, we explore the utilization of pre-trained CNNs for identifying images containing ladders. We target a particular use case, where an insurance firm, in order to decide the price for workers’ compensation insurance for its client companies, would like to assess the risk involved in their workplace environments. For this, the workplace images provided by the client companies can be utilized, and the presence of ladders in such images can be considered a workplace hazard and therefore an indicator of risk. To this end, we explore the utilization of pre-trained CNN models, VGG-16 and VGG-19, to extract features from images in a training set, which in turn are used to train a binary classifier (classifying an image as “ladder” or “no ladder”). The trained binary classifier can then be used for future predictions. Moreover, we explore the effect of including standard image augmentation techniques to enrich the training set. We also explore improving classification predictions by combining predictions generated by two individual binary classifiers that utilize features obtained from pre-trained VGG-16 and VGG-19 models. Our experimental results compare accuracies of classifiers that utilize features obtained using pre-trained VGG-16 and VGG-19 models. Furthermore, we analyze improvements in accuracies achieved on using image augmentation techniques as well as on combining predictions from VGG-16 and VGG-19 transfer learning based binary classifiers.
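One common way to combine the predictions of two binary classifiers, as explored above, is soft voting: average the predicted probabilities and threshold the result. A minimal sketch with made-up probabilities (the numbers and the 0.5 threshold are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Hypothetical per-image "ladder" probabilities from two binary classifiers
# trained on VGG-16 and VGG-19 features respectively.
p_vgg16 = np.array([0.9, 0.2, 0.6, 0.4])
p_vgg19 = np.array([0.8, 0.3, 0.4, 0.7])

# Soft voting: average the probabilities, then threshold at 0.5.
p_combined = (p_vgg16 + p_vgg19) / 2
labels = (p_combined >= 0.5).astype(int)  # 1 = ladder, 0 = no ladder
print(labels)  # [1 0 1 1]
```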
- Research Article
37
- 10.1108/jedt-04-2021-0192
- Aug 16, 2021
- Journal of Engineering, Design and Technology
Purpose: This paper aims to test the capabilities and accuracies of four deep learning pre-trained convolutional neural network (CNN) models in detecting and classifying types of highway cracks, as well as to develop a new CNN model that maximizes accuracy at different learning rates. Design/methodology/approach: A sample of 4,663 images of highway cracks was collected and classified into three categories of cracks, namely, “vertical cracks,” “horizontal and vertical cracks” and “diagonal cracks.” MATLAB was then used to split the sample into training (70%) and testing (30%) sets, apply the four deep learning CNN models, and compute their accuracies. A new deep learning CNN model was subsequently developed to maximize the accuracy of detecting and classifying highway cracks, and its accuracy was tested using three optimization algorithms at different learning rates. Findings: The accuracies of the four pre-trained deep learning models are above the averages between top-1 and top-5, and the accuracy of classifying and detecting the samples exceeded the top-5 accuracy of the pre-trained AlexNet model by around 3% and of the GoogleNet model by 0.2%. The more accurate pre-trained model is GoogleNet, whose accuracy of 89.08% is 1.26% higher than AlexNet’s. The computed accuracy of the newly created deep learning CNN model exceeded all pre-trained models, achieving 97.62% at a learning rate of 0.001 using the Adam optimization algorithm. Practical implications: The created deep learning CNN model will enable users (e.g. highway agencies) to scan a long highway and detect types of cracks accurately in a very short time compared to traditional approaches. Originality/value: A new deep learning CNN-based highway crack detection model was developed based on testing four pre-trained CNN models and analyzing the capabilities of each to maximize the accuracy of the proposed CNN.
- Research Article
5
- 10.1016/j.heliyon.2023.e22242
- Nov 1, 2023
- Heliyon
In order to integrate the concept of intangible cultural heritage (ICH) protection into the construction of smart cities, realize the organic integration of smart cities and cultural heritage, and improve the cultural experience of urban residents and tourists, this study explores an interactive design scheme for a smart-city application interface for ICH protection, to meet the needs of protection and inheritance. Firstly, the ICH of Chongqing is sorted out and classified, and the ICH-related APP interfaces on the market are analyzed through investigation. Secondly, an image recognition algorithm for ICH based on deep learning (DL) technology is proposed and applied in the APP to realize automatic recognition and introduction of ICH. Finally, a set of APP interface interaction design schemes is designed based on user habits and visual feelings to enhance user experience. The experimental results reveal: (1) The model for recognizing ICH images using the convolutional neural network (CNN) has higher recognition accuracy, recall, and F1 value than the model without CNNs; (2) After incorporating transfer learning (TL) into the model, the recognition accuracy, recall, and F1 value of the model further improved; (3) The survey results show that the Chongqing ICH APP interface system based on DL technology, user habits, and visual perception performs better in terms of user experience, usability, and other aspects. This study aims to design an APP interface system for the Chongqing ICH based on DL technology, user habits, and visual perception, to improve user experience and usability. Future research can further optimize the image recognition algorithms to improve the recognition accuracy and efficiency for ICH. Meanwhile, new technologies, such as virtual reality, can be combined to enhance users' interactive experience and immersion.
- Research Article
5
- 10.1016/j.ecoinf.2023.102363
- Nov 7, 2023
- Ecological Informatics
Convolutional neural networks (CNNs) have the potential to enable a revolution in bioacoustics, allowing robust detection and classification of marine sound sources. As global Passive Acoustic Monitoring (PAM) datasets continue to expand, it is critical we improve our confidence in the performance of models across different marine environments if we are to exploit the full ecological value of information within the data. This work demonstrates the transferability of developed CNN models to new acoustic environments by using a pre-trained model developed for one location (West of Scotland, UK) and deploying it in a distinctly different soundscape (Gulf of Mexico, USA). In this work, transfer learning is used to fine-tune an existing open-source ‘small-scale’ CNN, which detects odontocete tonal and broadband call types and vessel noise (operating between 0 and 48 kHz). The CNN is fine-tuned on training sets of differing sizes from the unseen site to understand the adaptability of a network to new marine acoustic environments. Fine-tuning with a small sample of site-specific data significantly improves the performance of the CNN in the new environment, across all classes. We demonstrate an improvement in area-under-curve (AUC) score of 0.30 across four classes by fine-tuning with only 50 spectrograms per class, with a 5% improvement in accuracy between 50 frames and 500 frames. This work shows that only a small amount of site-specific data is needed to retrain a CNN, enabling researchers to harness the power of existing pre-trained models for their own datasets. The marine bioacoustic domain will benefit from a larger pool of global data for training large deep learning models, but we illustrate in this work that domain adaptation can be improved with limited site-specific exemplars.
- Research Article
71
- 10.1016/j.compbiomed.2022.105383
- Mar 10, 2022
- Computers in Biology and Medicine
An automated diagnosis and classification of COVID-19 from chest CT images using a transfer learning-based convolutional neural network
- Conference Article
5
- 10.1109/irc55401.2022.00009
- Dec 1, 2022
In recent years, the technology behind Unmanned Aerial Vehicles (UAVs) has continually advanced. However, with these developments, malicious activities employing UAVs have also been on the rise. Within this study, Deep Learning (DL) algorithms are utilized to detect and classify UAVs transporting payloads based on the sound they emit. To train DL algorithms, a sufficient amount of audio data is necessary to obtain a reliable result, so UAV sound recordings were collected and data augmentation was used to secure a satisfactory sample size for testing purposes. Afterward, feature-based classification was applied to the groups of audio, identifying each UAV’s payload (or lack thereof). Lastly, a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), and a Convolutional Recurrent Neural Network (CRNN) are utilized in analyzing the final dataset. They are evaluated on their ability to correctly categorize the unloaded, one-payload, and two-payload UAV classes, as well as a noise class, solely through audio. As a result, Mel-frequency cepstral coefficient (MFCC) features showed the best performance for the CNN, RNN, and CRNN, with accuracies of 0.9493, 0.8133, and 0.9174, respectively. Our contribution in this study is that a cost-efficient data collection method was applied by utilizing laptop microphones. Moreover, DL technology was used for UAV payload detection, whereas prior work used a plain neural network. Also, the best feature for UAV payload detection with the three DL technologies was identified. The limitation of the paper is that only two UAV models and one kind of payload were used to collect data. Diverse UAVs and payloads are expected to be used to collect data in future work.
- Conference Article
1
- 10.1109/aipr.2018.8707371
- Oct 1, 2018
In recent years, Deep Convolutional Neural Networks (DCNNs) have gained a great deal of attention and won many competitions in machine learning, object detection, image classification, and pattern recognition. Breakthroughs in the development of graphical processing units have made it possible to train DCNNs quickly for state-of-the-art tasks such as image classification, speech recognition, and many others. However, to solve complex problems, these multilayered convolutional neural networks become increasingly large, complex, and abstract. We propose methods to improve the performance of neural networks while reducing their dimensionality, enabling a better understanding of the learning process. To leverage the extensive training, as well as the strengths of several pretrained models, we explored new approaches for combining features from the fully connected layers of models with heterogeneous architectures. The proposed approach combines features extracted from the penultimate fully connected layer of three different DCNNs. We merge the features of all three DCNNs together and apply principal component analysis or linear discriminant analysis. Our approach aims to reduce the dimensionality of the feature vector and find the smallest feature-vector dimension that can maintain classifier performance. For this task we use a linear Support Vector Machine as the classifier. We also investigate whether it is advantageous to fuse only penultimate fully connected layers, or to perform fusion based on other fully connected layers, using multiple homogeneous or heterogeneous networks. The results show that the fusion method outperformed the individual networks in terms of accuracy and computational time across all of our trial sizes. Overall, our fusion methods are faster and more accurate than individual networks in both training and testing. Finally, we compared heterogeneous with homogeneous fusion methods, and the results show that heterogeneous methods outperform homogeneous ones.
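The fusion-then-reduction pipeline described above can be sketched in a few lines: features from several networks are concatenated, then PCA (here computed via SVD) keeps the top components before a linear SVM would be trained. The array sizes and the choice of k = 64 are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for fused penultimate-layer features from three DCNNs,
# one row per image (random values used in place of real activations).
fused = rng.standard_normal((100, 3840))

# PCA via SVD: centre the data, decompose, keep the top-k components.
k = 64
centered = fused - fused.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:k].T   # lower-dimensional features for the linear SVM
print(reduced.shape)  # (100, 64)
```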
- Conference Article
21
- 10.1109/icramet51080.2020.9298575
- Nov 18, 2020
It is well-known that a large amount of data is required to train deep learning systems. However, data collection is very costly, if not impossible. To overcome the limited-data problem, one can use models that have been trained on a large dataset and apply them in a target domain with a limited dataset. In this paper, we use models pre-trained on ImageNet data and re-train them on our data to detect tea leaf diseases. These pre-trained models use deep convolutional neural network (DCNN) architectures: VGGNet, ResNet, and Xception. To mitigate the difference between the ImageNet task and ours, we apply fine-tuning to the pre-trained models by replacing some parts of them with new structures. We evaluate the performance using various re-training and fine-tuning schemes. The vanilla pre-trained model is used as the baseline, while the other techniques are evaluated against it: re-training only the appended structures, partially re-training the pre-trained models, and fully re-training the whole networks with the pre-trained models used for initialization. Our experiments show that applying transfer learning alone on our data may not be effective due to the difference between our task and ImageNet. Applying fine-tuning to pre-trained DCNN models is found to be effective: it is consistently better than using transfer learning alone or partial fine-tuning, and also better than training the model from scratch, i.e., without using pre-trained models.
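The difference between these schemes comes down to which weights receive gradient updates. A toy numpy sketch, where a single matrix stands in for the pre-trained body and a flag controls whether it is updated (all names, sizes, and values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
W_pretrained = rng.standard_normal((4, 4))  # stands in for pre-trained DCNN weights
grad = rng.standard_normal((4, 4))          # hypothetical gradient from the new data

def sgd_step(W, grad, lr, trainable):
    # Frozen layers keep their pre-trained weights; trainable ones update.
    return W - lr * grad if trainable else W

# "Transfer only": the pre-trained body is frozen, so it never changes.
W_frozen = sgd_step(W_pretrained, grad, 0.01, trainable=False)
# "Full re-training": pre-trained weights serve only as the initialization.
W_tuned = sgd_step(W_pretrained, grad, 0.01, trainable=True)

print(np.allclose(W_frozen, W_pretrained), np.allclose(W_tuned, W_pretrained))  # True False
```

Partial re-training sits between the two: early layers stay frozen while later layers and the appended head are updated.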
- Research Article
44
- 10.1002/cpe.6767
- Dec 13, 2021
- Concurrency and Computation: Practice and Experience
At present, in the age of computers and automation of services, deep learning (DL) technology, a subset of machine learning (ML) and artificial intelligence (AI), is used extensively in innumerable domains of computer vision such as data analysis, image recognition, classification, natural language processing, and many more. It has become the foremost choice of researchers because of its effectiveness in producing decent results. This paper presents a detailed and analytical literature review, from the very elementary level to the recent trends of this technology, focusing on the most used DL model, the convolutional neural network, and its pretrained models for image classification and object detection. It also reviews the diverse existing literature in this area. Further, a brief introduction to AI, ML, and DL is presented, laying the foundation for readers. As pretrained models continually give DL an edge over ML and other technologies, the 23 most popular pretrained models with their architectural diagrams are also presented. This paper aims to summarize and analyze all the concepts used to formulate DL and its models, with particular emphasis on the GoogleNet models and the entire family of Inception modules. Finally, fascinating applications and a discussion of the integral components of DL are presented. This paper will draw the attention of students and researchers working in the area of DL and its models.
- Research Article
211
- 10.1016/j.swevo.2019.100616
- Nov 15, 2019
- Swarm and Evolutionary Computation
An optimized model based on convolutional neural networks and orthogonal learning particle swarm optimization algorithm for plant diseases diagnosis
- Conference Article
- 10.1117/12.2605982
- Nov 24, 2021
Surface ship target recognition technology based on visual perception is an important research direction in the development of maritime unmanned systems, because it is the main technical means of ensuring that maritime unmanned systems, such as shipborne unmanned aerial vehicles or unmanned surface vehicles, complete their tasks reliably. In recent years, deep learning technology, especially the deep convolutional neural network, has performed well in image classification, target recognition, and other tasks; introducing it into the ship target recognition field will promote breakthroughs in ship target recognition technology. Many researchers have introduced deep convolutional neural networks into the field of ship target recognition and achieved good recognition results. However, due to the fixed-position sampling mode of the convolution operation and the limited receptive field range of the convolutional neural network, such networks generally extract only the feature information related to the target itself, ignoring the interaction information between different targets and between the target and the scene. They therefore adapt poorly to spatial geometric transformations of objects, which affects the recognition performance for ship targets of different scales and heading directions under occlusion. The human visual perception system can recognize targets quickly and accurately in the face of scale changes, brightness changes, shape changes, and occlusion, which largely depends on its inherent visual attention mechanism.
Aiming at the problem that the performance of the ship target recognition method based on the convolutional neural network is greatly reduced in the occlusion situation, a convolutional neural network model based on the biological visual attention mechanism was constructed, which can recognize the ship targets with different scales and different heading directions under occlusion quickly and effectively. The model used the residual module with dilated convolution to expand the receptive field of the high-level convolution kernels in the basic feature extraction module and integrate more contextual information into the high-level features. The visual attention module quickly extracted features which were highly related to the target and the current task, thus improving the efficiency and enhancing the model’s adaptability to the geometric transformation of the target space. The multi-scale feature fusion module enhanced the features’ comprehensive expression ability, improved the model’s adaptability to the target scale transformation, and reduced the calculation amount of target location and category prediction. The non-maximum suppression algorithm used the re-scoring mechanism to improve the accuracy of target location and category prediction. 
The recognition results obtained by the proposed method on a test set of ship targets with different scales and different heading directions under occlusion were compared with those of four mainstream convolutional neural network based methods. The comparison shows that the average recognition accuracy of the method based on the biological visual attention mechanism is 17.51% higher than that of the best of the four mainstream methods, reaching 87.69% and demonstrating strong robustness, while the recognition rate meets real-time requirements. These results show that the proposed method effectively addresses the poor adaptability to spatial geometric transformations and the loss of valid information caused by the fixed-position sampling mode of the convolution operation and the limited receptive field range.
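The receptive-field expansion from dilated convolution mentioned above follows a simple formula: a kernel of size k with dilation d covers k + (k - 1)(d - 1) input positions per axis. A short illustration (the kernel size and dilation rates are examples, not the paper's exact configuration):

```python
def effective_kernel(k, d):
    """Span of inputs covered by a k-tap convolution with dilation d."""
    return k + (k - 1) * (d - 1)

# A 3-tap kernel at increasing dilation rates covers 3, 5, then 9 positions
# per axis, widening the receptive field without adding parameters.
print(effective_kernel(3, 1), effective_kernel(3, 2), effective_kernel(3, 4))  # 3 5 9
```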
- Research Article
21
- 10.1109/access.2022.3192857
- Jan 1, 2023
- IEEE Access
Breast cancer is the second most deadly type of cancer globally among women and is preventable to a great extent in the case of early detection. Research scientists have conducted several experiments to develop tools to alleviate this problem in order to raise the survival rate, including Computer-Aided Diagnosis (CADx) systems. Deep Learning and its important sub-field, Convolutional Neural Networks (CNNs), have revolutionized CADx development research. While the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (the CBIS-DDSM dataset) has been classified using different pre-trained architectures, few studies have used ensemble learning to provide a more robust and accurate architecture. To the best of our knowledge, we are the first to integrate the state-of-the-art pre-trained EfficientNet model along with other pre-trained models, whose outputs were subsequently concatenated (ensembled). With the application of pre-trained CNN-based models, we are able to address the problem of not having a large dataset. Moreover, with the EfficientNet family offering better results with fewer parameters, we obtained significant improvement in accuracy, and ensemble learning was then applied to make the network more robust. After performing 10-fold cross-validation, our experiments yielded promising test accuracies of 96.05% and 85.71% for abnormality type and pathology diagnosis classification, respectively.
- Research Article
8
- 10.3390/rs14235986
- Nov 25, 2022
- Remote Sensing
Ship classification based on high-resolution synthetic aperture radar (SAR) imagery plays an increasingly important role in various maritime affairs, such as marine transportation management, maritime emergency rescue, marine pollution prevention and control, marine security situational awareness, and so on. The technology of deep learning, especially the convolutional neural network (CNN), has shown excellent performance on ship classification in SAR images. Nevertheless, it still has some limitations in real-world applications that need to be taken seriously by researchers. One is the insufficient number of SAR ship training samples, which limits the learning of a satisfactory CNN, and the other is the limited information that SAR images can provide (compared with natural images), which limits the extraction of discriminative features. To alleviate the limitation caused by insufficient training datasets, one widely adopted strategy is to pre-train CNNs on a generic dataset with massive labeled samples (such as ImageNet) and fine-tune the pre-trained network on the target dataset (i.e., a SAR dataset) with a small number of training samples. However, recent studies have shown that due to the different imaging mechanisms of SAR and natural images, it is hard to guarantee that the pre-trained CNNs (even if they perform extremely well on ImageNet) can be fine-tuned well by a SAR dataset. On the other hand, to extract the most discriminative ship representation features from SAR images, existing methods have carried out fruitful research on network architecture design, attention mechanism embedding, feature fusion, etc. Although these efforts improve the performance of SAR ship classification to some extent, they are usually based on more complex network architectures and higher-dimensional features, accompanied by greater time and storage expenses.
Through the analysis of SAR image characteristics and the CNN feature extraction mechanism, this study puts forward three hypotheses: (1) pre-training a CNN on a task-specific dataset may be more effective than on a generic dataset; (2) a shallow CNN may be more suitable for SAR image feature extraction than a deep one; and (3) the deep features extracted by CNNs can be further refined to improve their discrimination ability. To validate these hypotheses, we propose to learn a shallow CNN which is pre-trained on a task-specific dataset, i.e., an optical remote sensing ship dataset (ORS), instead of on the widely adopted ImageNet dataset. For comparison purposes, we designed 28 CNN architectures by changing the arrangement of the CNN components, the size of convolutional filters, and pooling formulations based on VGGNet models. To further reduce redundancy and improve the discrimination ability of the deep features, we propose to refine deep features by active convolutional filter selection based on the coefficient of variation (COV) sorting criterion. Extensive experiments not only prove that the above hypotheses are valid but also prove that the shallow network learned by the proposed pre-training strategy and the feature refining method can achieve ship classification performance in SAR images comparable to state-of-the-art (SOTA) methods.
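The feature refinement described above ranks convolutional filters by their coefficient of variation (std/mean of activations) and keeps only the most variable ones. A minimal numpy sketch with random stand-in activations (the array sizes and the choice of k are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in deep features: 100 images x 256 convolutional filters, each value
# the pooled activation of one filter (kept positive so the COV is well defined).
features = np.abs(rng.standard_normal((100, 256))) + 0.1

# Coefficient of variation (std / mean) per filter; keep the top-k filters
# as the refined, lower-dimensional feature representation.
cov = features.std(axis=0) / features.mean(axis=0)
k = 64
keep = np.argsort(cov)[::-1][:k]
refined = features[:, keep]
print(refined.shape)  # (100, 64)
```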
- Research Article
12
- 10.1016/j.ijpx.2022.100135
- Oct 18, 2022
- International Journal of Pharmaceutics: X
Classification of scanning electron microscope images of pharmaceutical excipients using deep convolutional neural networks with transfer learning
- Research Article
2
- 10.1118/1.4957862
- Jun 1, 2016
- Medical Physics
Purpose: To develop a deep convolutional neural network (DCNN)-based computer-aided diagnosis (CAD) system for detecting masses in digital mammographic images. Methods: A DCNN architecture consisting of 5 convolutional layers and 3 fully connected layers is constructed in this study. The DCNN parameters are trained by the following two procedures. We first train the DCNN on about 1.3 million natural images for classification into 1,000 categories. Then, we modify the last fully connected layer and subsequently train the modified DCNN on 1,656 mammographic region-of-interest (ROI) images for two-category classification: mass and normal. Results: The trained DCNN is tested on 198 mammographic ROI images, comprising 99 mass images and 99 normal images. The experimental results show that the sensitivity of mass detection is about 89.9% and the false-positive rate is 19.2%. These results demonstrate that the DCNN has potential for mammographic CAD. Conclusion: In recent years, the DCNN, one of the most successful techniques in deep learning, has made a remarkable impact on image recognition applications. For medical image recognition, however, its performance is uncertain because collecting a large amount of training image data for a particular medical image modality is difficult. In this study, our preliminary experiments demonstrated the feasibility of applying the DCNN in a mammographic CAD system. To the best of our knowledge, this study is also the first demonstration of a DCNN for detecting masses in mammographic images.
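The reported figures are consistent with simple counts on the 198-image test set. For instance, the hypothetical confusion counts below reproduce the stated sensitivity and false-positive rate (the exact counts are an assumption for illustration, not taken from the paper):

```python
# Hypothetical counts on the 198-image test set (99 mass, 99 normal).
tp, fn = 89, 10   # 89 of 99 mass images detected
fp, tn = 19, 80   # 19 of 99 normal images wrongly flagged as mass

sensitivity = tp / (tp + fn)               # true-positive rate on mass images
false_positive_rate = fp / (fp + tn)       # fraction of normals flagged
print(round(sensitivity, 3), round(false_positive_rate, 3))  # 0.899 0.192
```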