Deep Learning Model of Image Classification Using Machine Learning
Traditional artificial neural networks and machine learning methods struggled to meet the processing needs of massive image datasets in feature extraction and model training, and they showed low efficiency and low classification accuracy when applied to image classification. Therefore, this paper proposed a deep learning model of image classification, which aimed to provide foundation and support for image classification and recognition of large datasets. Firstly, based on an analysis of the basic theory of neural networks, this paper expounded the different types of convolutional neural network and the basic process of their application in image classification. Secondly, based on the existing convolutional neural network model, noise reduction and parameter adjustment were carried out in the feature extraction process, and a deep learning model for image classification was proposed based on the improved convolutional neural network structure. Finally, the structure of the deep learning model was optimized to improve the classification efficiency and accuracy of the model. To verify the effectiveness of the proposed deep learning model in image classification, the relationship between classification accuracy and the number of iterations was compared experimentally for several common network models. The results showed that the proposed model was better than the other models in classification accuracy. In addition, the classification accuracy of the deep learning model before and after optimization was compared and analyzed using the training set and test set. The results showed that the accuracy of image classification was greatly improved after the proposed model had been optimized.
- Conference Article
1
- 10.1109/icbats57792.2023.10111398
- Mar 7, 2023
Traditional artificial neural networks and machine learning methods have long been used to classify images, but they struggle with the processing demands of very large image datasets during feature extraction and model training, and as a result they run inefficiently and classify poorly. Standard artificial neural networks were also forced beyond their data storage limits when dealing with very large pictures. Deep learning significantly surpasses these more traditional techniques for image classification. The authors of the study therefore developed a deep learning model for image classification, intended as a tool to aid the identification and categorization of pictures on a large scale. Following a brief explanation of neural network theory, the study examined the various types of convolutional neural network and the general strategy for employing such networks for image classification. Starting from an existing convolutional neural network model, noise was reduced and the parameters used in the feature extraction operation were adjusted in order to improve the reliability of the results, and the model's architecture was built on the improved convolutional neural network. The structure was then modified so that it works as effectively as possible, and the resulting deep learning model significantly improved operational efficiency and classification accuracy. Experiments were carried out to evaluate how the number of training iterations of an image classification network model influences the accuracy the model obtains, in order to see how effectively the proposed deep learning model could categorize different types of images. The results demonstrate that the classification accuracy of the proposed model is significantly better than that of previous models. Before and during model tuning, the deep learning model's classification accuracy on the training set was compared with that on the test set in order to find any differences between the two. According to the findings, optimization is highly beneficial to the model's capacity to categorize images correctly, as shown by classifications becoming more accurate over time.
- Research Article
- 10.31449/inf.v49i16.7787
- Mar 11, 2025
- Informatica
The brain visual system is one of the core centers for human perception of external information. How to establish the brain visual cognitive system to classify and process image information is a key matter in the area of human-computer connection. In order to improve the accuracy of computer vision image classification, a fusion intelligent computing model based on deep convolutional neural network and brain visual cognition is proposed. This model simulates the visual processing mechanism of the human brain and uses brain computer interface technology to extract electroencephalogram signals, thereby achieving efficient classification and processing of image information. When designing an image classification model based on DCNN, a long short-term memory network structure is introduced to extract time series features of electroencephalogram signals. In order to enhance the classification accuracy of the model, attention mechanism and occlusion independent neural response methods are also applied to improve the accuracy of capturing the correlation information between brain response and image features. The results show that the prediction accuracy of the research model reaches 93.54% and 94.03% in the V4 visual region and L0 visual region, respectively. The highest accuracy on facial visual images reaches 95.46%, while the lowest accuracy on animal visual images is 91.57%. By introducing the long short-term memory module, the loss value of the model decreases from 0.26 to 0.21, with a reduction of 19.23%. In addition, ablation experiments show that by introducing attention mechanisms and occlusion independent neural responses, the final classification accuracy is improved to 93.94%. In summary, the research on the fusion intelligent computing model grounded on deep convolutional neural networks and brain visual cognition effectively improves the accuracy of image classification and demonstrated its potential in the field of intelligent computing.
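As a quick check of the figures quoted above, the relative reduction in loss follows directly from the two reported loss values:

```latex
\[
\frac{0.26 - 0.21}{0.26} = \frac{0.05}{0.26} \approx 0.1923 = 19.23\%
\]
```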
- Research Article
4
- 10.7176/ceis/13-1-03
- Jan 1, 2022
- Computer Engineering and Intelligent Systems
Image segmentation and image classification are two fundamental tasks in computer vision. In this thesis, a novel segmentation algorithm based on a deformable model and robust estimation is introduced to produce reliable segmentation results. The algorithm is extended to handle touching objects and partially occluded image segmentation. Although current conventional image classification methods have been widely applied to realistic problems, there are some issues with their implementation, including unsatisfactory results, poor classification accuracy, and a lack of adaptive capacity. These conventional methods separate image feature extraction and classification into two stages. The deep learning model possesses a strong learning capability, which enables it to integrate the feature extraction and classification processes, thereby improving the image classification accuracy. This thesis explores various machine learning methods to improve the model's performance. The primary objective is to determine the accuracy of the various networks on the datasets and to evaluate the consistency of each of these deep learning predictions. Nonetheless, there are limitations to this approach: first, it is difficult to perform accurate approximation in the advanced model. Second, the classifier that comes with the deep learning model has poor accuracy. So, this paper introduces the idea of using different datasets and models of the deep learning network and uses them together to determine the best test accuracy for the images. In this paper, a deep neural network primarily based on Keras and TensorFlow is implemented in Python. The two datasets are compared to determine which yields the highest accuracy and the best processing time. In addition, a VGG-16 model method based on the optimized kernel function is proposed to replace the classifier in the deep learning model. 
The experimental results show that the proposed method not only has higher average accuracy than other mainstream methods but can also be well adapted to various image databases. Compared with other deep learning methods, it can better solve the problems of complex function approximation and poor classifier effectiveness, thus further improving image classification accuracy. Keywords: Image Classification, Deep Learning, TensorFlow DOI: 10.7176/CEIS/13-1-03 Publication date: January 31st 2022
- Research Article
- 10.1504/ijaacs.2020.10032588
- Jan 1, 2020
- International Journal of Autonomous and Adaptive Communications Systems
Deep learning algorithm based on convolutional neural network has been widely used in the field of computer vision. A method based on deep convolution neural network is proposed for face recognition under low illumination. Firstly, the multi-scale retinex is used to enhance the face image in low-light imaging. Then the processed signal is input into the four-layer depth convolution neural network. The classification model is generated by the iterative training of the neural network. Finally, the input face image is classified based on the classification model. Multi-scale retinex utilises the principle of human eye perception of object brightness. Convolutional neural network can achieve better convergence rate and accuracy in classification and recognition of face images. Experiments on YaleB dataset show that the proposed algorithm and network model have better recognition performance.
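The multi-scale retinex enhancement step mentioned above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the three small scales, the equal weighting of scales, the `+ 1.0` offset inside the logarithm, and the edge-replicating blur are all choices made here for a toy grayscale image.

```python
import math
from typing import List, Sequence

Image = List[List[float]]


def gaussian_kernel(sigma: float) -> List[float]:
    """1-D Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(i * i) / (2.0 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    total = sum(kernel)
    return [v / total for v in kernel]


def blur(image: Image, sigma: float) -> Image:
    """Separable Gaussian blur; pixels outside the image repeat the nearest edge."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])

    def clamp(index: int, size: int) -> int:
        return min(max(index, 0), size - 1)

    # Horizontal pass.
    horizontal = [[sum(kernel[j + radius] * image[r][clamp(c + j, cols)]
                       for j in range(-radius, radius + 1))
                   for c in range(cols)] for r in range(rows)]
    # Vertical pass on the horizontally blurred image.
    return [[sum(kernel[j + radius] * horizontal[clamp(r + j, rows)][c]
                 for j in range(-radius, radius + 1))
             for c in range(cols)] for r in range(rows)]


def multi_scale_retinex(image: Image,
                        sigmas: Sequence[float] = (1.0, 2.0, 4.0)) -> Image:
    """Average over scales of log(image) - log(blurred image).

    The scales are illustrative values for a tiny image (an assumption).
    """
    rows, cols = len(image), len(image[0])
    output = [[0.0] * cols for _ in range(rows)]
    for sigma in sigmas:
        smooth = blur(image, sigma)
        for r in range(rows):
            for c in range(cols):
                # +1.0 avoids log(0) for black pixels (an assumption).
                output[r][c] += (math.log(image[r][c] + 1.0)
                                 - math.log(smooth[r][c] + 1.0)) / len(sigmas)
    return output


# Hypothetical dark 5x5 image with one brighter pixel in the centre.
spot = [[10.0] * 5 for _ in range(5)]
spot[2][2] = 50.0
enhanced = multi_scale_retinex(spot)
```

The output is a log-domain reflectance estimate; in practice it would be rescaled to the display range before being passed to the network.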
- Research Article
14
- 10.1007/s00500-020-04989-3
- May 1, 2020
- Soft Computing
Image classification has received extensive attention as an important technical means of acquiring image information, and it has been widely used in various engineering fields. Although the existing traditional image classification methods have been widely applied in practical problems, there are some problems in the application process, such as unsatisfactory effects, low classification accuracy and weak adaptive ability. This is because this type of method relies on the designer’s prior knowledge and cognitive understanding of the classification task. In addition, this type of method separates image feature extraction and classification into two steps. In contrast, the deep learning model has a powerful learning ability and integrates the feature extraction and classification processes into a whole to complete the image classification task, which can effectively improve the image classification accuracy. However, the image classification method based on deep learning also has the following problems in the application process: first, it is impossible to effectively approximate the complex functions in the deep learning model. Second, the classifier that comes with the deep learning model has low accuracy. To this end, this paper introduces the idea of sparse representation into the architecture of the deep learning network, comprehensively utilizing the good multidimensional linear decomposition ability of sparse representation and the deep structural advantages of multi-layer nonlinear mapping to complete the complex function approximation in the deep learning model. It constructs a deep learning model with adaptive approximation ability, which solves the function approximation problem of deep learning models. 
At the same time, in order to further improve the classification effect of the deep learning classifier, a sparse representation classification method based on the optimized kernel function is proposed to replace the classifier in the deep learning model, thereby improving the image classification effect. Based on the above explanation, this paper proposes an image classification algorithm based on the stacked sparse coding depth learning model-optimized kernel function nonnegative sparse representation. The experimental results show that the proposed method not only has a higher average accuracy than other mainstream methods, but also can be well adapted to various image databases. This is because the proposed method can extract more image feature information than the traditional image classification method and can better adaptively match the image information. Compared with other deep learning methods, it can better solve the problems of complex function approximation and poor classifier effect, thus further improving image classification accuracy.
- Research Article
9
- 10.22214/ijraset.2021.39280
- Dec 31, 2021
- International Journal for Research in Applied Science and Engineering Technology
Abstract: A brain tumor is an aggressive disease. It is estimated that more than 84,000 people will receive a primary brain tumor diagnosis in 2021 and that 18,600 people will die from a malignant brain tumor (brain cancer) in 2021.[8] The best technique to detect brain tumors is Magnetic Resonance Imaging (MRI). More than any other cancer, brain tumors can have lasting and life-altering physical, cognitive, and psychological impacts on a patient’s life, and hence a faster diagnosis and the best treatment plan should be devised to improve the life expectancy and well-being of these patients. Neural networks have shown very high accuracy in image classification and segmentation problems. In this paper, we propose comparative studies of various deep learning models based on different types of neural networks (ANN, CNN, TL) to first identify brain tumors and then classify them into Benign Tumor, Malignant Tumor or Pituitary Tumor. The data set used holds 3190 T1-weighted contrast-enhanced images, which were cleaned and augmented. The best ANN model reached an accuracy of 78%, and the best CNN model, consisting of 3 convolution layers, had an accuracy of 90%. The VGG16 model (retrained on the dataset) surpasses the other ANN, CNN, and TL models for multi-class tumor classification. This proposed network achieves significantly better performance, with a validation accuracy of 94% and an F1-Score of 91. Keywords: Artificial Neural Network (ANN), Convolution Neural Network (CNN), Transfer Learning (TL), Magnetic Resonance Imaging (MRI).
- Research Article
- 10.3897/aca.8.e151406
- May 28, 2025
- ARPHA Conference Abstracts
Introduction Phytoplankton are microscopic organisms that form the foundation of aquatic food webs. Accurate identification and classification of phytoplankton species are crucial for monitoring all aquatic ecosystems, from marine to freshwater, understanding ecological dynamics, and assessing environmental changes. Traditional methods of phytoplankton identification, which rely on manual microscopy, are time-consuming and require expert knowledge. Recent advancements in machine learning, particularly Convolutional Neural Networks (CNNs), offer promising solutions for automating this process. This abstract explores the application of pre-trained CNNs in recognizing phytoplankton species, highlighting their advantages, methodologies, and potential impacts. Methodology We present three approaches from a marine site, the Gulf of Venice site of the LTER-Italy network (DEIMS.ID https://deims.org/758087d7-231f-4f07-bd7e-6922e0c283fd), which includes the 'Acqua Alta' Oceanographic Tower (AAOT) (Fig. 1), the brackishwater site Utö Atmospheric and Marine Research Station (ResNet-18, located at 59°46.84’ N, 21°22.13’ E) https://en.ilmatieteenlaitos.fi/uto, and the freshwater site the IGB-LakeLab in Lake Stechlin NE Germany (DEIMS.ID https://deims.org/2223bc9c-12b2-49fe-af73-4299f553e054). Three different architectures of CNN were used: VGG16 for the Gulf of Venice, ResNet-18 for the Finnish station and a YOLOv11-cls for the German Lake Stechlin LakeLab station. These CNN models were pre-trained on the ImageNet dataset and subsequently fine-tuned with specific datasets for the respective geographic areas. These CNNs were chosen for their ability to autonomously extract features from images without external assistance, making them efficient, fast tools for analyzing large amounts of data and due to their specificity regarding the characteristics of the observational site. 
The process involves several steps: Data Collection and Preprocessing: several public datasets are available (Ciranni et al. 2024), where each image is annotated according to its class. Each model is structured to require input images in a specific format, so depending on the chosen model, it is necessary to preprocess the images accordingly. With an Imaging Flow Cytobot (IFCB, an in-situ automated submersible imaging flow cytometer that generates images of particles in-flow taken from the aquatic environment), the produced images are of good quality (Fig. 2), and the main modification applied is resizing the images to fit the model requirements; Transfer Learning: Transfer learning allows the weights of a pre-trained neural network to be retained and updated (only if specified) for specific tasks. It has been demonstrated that using pre-trained models leads to significant results, reducing both training time and the amount of data required compared to an untrained model (Maracani et al. 2023); Training and Validation: The modified CNN is trained on the annotated phytoplankton images. Techniques such as data augmentation (to increment the number of images), dropout, and batch normalization are employed to enhance model performance and prevent overfitting. The model's accuracy is validated using a separate dataset; Evaluation Metrics: Performance metrics, including accuracy, precision, recall, and F1-score, are used to evaluate the model. Confusion matrices and receiver operating characteristic (ROC) curves provide additional insights into the model's classification capabilities.
Results Studies have demonstrated that pre-trained CNNs can achieve high accuracy in phytoplankton classification. In our case, models like ResNet and VGG have shown classification accuracies exceeding 80% on diverse phytoplankton datasets (Fig. 3, Kraft et al. 2022). These models effectively distinguish between species with subtle morphological differences, which are often challenging for human experts. Discussion The use of pre-trained CNNs in phytoplankton recognition offers several advantages: Efficiency: Automated classification significantly reduces the time and effort required for phytoplankton identification compared to manual methods. 
Scalability: CNNs can handle large volumes of image data, making them suitable for Long Term Ecological Research. Consistency: Machine learning models provide consistent and objective classifications, minimizing human error and variability. However, challenges remain. The automatic taxonomic identification level is still not as detailed as that of human expertise. The quality and diversity of training data are critical for model performance. Inadequate or biased datasets can lead to poor generalization. Additionally, the interpretability of CNNs is limited, making it difficult to understand the decision-making process fully. Conclusion Pretrained CNNs represent a powerful tool and a pipeline for phytoplankton species recognition, offering significant improvements in efficiency, scalability, and consistency over traditional methods. Continued advancements in machine learning and the availability of high-quality datasets will further enhance the capabilities of these models. Future research should focus on addressing current limitations, such as data quality and model interpretability, to fully realize the potential of CNNs in marine science. In this work, we will present the results as discussed to demonstrate possible workflows to fully realize the potential of CNNs in marine science and potentially contribute to the Standard Observations (SOs) addressing current limitations. 
We will also bring a workflow proposal to manage and perform actions related to harmonization, interoperability, quality control and sharing of the data obtained through the CNN recognitions, following the directives proposed by Torstensson (2025).
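The evaluation metrics named in the abstract above (accuracy, precision, recall, F1-score, all derived from a confusion matrix) can be illustrated with a short sketch. The function name and the 3-class confusion matrix below are hypothetical and are not taken from the phytoplankton study.

```python
from typing import Dict, List


def per_class_metrics(confusion: List[List[int]]) -> Dict[str, object]:
    """Accuracy plus per-class precision, recall and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precision, recall, f1 = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = true_positive / predicted_k if predicted_k else 0.0
        r = true_positive / actual_k if actual_k else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "macro_f1": sum(f1) / n,
    }


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
example = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
metrics = per_class_metrics(example)
```

Reporting per-class values alongside overall accuracy is useful when some classes are rare, since a high accuracy can hide poor recall on under-represented classes.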
- Research Article
6
- 10.3390/e26100882
- Oct 21, 2024
- Entropy (Basel, Switzerland)
In extremely dark conditions, low-light imaging may offer spectators a rich visual experience, which is important for both military and civic applications. However, the images taken in ultra-micro light environments usually have inherent defects such as extremely low brightness and contrast, a high noise level, and serious loss of scene details and colors, which leads to great challenges in the research of low-light image and object detection and classification. The low-light night vision image used as the study object in this work has an excessively dim overall picture and very little information about the screen's features. Three algorithms, HE, AHE, and CLAHE, were used to enhance and highlight the image. The effectiveness of these image enhancement methods is evaluated using metrics such as the peak signal-to-noise ratio and mean square error, and CLAHE was selected after comparison. The target image includes vehicles, people, license plates, and objects. The gray-level co-occurrence matrix (GLCM) was used to extract the texture features of the enhanced images, and the extracted image texture features were used as input to construct a backpropagation (BP) neural network classification model. Then, low-light image classification models were developed based on VGG16 and ResNet50 convolutional neural networks combined with low-light image enhancement algorithms. The experimental results show that the overall classification accuracy of the VGG16 convolutional neural network model is 92.1%. Compared with the BP and ResNet50 neural network models, the classification accuracy was increased by 4.5% and 2.3%, respectively, demonstrating its effectiveness in classifying low-light night vision targets.
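The gray-level co-occurrence matrix (GLCM) step described above can be sketched as follows. The abstract does not say which texture features or pixel offsets were used, so the horizontal offset, the three features (contrast, angular second moment, and one common homogeneity variant), the function names, and the 4-level toy image are all assumptions made for illustration.

```python
from typing import Dict, List


def glcm(image: List[List[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p: List[List[float]]) -> Dict[str, float]:
    """A few texture descriptors computed from a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "asm": sum(p[i][j] ** 2 for i, j in cells),  # angular second moment
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
    }


# Hypothetical 4x4 image quantized to 4 gray levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
features = texture_features(p)
```

A feature vector like this, possibly concatenated over several offsets, is the kind of input a BP classifier would receive.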
- Conference Article
16
- 10.1109/picc51425.2020.9362375
- Dec 17, 2020
Image classification is the task of assigning an input image a label from a fixed set of labels. It is one of the main problems in computer vision and has many practical applications. For any classification problem, the main aim is to achieve better classification accuracy. If the classification accuracy is low, misclassification happens, and this leads to different kinds of problems. Many classification models only consider the existing class instances. When an instance of a new class arrives, the classification model does not detect it properly and instead misclassifies it as an instance of an existing class. The proposed method therefore provides a more accurate classification and new class detection model for images. If needed, the new class can also be added to the model so that it is classified correctly in the future. Recent studies show that Convolutional Neural Networks (CNNs) can be used effectively for image classification tasks, so the proposed classification and new class detection model is based on a CNN. The detection of a new class is done by looking into the trend of the softmax prediction score of class labels. In this work, the model is built for the CIFAR10 image dataset. This dataset is a complex one, so a model created for it can be considered a base and extended to classification and new class detection for other images in different applications.
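One simple way to act on softmax prediction scores is to flag an input as a possible new class when no existing class receives a confident score. The sketch below illustrates that idea only; the abstract does not give the authors' exact rule, so the fixed threshold of 0.6, the function names, and the example logits are assumptions.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_flag(logits: List[float],
                     threshold: float = 0.6) -> Optional[int]:
    """Return the predicted class index, or None for a possible new class.

    The threshold is a hypothetical value that would need tuning on
    held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None


# Hypothetical network outputs for two images.
confident = classify_or_flag([4.0, 0.5, 0.2])  # one class clearly dominates
uncertain = classify_or_flag([1.0, 0.9, 1.1])  # scores spread across classes
```

Inputs returned as `None` could be collected and, once labeled, used to extend the model with the new class.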
- Research Article
2
- 10.54254/2755-2721/81/20241009
- Nov 8, 2024
- Applied and Computational Engineering
Abstract. This paper reviews the application and improvement of convolutional neural networks (CNNs) in image classification. Firstly, a shallow CNN for interstitial lung disease image classification is presented. This model suppresses overfitting through a unique network architecture and optimisation algorithm. Next, the improved VGG16 architecture and MIDNet18 model are discussed and their superior performance in brain tumour image classification is demonstrated. Subsequently, a CNN-CapsNet model for cervical cancer image classification and its improvement are presented and the customised model is compared with the conventional VGG-16 CNN architecture in the paper. Next, the application of sparse convolutional kernels and hybrid sparse convolutional kernels (HDCs) in solving the problem of computational resource consumption is presented. Subsequently, methods for solving the problem of limited training data through transfer learning and network data augmentation techniques are discussed, as well as GAN-generated datasets for solving the overfitting problem. Finally, the effect of degraded images on the classification effectiveness of CNNs is explored. The results show that the improved CNN architecture and algorithms have significant effects in solving the problems of overfitting and computational resource consumption, and can significantly improve the accuracy and efficiency of image classification. And degraded images do adversely affect the accuracy of CNN for image classification.
- Research Article
7
- 10.1016/j.aca.2023.341758
- Aug 28, 2023
- Analytica Chimica Acta
RamanCMP: A Raman spectral classification acceleration method based on lightweight model and model compression techniques
- Research Article
89
- 10.1155/2020/7607612
- Jan 31, 2020
- Scientific Programming
Although the existing traditional image classification methods have been widely applied in practical problems, there are some problems in the application process, such as unsatisfactory effects, low classification accuracy, and weak adaptive ability. These methods separate image feature extraction and classification into two steps. The deep learning model has a powerful learning ability and integrates the feature extraction and classification processes into a whole to complete the image classification task, which can effectively improve the image classification accuracy. However, this approach has the following problems in the application process: first, it is impossible to effectively approximate the complex functions in the deep learning model. Second, the classifier that comes with the deep learning model has low accuracy. So, this paper introduces the idea of sparse representation into the architecture of the deep learning network and comprehensively utilizes the good multidimensional linear decomposition ability of sparse representation and the deep structural advantages of multilayer nonlinear mapping to complete the complex function approximation in the deep learning model. In addition, a sparse representation classification method based on the optimized kernel function is proposed to replace the classifier in the deep learning model, thereby improving the image classification effect. Therefore, this paper proposes an image classification algorithm based on the stacked sparse coding depth learning model-optimized kernel function nonnegative sparse representation. The experimental results show that the proposed method not only has a higher average accuracy than other mainstream methods but can also be well adapted to various image databases. Compared with other deep learning methods, it can better solve the problems of complex function approximation and poor classifier effect, thus further improving image classification accuracy.
- Research Article
110
- 10.1016/j.eswa.2023.122159
- Oct 18, 2023
- Expert Systems with Applications
Development of hybrid models based on deep learning and optimized machine learning algorithms for brain tumor Multi-Classification
- Research Article
- 10.54254/2755-2721/5/20230612
- May 31, 2023
- Applied and Computational Engineering
In natural language processing, text classification is one of the classical and important research problems. Recently, the deep learning model has increasingly become one of the main methods to solve text classification problems. Common deep learning text classification models are convolutional neural networks (CNN), recurrent neural networks (RNN), and the BERT model. To compare the performance of various deep learning models on text classification tasks side by side, the thesis tests the classification accuracy of different deep learning models under the same experimental configuration. The experimental results show that using pre-trained word vectors helps to improve the classification accuracy of deep learning models: the text classification model using pre-trained word vectors gains higher accuracy than the model without them. In addition, the reasonable design of more complex and larger deep learning models helps to enhance the learning capability of the model on text data. In the comparison experiment of the feedforward neural network (FNN), CNN, RNN and BERT models, the BERT model performs best, and its text classification accuracy reaches 0.9232. Compared with a 1-layer FNN, the accuracy rate is increased by about 16%.
- Conference Article
2
- 10.1109/iccect60629.2024.10545770
- Apr 26, 2024
In the task of music genre classification, feature extraction and classifier modeling are the two key parts that directly affect the classification accuracy. In the traditional classification method, the feature extraction and classification processes are designed separately. First, the features are extracted manually from the original music signal, and then a suitable classifier is selected to build a model and classify the extracted features. Although the traditional methods have achieved good results in many classification tasks, the feature extraction process is complex and difficult to implement, the features required for different classification tasks need to be specially designed, and the extracted features lack universality. With the successful application and continuous development of deep learning models in other fields, more and more studies have begun to use music spectra as the input of deep learning models to classify music genres. However, so far, the accuracy of existing classification methods based on deep learning is not ideal, so this paper mainly studies a classification method based on deep learning to improve the classification accuracy of the music genre classification model. In this paper, a parallel structured deep attention classification model is proposed. By training a BRNN, it can automatically learn music features from samples. The linear attention model calculates the attention probability distribution over these learned features and reassigns it to the feature representation. Finally, the classification is realized according to the feature vectors with different weights. Besides the linear attention model with simple structure, a CNN attention model with stronger learning ability is also designed. In order to verify the feasibility and validity of the model, validation experiments were conducted on two standard datasets, GTZAN and Extended Ballroom. 
The experimental results show that the classification model based on deep parallel attention mechanism has good classification robustness, and the accuracy of classification based on BRNN and parallel CNN attention model reaches 92.7% on Extended Ballroom data set, which is better than the existing classification method based on deep learning. The validity and feasibility of the classification model are proved.
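The linear attention step described in this abstract (compute an attention probability distribution over the learned features, then reweight the feature representation) can be sketched as below. This is a generic attention-pooling illustration rather than the paper's architecture: the scoring vector `w`, the function names, and the toy feature sequence standing in for BRNN outputs are all hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: List[List[float]],
                   w: List[float]) -> Tuple[List[float], List[float]]:
    """Linear attention over a sequence of feature vectors.

    features: T time steps, each a D-dimensional vector (for example the
              outputs of a recurrent network; here just toy numbers).
    w:        D-dimensional scoring vector; learned in a real model,
              fixed by hand here (an assumption).
    Returns the attention weights and the weighted sum of the features.
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, features))
              for d in range(dim)]
    return weights, pooled


# Hypothetical sequence of three 2-dimensional feature vectors.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [1.0, 1.0]
weights, pooled = attention_pool(features, w)
```

The pooled vector would then be passed to a classifier layer; time steps with higher scores contribute more to it.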