Automatic inventory of archaeological artifacts based on object detection and classification using deep and transfer learning
- Research Article
- 10.57197/jdr-2023-0052
- Jan 1, 2023
- Journal of Disability Research
Object detection and classification systems can help visually challenged persons perceive and understand their environments. Such systems use computer vision methods to detect and classify objects in real time. Deep learning (DL) can be adopted for these tasks, giving visually challenged individuals real-time information about their surroundings, which facilitates navigation and supports their independence. With this motivation, the study presents a novel Stochastic Gradient Descent with Deep Learning-assisted Object Detection and Classification (SGDDL-ODC) technique for visually challenged people. The SGDDL-ODC technique aims at accurate, automated detection of objects and focuses on optimal hyperparameter tuning of the DL models. It uses the YOLOv6 model for object detection, applies SGD to adjust the hyperparameter values of YOLOv6, and, at the final stage, uses a deep neural network to classify the recognized objects. A series of simulations were performed to validate the performance of the SGDDL-ODC approach. The simulation results illustrate the superior efficiency of the SGDDL-ODC technique over other techniques on diverse datasets in terms of different measures.
- Research Article
1
- 10.54216/fpa.170211
- Jan 1, 2025
- Fusion: Practice and Applications
Remote sensing (RS) object detection is extensively applied in civilian and military fields. An important role of remote sensing is to identify objects such as planes, ships, harbours, and airports, and to obtain their position and class. RS images are of considerable importance for observing densely arranged, directional objects such as ships in harbours and cars in parking areas. The object detection (OD) process involves object localization and classification. Because of their wide coverage and long shooting distance, remote sensing images (RSIs) contain hundreds of small objects and dense scenes. Deep learning (DL), in particular the convolutional neural network (CNN), has revolutionized OD in different fields. CNNs are designed to automatically learn hierarchical representations of data, which makes them well suited to feature extraction. Hence, the study proposes a new white shark optimizer with DL-based object detection and classification on RSI (WSODL-ODCRSI) method. The purpose of the WSODL-ODCRSI model is to detect and classify objects in the RSI. To accomplish this, the WSODL-ODCRSI model uses a modified single-shot multi-box detector (MSSD) for the OD process. The next stage is object classification, which is performed with the Elman Neural Network (ENN) algorithm. The WSO algorithm is used as a parameter-tuning model to improve the object classification results of the ENN approach. The simulation study of the WSODL-ODCRSI algorithm was carried out on a benchmark data set, and the outcomes underlined the promising performance of the WSODL-ODCRSI model on object classification.
- Book Chapter
9
- 10.1007/978-3-319-91635-4_1
- Jan 1, 2018
At construction sites and disaster areas, engineers take an enormous number of digital photographs. Collecting, sorting, annotating, storing, deleting, and distributing these images manually is cumbersome, error-prone, and time-consuming. It is therefore desirable to automate object detection in these pictures so that engineers save valuable time and improve efficiency and accuracy. Although conventional machine learning could be a solution, it takes researchers much time to determine the features and contents of digital images, and the accuracy tends to be unsatisfactory. Deep learning, on the other hand, can automatically determine the features and contents of various objects from digital images. This research therefore aims to automatically detect each object and its position in digital images by using deep learning. Since deep learning usually requires a very large dataset, this research adopts deep learning with transfer learning, which enables object detection even when the dataset is not very large. Experiments were executed to detect construction machines, workers, and signboards in photographs, comparing conventional machine learning based on feature values with deep learning with and without transfer learning. The results showed that deep learning with transfer learning achieved the best performance.
- Research Article
29
- 10.1007/s11042-021-10833-z
- Apr 5, 2021
- Multimedia Tools and Applications
Object detection is a basic part of remote sensing image processing. At present it is most commonly approached with deep learning; however, the limited volume of remote sensing images has become a constraint. To address the small-sample problem of remote sensing images, this research combines transfer learning with deep learning. First, detection problems caused by insufficient data, such as over-fitting, are addressed by model-based transfer learning: the model structure and parameters obtained from natural images are transferred to the detection task in the remote sensing target domain. In addition, detection usually assumes that the training data and the testing data follow the same distribution, but this is not the case in practice. Therefore, improving the robustness of trained models and widening their scope of application should be taken into consideration. In this research, a Domain Adaptation Faster R-CNN (DA Faster R-CNN) algorithm is proposed for detecting aircraft in remote sensing images. Two domain adaptation structures are designed and selected as the criterion of similarity measurement between domains. Adversarial training is applied to alleviate the domain shift. Finally, the effectiveness of the algorithm is verified in a low-brightness experiment. The DA Faster R-CNN detection algorithm improves on the accuracy of the original algorithm for low-quality images. It is worth noting that DA Faster R-CNN is an unsupervised transfer learning method for remote sensing object detection.
- Research Article
2
- 10.21271/zjpas.34.2.3
- Apr 12, 2022
- ZANCO JOURNAL OF PURE AND APPLIED SCIENCES
Comprehensive Study for Breast Cancer Using Deep Learning and Traditional Machine Learning
- Book Chapter
10
- 10.1007/978-981-15-9735-0_2
- Jan 1, 2021
A wide spectrum of deep learning (DL) architectures is available for medical image analysis. Among these, convolutional neural networks (CNNs) have been found to be more efficient for a variety of medical imaging tasks, including segmentation, object detection, disease classification, and severity grading. In medical image analysis, accuracy of prediction is of utmost importance. In machine learning and deep learning, the quantity and quality of the medical image dataset play an important role in ensuring the accuracy of future predictions; with too few or poor-quality images, machine or deep learning models fail to predict accurately. This limitation is removed to a major extent by transfer learning, which makes pre-trained models available for customization to specific application needs. Pre-trained models are either fine-tuned on the underlying data or used as feature extractors. Because these pre-trained models are already trained on large datasets, an accurate set of generic features can be extracted to improve overall performance and computational complexity. Transfer learning thus largely removes the requirement for a large dataset and also reduces the training cost in terms of the number of parameters to be learned, training time, and hardware computing cost. Plenty of pre-trained models are available, including AlexNet, LeNet, MobileNet, and GoogleNet. Currently, many researchers are applying DL to obtain promising results in a wide variety of medical image analysis for almost all diseases, including all types of cancers, pathological diseases, and orthopedic diseases. The chapter covers an introduction to deep learning, transfer learning, different award-winning architectures for transfer learning, and different resources for medical imaging research. This is followed by a brief case study of the use of transfer learning for malaria diagnosis. The chapter also highlights future research directions in the domain of medical image analysis.
Keywords: Medical image analysis, Deep learning, Transfer learning
- Book Chapter
12
- 10.1007/978-3-030-03131-2_4
- Jan 1, 2019
Object detection and classification have seen a large amount of transformation and research following advances in machine learning algorithms. Advances in computing power and data availability are complementing this transformation. In recent times, research in object detection has been dominated by a special type of neural network called the Convolutional Neural Network (CNN). An object detection system has to localize objects in an image and accurately classify them. A CNN is well suited to this task because it can accurately find features such as edges, corners, and the more advanced features needed to detect objects. This chapter provides a detailed overview of how a CNN works and how it is useful in object detection and classification tasks. Popular CNN-based deep networks such as ResNet, VGG16, VGG19, GoogleNet, and MobileNet are then explained in detail. These networks worked well for object classification but needed a sliding window technique to localize objects in an image, which was slow because many windows had to be processed for a single image. This led to more advanced CNN-based object detection algorithms such as Convolutional Neural Network with Region proposals (R-CNN), fast R-CNN, faster R-CNN, Single shot multi-box detector (SSD), and You Only Look Once (YOLO). This chapter provides a detailed explanation of how these algorithms work and a comparison between them. Most deep learning algorithms require a large amount of data and dedicated hardware such as GPUs to train. To overcome this, the concept of transfer learning was introduced, in which pre-trained models of popular CNN architectures are used to solve new problems. The last part of the chapter explains this concept and when it is useful.
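To give a sense of why the sliding window technique described in this abstract is slow, the following minimal Python sketch counts how many windows a classifier would have to process for a single image. The function names, image size, window sizes, and stride are hypothetical values chosen for illustration; none of them come from the chapter.

```python
def count_windows(image_w, image_h, window, stride):
    """Count the positions of one square window size slid over an image."""
    if window > image_w or window > image_h:
        return 0  # window does not fit in the image at all
    cols = (image_w - window) // stride + 1
    rows = (image_h - window) // stride + 1
    return cols * rows


def total_windows(image_w, image_h, window_sizes, stride):
    """Sum the window positions over several window sizes (scales)."""
    return sum(count_windows(image_w, image_h, w, stride) for w in window_sizes)


# Hypothetical setup: a 640x480 image, three window sizes, stride of 16 pixels.
n = total_windows(640, 480, window_sizes=(64, 128, 256), stride=16)
print(n)  # each of these windows would need its own classifier pass
```

Even with this coarse stride the classifier runs a couple of thousand times per image, which is the cost that region-proposal and single-shot detectors were designed to avoid.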
- Research Article
21
- 10.1016/j.aca.2022.339668
- Mar 1, 2022
- Analytica Chimica Acta
Deep learning (DL), although popular in computer vision applications, is still at an early stage in the chemometric domain for spectral image processing. Often the challenge is that analytical laboratory experiments yield too few samples to perform DL. In this study, we present a novel combination of DL and chemometrics to process spectral images even with fewer than 100 spectral images. We treated the image processing part, such as object detection and recognition, as the DL task and the prediction of chemical properties as the chemometric task based on latent space modelling. For the image processing tasks of object detection and recognition, transfer learning was performed on the pretrained YOLOv4 object detection network weights to adapt the model to spectral images captured in laboratory settings. Once the object is identified with DL, a background query is performed on the pre-built chemometric models to select the model for predicting the properties of the specific object. The results showed good potential for using DL and chemometric approaches in conjunction to reap the best of both scientific domains. This approach is of high interest to anyone involved in spectral imaging and dealing with object detection and physicochemical property prediction of samples with chemometric approaches.
- Conference Article
4
- 10.1109/icacc-202152719.2021.9708373
- Oct 21, 2021
Advancements in machine learning and deep learning offer the opportunity to tailor solutions to crucial problems in almost any domain. Object detection in underwater sonar is evolving, and deep learning provides reliable techniques. In our experiments, we approached sonar object classification with transfer learning and an ensemble approach, which produced better results than single machine learning and deep learning algorithms for the task. The preliminary feature extraction step preserves complex and significant structures from the image data and improves classification performance. The experimental model also overcomes the scarcity of training data by using a predefined model, ResNet50. Optimized classification results were achieved with ensemble classifiers for the sonar objects.
- Research Article
16
- 10.32604/cmc.2022.024431
- Jan 1, 2022
- Computers, Materials & Continua
The Smart City concept revolves around gathering real-time data from citizens, personal vehicles, public transport, buildings, and other urban infrastructure such as the power grid and waste disposal systems. The insights obtained from the data can help municipal authorities handle assets and services effectively. At the same time, the massive increase in environmental pollution and degradation, which leads to ecological imbalance, is a hot research topic. In addition, the progressive development of smart cities around the globe requires intelligent waste management systems that properly categorize waste according to its biodegradability. Commonly encountered wastes include paper, paper boxes, food, and glass. For classifying waste objects, computer vision based solutions are a cost-effective way to separate waste from a huge dump of garbage and trash. Owing to recent developments in deep learning (DL) and deep reinforcement learning (DRL), waste object classification becomes possible through the identification and detection of wastes. In this aspect, this paper designs an intelligent DRL based recycling waste object detection and classification (IDRL-RWODC) model for smart cities. The goal of the IDRL-RWODC technique is to detect and classify waste objects using DL and DRL techniques. The IDRL-RWODC technique encompasses a two-stage process, namely Mask Regional Convolutional Neural Network (Mask RCNN) based object detection and DRL based object classification. In addition, the DenseNet model is applied as a baseline model for the Mask RCNN model, and a deep Q-learning network (DQLN) is employed as a classifier. Moreover, a dragonfly algorithm (DFA) based hyperparameter optimizer is derived for improving the efficiency of the DenseNet model.
To verify the waste classification performance of the IDRL-RWODC technique, a series of simulations was carried out on a benchmark dataset, and the experimental results showed better performance than recent techniques, with a maximal accuracy of 0.993.
- Research Article
132
- 10.1016/j.eswa.2023.122807
- Dec 2, 2023
- Expert Systems with Applications
Deep learning has emerged as a powerful tool in various domains, revolutionising machine learning research. However, one persistent challenge is the scarcity of labelled training data, which hampers the performance and generalisation of deep learning models. To address this limitation, researchers have developed innovative methods to overcome data scarcity and enhance deep model learning capabilities. Two prevalent techniques that have gained significant attention are transfer learning and self-supervised learning. Transfer learning leverages knowledge learned from pre-training on a large-scale dataset, such as ImageNet, and applies it to a target task with limited labelled data. This approach allows models to benefit from the learned representations and effectively transfer knowledge to new tasks, resulting in improved learning performance and generalisation. On the other hand, self-supervised learning focuses on training models using pretext tasks that do not require manual annotation, allowing them to learn valuable representations from large amounts of unlabelled data. These learned representations can then be fine-tuned for downstream tasks, mitigating the need for extensive labelled data. In recent years, transfer and self-supervised learning have found applications in various fields, including medical image processing, video recognition, and natural language processing. These approaches have demonstrated remarkable achievements, enabling breakthroughs in areas such as disease diagnosis, object recognition, and language understanding. However, while these methods offer numerous advantages, they also have limitations. For example, transfer learning may face domain mismatch issues between the pre-training and target domains, while self-supervised learning requires careful design of pretext tasks to ensure meaningful representations. This review paper explores the recent applications of these pre-training methods in various fields within the past three years. 
It delves into the advantages and limitations of each approach, assesses the performance of models employing these techniques, and identifies potential directions for future research. By providing a comprehensive review of current pre-training methods, this article offers guidance for selecting the best technique for specific deep learning applications to address the data scarcity issue.
- Research Article
10
- 10.3390/diagnostics13122110
- Jun 19, 2023
- Diagnostics
Transfer learning has gained importance in areas where there is a labeled data shortage. However, it is still controversial as to what extent natural image datasets as pre-training sources contribute scientifically to success in different fields, such as medical imaging. In this study, the effect of transfer learning for medical object detection was quantitatively compared using natural and medical image datasets. Within the scope of this study, transfer learning strategies based on five different weight initialization methods were discussed. A natural image dataset MS COCO and brain tumor dataset BraTS 2020 were used as the transfer learning source, and Gazi Brains 2020 was used for the target. Mask R-CNN was adopted as a deep learning architecture for its capability to effectively handle both object detection and segmentation tasks. The experimental results show that transfer learning from the medical image dataset was found to be 10% more successful and showed 24% better convergence performance than the MS COCO pre-trained model, although it contains fewer data. While the effect of data augmentation on the natural image pre-trained model was 5%, the same domain pre-trained model was measured as 2%. According to the most widely used object detection metric, transfer learning strategies using MS COCO weights and random weights showed the same object detection performance as data augmentation. The performance of the most effective strategies identified in the Mask R-CNN model was also tested with YOLOv8. Results showed that even if the amount of data is less than the natural dataset, in-domain transfer learning is more efficient than cross-domain transfer learning. Moreover, this study demonstrates the first use of the Gazi Brains 2020 dataset, which was generated to address the lack of labeled and qualified brain MRI data in the medical field for in-domain transfer learning. 
Thus, knowledge transfer was carried out from the deep neural network, which was trained with brain tumor data and tested on a different brain tumor dataset.
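The abstract above does not name the detection metric it refers to. As background, detection metrics such as mean average precision are commonly built on the intersection over union (IoU) between a predicted box and a ground-truth box. The sketch below is a minimal Python illustration of IoU under the assumption that boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function name and the example boxes are hypothetical and do not come from the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground-truth boxes that overlap in one corner.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, for example 0.5.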
- Research Article
7
- 10.1007/s12145-020-00486-1
- Jul 14, 2020
- Earth Science Informatics
A large amount of active research has focused on developing remote sensing applications for object classification, in which a remote sensor collects data on the energy reflected from the earth's surface and all kinds of spatial information are managed and analysed with enhanced accuracy. This paper proposes a real-time image processing method for object classification and detection (RTIP-ODC) in remote sensing images. The enhanced feature extraction procedure, comprising preprocessing, object detection, classification, and validation, improves the efficiency of the proposed technique. The classification method helps the user maintain the object classification process and improves the accuracy of object detection. The proposed technique achieved better object classification performance than the related techniques.
- Research Article
- 10.54254/2755-2721/13/20230709
- Oct 23, 2023
- Applied and Computational Engineering
Crop disease detection is an important factor in agricultural production. Traditional object detection methods cannot effectively screen key features, resulting in weak crop disease control in many countries. In recent years, several convolutional neural networks for object detection have been proposed, which makes it possible to apply computer vision to crop disease identification through deep learning. YOLOv5 is an advanced object detection network that can extract key features and uses the human visual attention mechanism as a reference. This paper evaluates the performance of four pre-trained YOLOv5 models in object detection of crop diseases, using transfer learning to train on the corresponding dataset. The experimental results show that the F1 values of the four models all exceeded 0.93, with YOLOv5x achieving the best result of 0.963. Furthermore, the detection accuracy of all four models exceeded 98%. This shows that the YOLOv5 series of network models has great application prospects in the identification of crop diseases. In the near future, the object detection model could be applied to various mobile devices, even unmanned aerial vehicles, which would play a significant role in crop disease prevention.
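For readers unfamiliar with the F1 value reported in this abstract, it is the harmonic mean of precision and recall. The following minimal Python sketch shows the calculation; the function name and the precision and recall inputs are hypothetical illustration values, not figures from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: a detector with precision 0.95 and recall 0.975.
print(round(f1_score(0.95, 0.975), 3))
```

Because it is a harmonic mean, F1 stays low unless precision and recall are both high, which is why it is often reported alongside accuracy for detection tasks.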
- Research Article
- 10.70135/seejph.vi.5557
- Mar 8, 2025
- South Eastern European Journal of Public Health
White blood cells (WBCs) are essential constituents of the immune system, providing the organism's defence against infections, inflammation, and various diseases. The types of WBCs possess specific functions and attributes that are vital for the maintenance of health. The categories of WBCs are neutrophils, lymphocytes, monocytes, eosinophils, and basophils, each performing functions that boost the immune response. YOLOv5 stands out as a top-tier object detection model recognized for its speed and accuracy in detecting objects in images. The YOLOv5 model serves as an effective tool for detecting and classifying different white blood cell types in blood smear images during WBC classification and identification tasks. In transfer learning, a model developed for a specific task is reused, with changes in hyperparameters, as the starting point for a model on a second task. This approach is particularly valuable in scenarios where limited labelled data is available for the target task, allowing practitioners to leverage knowledge gained from related tasks with ample data. Transfer learning is commonly used in deep learning, especially for tasks such as image classification, natural language processing, and speech recognition. The original Kaggle and LISC data sets have a limited number of images, and YOLOv5 is a state-of-the-art model for object detection and classification, so employing transfer learning is the core idea of this research. Features were learned from the original LISC data set, and the best weights from that run were applied to augmented images of the Kaggle data set, producing an accuracy of 99.5 mAP@50, a recall of 99, and an F1-score of 99.