Crack and Spall Detection in Buildings Using a YOLOv8- and Detectron2-Based Web Application
Cracks and spalls in building structures pose serious risks to safety and durability. Conventional methods of detecting these defects are manual, time-consuming, and error-prone. This study therefore develops a web-based system for automated defect detection using deep learning models. Two object detection models (YOLOv8 and Detectron2) and a CNN classifier (ResNet18) were trained on 1,798 annotated images drawn from benchmark datasets (METU, VCC) and locally acquired images. Both classification and object detection were performed on the combined dataset. For classification across the three classes (mild crack, severe crack, and spall), YOLOv8 achieved a weighted average of 99.0% precision, 99.0% recall, and 99.0% accuracy, while ResNet18 reached a weighted average of 98.8% precision, 98.8% recall, and 98.8% accuracy. For object detection, YOLOv8 (mask mAP50 of 93.0%) achieved higher segmentation accuracy than Detectron2 (mask mAP50 of 87.5%). Both models demonstrated strong performance in detecting spalls and mild and severe cracks (mAP > 0.73). The Detectron2 model was deployed in a web-based application to enable real-time crack and spall identification. These results confirm the feasibility of AI-assisted structural health monitoring and highlight pathways for improving crack detection through balanced datasets, synthetic augmentation, and higher-resolution training in both Nigerian and global contexts.
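The "weighted average" figures reported in this abstract follow the standard support-weighted mean over per-class scores. A minimal sketch of that arithmetic, using made-up class supports and per-class scores (the abstract does not report per-class breakdowns):

```python
# Hypothetical per-class results for the three defect classes; the supports
# and per-class scores below are illustrative, not taken from the paper.
classes = {
    # class: (support, precision, recall)
    "mild crack":   (300, 0.99, 0.98),
    "severe crack": (250, 0.99, 1.00),
    "spall":        (200, 0.99, 0.99),
}

def weighted_average(scores):
    """Support-weighted mean of per-class precision and recall."""
    total = sum(n for n, _, _ in scores.values())
    w_prec = sum(n * p for n, p, _ in scores.values()) / total
    w_rec = sum(n * r for n, _, r in scores.values()) / total
    return round(w_prec, 3), round(w_rec, 3)

print(weighted_average(classes))  # (0.99, 0.989)
```

With real per-class counts from a confusion matrix, the same function reproduces the kind of weighted-average precision/recall quoted for YOLOv8 and ResNet18.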
- Conference Article
5
- 10.1109/icears53579.2022.9751898
- Mar 16, 2022
Object detection is becoming a challenging problem in several computer vision applications. Recently developed deep learning (DL) models enable the design of effective object detection models with enhanced outcomes. However, it is difficult to attain many characteristics from the objects identified in real time. To resolve this issue, this study introduces an optimal RetinaNet with harmony search algorithm for dynamic and static object detection (RNHSA-DSOD) model. The proposed RNHSA-DSOD technique aims to identify both the dynamic and the static objects that exist in the input frame. In addition, the RNHSA-DSOD technique derives a RetinaNet-based object detection model to recognize multiple objects. Next, the harmony search algorithm (HSA) with a multilayer perceptron (MLP) is applied for the classification of detected objects into multiple classes. The design of the HSA for the MLP model shows the novelty of the work. To demonstrate the enhanced outcomes of the RNHSA-DSOD technique, a series of simulations were carried out, and the results indicated that the RNHSA-DSOD technique outperforms recent techniques.
- Book Chapter
- 10.1007/978-981-99-1414-2_38
- Jan 1, 2023
Object detection is a primary task in computer vision. Various CNNs are widely used by researchers to improve the classification and detection of objects present in video frames. Object detection is a prime task in self-driving cars, satellite imagery, robotics, etc. The proposed work focuses on improving object classification and detection in videos for video analytics. The key focus of the work is the identification and tuning of hyper-parameters in deep learning models. Deep learning-based object detection models are broadly classified into two categories: one-stage detectors and two-stage detectors. We selected a one-stage detector for experimentation. In this paper, a custom CNN model is presented with hyper-parameter tuning, and the results are compared with state-of-the-art models. It is found that hyper-parameter tuning of CNN models helps improve the object classification and detection accuracy of deep learning models.
- Research Article
10
- 10.1515/corrrev-2023-0027
- Aug 21, 2023
- Corrosion Reviews
Deep learning algorithms have a wide range of applications and excellent performance in the field of engineering image recognition. At present, the detection and recognition of buried metal pipeline defects still rely mainly on manual work, which is inefficient. To realize intelligent and efficient recognition of pipeline magnetic flux leakage (MFL) inspection images, and based on the actual demands of MFL inspection, this paper proposes a new object detection framework based on YOLOv5 and CNN models in deep learning. The framework first uses object detection to classify the targets in MFL images and then, according to the classification results, feeds the features containing defects into a CNN-based regression model. The framework integrates object detection and image regression to realize target classification of MFL pseudo-color maps and synchronous recognition of metal loss depth. The results show that the target recognition ability of the model is good: its precision reaches 0.96, and the mean absolute error of the metal loss depth recognition is 1.14. The framework offers more efficient identification ability and adaptability and makes up for the quantification of damage depth, so it can be used for further monitoring and maintenance strategies.
- Research Article
12
- 10.3390/electronics12030541
- Jan 20, 2023
- Electronics
Object detection is an important computer vision technique that has increasingly attracted the attention of researchers in recent years. The literature to date in the field has introduced a range of object detection models. However, these models have largely been English-language-based, and there is only a limited number of published studies that have addressed how object detection can be implemented for the Arabic language. As far as we are aware, the generation of an Arabic text-to-speech engine to utter objects’ names and their positions in images to help Arabic-speaking visually impaired people has not been investigated previously. Therefore, in this study, we propose an object detection and segmentation model based on the Mask R-CNN algorithm that is capable of identifying and locating different objects in images, then uttering their names and positions in Arabic. The proposed model was trained on the Pascal VOC 2007 and 2012 datasets and evaluated on the Pascal VOC 2007 testing set. We believe that this is one of a few studies that uses these datasets to train and test the Mask R-CNN model. The performance of the proposed object detection model was evaluated and compared with previous object detection models in the literature, and the results demonstrated its superiority and ability to achieve an accuracy of 83.9%. Moreover, experiments were conducted to evaluate the performance of the incorporated translator and TTS engines, and the results showed that the proposed model could be effective in helping Arabic-speaking visually impaired people understand the content of digital images.
- Conference Article
4
- 10.1109/iciss55894.2022.9915248
- Aug 10, 2022
In the next few years, the new generation of Autonomous Vehicles (AVs) promises an advanced level of self-driving experiences. One of the most challenging topics in AV development is the readiness of object detection models in complex urban environments. Mixed traffic is a complex urban environment that contains much uncertainty and is composed of heterogeneous objects. Therefore, this paper benchmarks pre-trained CNN models for object detection in a mixed traffic environment. The evaluation covers five modern neural network algorithms and architectures: Faster RCNN, SSD, YOLOv3, YOLOv4, and EfficientDet. We then provide a new dataset of the mixed traffic environment under night conditions for more accurate object detection, and conduct the simulation considering the performance parameters of recall, precision, and F-measure. Performance on our dataset is also compared to the MS-COCO dataset. The results show that the average precision values of Faster RCNN, SSD, YOLOv3, YOLOv4, and EfficientDet are 16.70%, 8.90%, 19.67%, 43.90%, and 55.56%, respectively, indicating that YOLOv4 and EfficientDet provide better object detection accuracy than the other CNN models.
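The recall, precision, and F-measure parameters used in this evaluation follow the standard definitions over matched detections. A minimal sketch with illustrative counts (not the paper's data):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F-measure from matched detections.
    tp: detections matched to a ground-truth box,
    fp: unmatched (spurious) detections,
    fn: ground-truth boxes with no matching detection."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative counts for one model on a night-time mixed-traffic test set.
p, r, f = detection_metrics(tp=80, fp=20, fn=40)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.8 0.67 0.73
```

Averaging the per-class precision values produced this way over all classes gives the kind of average-precision comparison the abstract reports.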
- Conference Article
4
- 10.1109/icce-taiwan55306.2022.9869121
- Jul 6, 2022
Nowadays, many deep learning models are in use, and with limited computing resources, performing object detection and semantic segmentation at the same time can slow both tasks down. To tackle this issue, we propose a multi-task learning model based on an encoder-decoder CNN architecture that merges the object detection and semantic segmentation models into one, so it can be trained on both tasks simultaneously and applied to road and traffic object recognition in Taiwan's unique driving environment. Compared to executing separate semantic segmentation and object detection models simultaneously, our proposed model achieves faster recognition speed and higher accuracy on the Cityscapes dataset. The results show that the proposed method achieves faster recognition while maintaining accuracy on an embedded Nvidia Jetson TX2 platform with fewer computing resources.
- Research Article
2
- 10.1051/e3sconf/202339101093
- Jan 1, 2023
- E3S Web of Conferences
Object detection, a fundamental task in computer vision, has a wide range of practical applications, such as surveillance, robotics, and autonomous driving. Recent developments in deep learning have brought gradual improvements in detection accuracy and speed. One of the most popular and effective deep learning models for object detection is YOLOv5. In this discussion, we present an object detection model based on YOLOv5 and its implementation for object detection tasks. We discuss the model’s architecture, training process, and evaluation metrics. Furthermore, we present experimental results on popular object detection benchmarks to demonstrate the efficacy and efficiency of YOLOv5 in detecting various objects in complex scenes. Our experiments show that YOLOv5 outperforms other state-of-the-art object detection models in both detection accuracy and detection speed, making it a promising approach for real-world applications. Our work contributes to the growing body of research on deep learning-based object detection and provides valuable insights into the capabilities and limitations of YOLOv5. By improving the accuracy and speed of object detection models, we enable a wide range of applications that can benefit society in countless ways.
- Research Article
5
- 10.1016/j.aei.2024.102355
- Jan 9, 2024
- Advanced Engineering Informatics
Backdoor Attacks with Wavelet Embedding: Revealing and enhancing the insights of vulnerabilities in visual object detection models on transformers within digital twin systems
- Research Article
27
- 10.1016/j.compag.2022.107081
- Jun 11, 2022
- Computers and Electronics in Agriculture
End-to-end deep learning for directly estimating grape yield from ground-based imagery
- Research Article
12
- 10.37934/araset.26.1.714
- Jan 25, 2022
- Journal of Advanced Research in Applied Sciences and Engineering Technology
Over the last few years, archaeologists have started to look at automated object detection for searching for potential historical sites, using object identification methods that include neural network-based and non-neural-network-based approaches. However, there is a scarcity of reviews on Convolutional Neural Network (CNN)-based Deep Learning (DL) models for object detection in the archaeological field. The purpose of this review is to examine existing research in the area of ancient-structure object detection using Convolutional Neural Networks. Notably, CNN-based object detection has difficulty drawing a bounding box around the object and was implemented mainly for object classification. Algorithms such as the Region-based Convolutional Neural Network (R-CNN) and Mask Region-based Convolutional Neural Network (MR-CNN) were developed to solve this problem, yielding more accurate, time-efficient, and bias-free deep learning models. This paper intends to provide a technical reference highlighting articles from the Scopus, Web of Science, and IEEE Xplore databases pertaining to the use of Convolutional Neural Network-based techniques to detect structures and objects in the archaeological field.
- Research Article
3
- 10.3390/s24103025
- May 10, 2024
- Sensors
Deep learning models have significantly improved object detection, which is essential for visual sensing. However, their increasing complexity results in higher latency and resource consumption, making real-time object detection challenging. In order to address the challenge, we propose a new lightweight filtering method called L-filter to predict empty video frames that include no object of interest (e.g., vehicles) with high accuracy via hybrid time series analysis. L-filter drops those frames deemed empty and conducts object detection for nonempty frames only, significantly enhancing the frame processing rate and scalability of real-time object detection. Our evaluation demonstrates that L-filter improves the frame processing rate by 31-47% for a single traffic video stream compared to three standalone state-of-the-art object detection models without L-filter. Additionally, L-filter significantly enhances scalability; it can process up to six concurrent video streams in one commodity GPU, supporting over 57 fps per stream, by working alongside the fastest object detection model among the three models.
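The drop-empty-frames idea behind L-filter can be illustrated with a toy pipeline in which a cheap predictor stands in for the paper's hybrid time-series filter and a callback stands in for the heavy detector. All names and data below are hypothetical, for illustration only:

```python
def run_pipeline(frames, is_empty_pred, detect):
    """Run the expensive detector only on frames the lightweight filter
    deems nonempty; predicted-empty frames are dropped without a detector
    call, which is what raises the overall frame processing rate."""
    results, skipped = [], 0
    for frame in frames:
        if is_empty_pred(frame):
            skipped += 1          # frame dropped, detector never invoked
            results.append([])
        else:
            results.append(detect(frame))
    return results, skipped

# Mock stream: each "frame" is a list of objects (empty list = empty frame).
frames = [[], ["car"], [], [], ["car", "truck"], []]
results, skipped = run_pipeline(
    frames,
    # Oracle filter for illustration; L-filter instead *predicts* emptiness
    # from time-series features, trading a small accuracy risk for speed.
    is_empty_pred=lambda f: len(f) == 0,
    detect=lambda f: [(obj, 0.9) for obj in f],  # mock detector output
)
print(skipped)  # 4 of 6 detector calls avoided
```

In traffic video, where long stretches of frames contain no vehicle, skipping even a majority of frames this way explains the reported 31–47% frame-rate improvement and the multi-stream scalability.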
- Research Article
- 10.54254/2755-2721/2025.22252
- Apr 21, 2025
- Applied and Computational Engineering
Object detection has become an important task in the field of computer vision, where the goal is to recognize and classify objects in images or videos. This paper presents a comparative analysis of different object detection models, focusing on convolutional neural networks (CNN) and transformer-based architectures. CNN-based models (e.g., the YOLO family) have made significant progress in real-time object detection by efficiently extracting local features via convolutional operations. In contrast, transformer-based models, such as the Visual Transformer (ViT), use self-attention mechanisms to capture global dependencies, improving performance on large-scale datasets. This research explores the evolution of these models and examines their foundations, strengths, and weaknesses. Through experimental evaluations, we show that CNNs continue to dominate when data and computational power are limited, while Transformers exhibit superior scalability and accuracy in complex environments. Our results highlight the complementary nature of these approaches and emphasize the need for hybrid models to achieve optimal performance in different object detection tasks.
- Research Article
- 10.47611/jsrhs.v13i2.6533
- May 31, 2024
- Journal of Student Research
Global warming and pollution are huge problems today. One of the main factors behind these two problems is plastic pollution. As plastic takes millennia to decompose, an estimated 270,000 tons of plastic are floating around our oceans, which is too much to be considered “safe.” This problem is mainly caused by people improperly disposing of plastic, whether through littering or putting it into the trash instead of a recycling bin. To better identify and correct plastic that was inappropriately disposed of, a deep-learning model (YOLOv5) that performs object detection and classification was implemented to detect which bin a piece of trash should go in. YOLOv5 uses the PyTorch framework for its object detection model. Using this pre-trained model helped solve the problem efficiently, since developing a custom object-detection model from scratch would not have been practical. The model was tested by running it on pieces of trash placed on a tabletop and analyzing the code output indicating which trash bin the waste should go in. Across multiple tests, the model exhibited a commendable accuracy rate of 90%, which is noteworthy given the substantial amount of data leveraged. To further improve its efficacy and real-world value, future research could explore augmenting the training data, refining the object detection model for greater precision, and expanding the dataset to encompass a broader range of use cases.
- Research Article
146
- 10.3390/rs12132136
- Jul 3, 2020
- Remote Sensing
Mid- to late-season weeds that escape from the routine early-season weed management threaten agricultural production by creating a large number of seeds for several future growing seasons. Rapid and accurate detection of weed patches in field is the first step of site-specific weed management. In this study, object detection-based convolutional neural network models were trained and evaluated over low-altitude unmanned aerial vehicle (UAV) imagery for mid- to late-season weed detection in soybean fields. The performance of two object detection models, Faster RCNN and the Single Shot Detector (SSD), were evaluated and compared in terms of weed detection performance using mean Intersection over Union (IoU) and inference speed. It was found that the Faster RCNN model with 200 box proposals had similar good weed detection performance to the SSD model in terms of precision, recall, f1 score, and IoU, as well as a similar inference time. The precision, recall, f1 score and IoU were 0.65, 0.68, 0.66 and 0.85 for Faster RCNN with 200 proposals, and 0.66, 0.68, 0.67 and 0.84 for SSD, respectively. However, the optimal confidence threshold of the SSD model was found to be much lower than that of the Faster RCNN model, which indicated that SSD might have lower generalization performance than Faster RCNN for mid- to late-season weed detection in soybean fields using UAV imagery. The performance of the object detection model was also compared with patch-based CNN model. The Faster RCNN model yielded a better weed detection performance than the patch-based CNN with and without overlap. The inference time of Faster RCNN was similar to patch-based CNN without overlap, but significantly less than patch-based CNN with overlap. Hence, Faster RCNN was found to be the best model in terms of weed detection performance and inference time among the different models compared in this study. 
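The Intersection over Union (IoU) metric used above to compare the detectors is the standard overlap ratio between a predicted and a ground-truth box. A minimal stdlib sketch, with illustrative boxes rather than data from the study:

```python
def iou(box_a, box_b):
    """Intersection over Union for axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # 0 if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping in a 5x5 corner: IoU = 25 / 175.
print(round(iou((0, 0, 10, 10), (5, 5, 15, 15)), 4))  # 0.1429
```

Averaging this score over all matched weed-patch detections yields the mean IoU figures (0.85 for Faster RCNN, 0.84 for SSD) reported in the abstract.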
This work is important in understanding the potential and identifying the algorithms for an on-farm, near real-time weed detection and management.
- Research Article
13
- 10.1016/j.future.2023.09.030
- Sep 28, 2023
- Future Generation Computer Systems
Weather-aware object detection method for maritime surveillance systems