Deep Learning Implementation for Peruvian Blueberry Export Standards: A YOLOv8n Solution

Abstract

This research introduces a computer vision system based on YOLOv8n, an efficient convolutional neural network (CNN) architecture, for real-time classification of blueberry maturity stages. The system automates detection of three maturity stages (green, semi-ripe, and ripe) to support precision agriculture and compliance with Peruvian Technical Standards (NTP) for blueberry exports. In experiments on a curated dataset of 550 field-acquired images, the optimized model achieves performance competitive with relevant previous studies on key evaluation metrics: precision, recall, average precision (AP), mean average precision (mAP), and F1 score. These results indicate that automated visual inspection with this approach can improve harvesting efficiency and help ensure adherence to international export quality requirements.
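The metrics named in the abstract can be illustrated with a minimal sketch. It assumes detections have already been matched to ground-truth boxes (for example by an IoU threshold, which the abstract does not specify) and uses all-point interpolation for AP, which may differ from the protocol the authors used. The function names and all numeric values are hypothetical; only the three class names come from the abstract.

```python
from typing import Dict, List, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(scored_hits: List[Tuple[float, bool]], n_ground_truth: int) -> float:
    """All-point interpolated AP for one class.

    scored_hits holds (confidence, is_true_positive) per detection;
    the matching to ground truth is assumed to be done beforehand.
    """
    if n_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda item: item[0], reverse=True)
    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision with the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the resulting step-shaped precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Hypothetical per-class AP values for the three maturity stages.
example_ap = {"green": 0.9, "semi-ripe": 0.7, "ripe": 0.8}
example_map = mean_average_precision(example_ap)
```

Averaging AP per class, rather than pooling all detections, keeps a rare class such as semi-ripe from being masked by the more frequent ones.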

Similar Papers
  • Research Article
  • Cited by 319
  • 10.1016/j.ecoinf.2018.10.002
Deep convolution neural network for image recognition
  • Oct 12, 2018
  • Ecological Informatics
  • Boukaye Boubacar Traore + 2 more

  • Research Article
  • 10.37965/jait.2023.0163
Human Activity Recognition in a Realistic and Multiview Environment Based on Two-Dimensional Convolutional Neural Network
  • May 9, 2023
  • Journal of Artificial Intelligence and Technology
  • Ashish Khare + 2 more

Human activity recognition based on convolutional neural networks has attracted researchers' interest in recent years because of its significant improvement in accuracy. A large number of deep learning-based algorithms have been proposed for activity recognition. However, with the growing number of technologies that have limited computational resources, there is a need to design efficient deep learning-based approaches that make better use of those resources. This paper presents a simple and efficient two-dimensional convolutional neural network (2-D CNN) architecture with very small convolutional kernels for human activity recognition. The merit of the proposed architecture over standard deep learning architectures is that it has fewer trainable parameters and a lower memory requirement, which allows it to be trained on devices with little GPU memory and to work well with both smaller and larger datasets. The proposed approach consists of four main stages: (1) creation of the dataset and data augmentation, (2) design of the 2-D CNN architecture, (3) training of the proposed 2-D CNN architecture from scratch up to the optimum stage, and (4) evaluation of the trained 2-D CNN architecture. To illustrate the effectiveness of the proposed architecture, extensive experiments are conducted on three publicly available datasets: IXMAS, YouTube, and UCF101. The results of the proposed method and its comparison with other state-of-the-art methods [8-12,14,18-26,29-33] demonstrate the usefulness of the proposed method.

  • Conference Article
  • Cited by 13
  • 10.1109/ivs.2018.8500598
An efficient encoder-decoder CNN architecture for reliable multilane detection in real time
  • Jun 1, 2018
  • Shriyash Chougule + 5 more

A multilane detection system is a vital prerequisite for realizing higher ADAS functionality in autonomous navigation. In this work, we present an efficient convolutional neural network (CNN) architecture for real-time detection of multiple lane boundaries using a camera sensor. Our network has a simple encoder-decoder architecture and is a special two-class semantic segmentation network designed to segment lane boundaries. The efficacy of our network stems from two key insights that are at the foundation of all our design decisions. Firstly, we term a lane boundary a weak class object in the context of semantic segmentation. We show that weak class objects, which occupy relatively few pixels in the scene, also have relatively low detection accuracy among the known segmentation methods. We present novel design choices and intuitions to improve the segmentation accuracy of weak class objects, which in turn reduces computation time. Our second insight lies in the manner in which we depict the ground truth information in our derived dataset. Instead of annotating just the visible lane markers, we accurately delineate the lane boundaries in the ground truth for challenging scenarios such as occlusions, low light and degraded lane markings. We then leverage the CNN's ability to concisely summarize the global and local context in an image to accurately infer lane boundaries in these challenging cases. We evaluated our network against ENet and FCN-8 and found that it performs notably better in terms of speed and accuracy. Our network achieves an encouraging 46 FPS on the NVIDIA Drive PX2 platform, and it has been validated on our test vehicle in highway driving conditions.

  • Research Article
  • Cited by 297
  • 10.1016/j.fcr.2019.02.022
Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images
  • Mar 17, 2019
  • Field Crops Research
  • Qi Yang + 4 more

  • Research Article
  • Cited by 108
  • 10.1016/j.patcog.2020.107610
Efficient densely connected convolutional neural networks
  • Aug 20, 2020
  • Pattern Recognition
  • Guoqing Li + 4 more

  • Conference Article
  • Cited by 30
  • 10.1109/itsc.2018.8569962
Vehicle Detection and Localization using 3D LIDAR Point Cloud and Image Semantic Segmentation
  • Nov 1, 2018
  • Rafael Barea + 7 more

This paper presents a real-time approach to detect and localize surrounding vehicles in urban driving scenes. We propose a multimodal fusion framework that processes both a 3D LIDAR point cloud and an RGB image to obtain robust vehicle position and size in a Bird's Eye View (BEV). Semantic segmentation from RGB images is obtained using our efficient Convolutional Neural Network (CNN) architecture called ERFNet. Our proposal takes advantage of the accurate depth information provided by LIDAR and the detailed semantic information processed from a camera. The method has been tested using the KITTI object detection benchmark. Experiments show that our approach outperforms or is on par with other state-of-the-art proposals even though our CNN was trained on another dataset, which indicates good generalization to other domains, a key point for autonomous driving.

  • Research Article
  • 10.1109/tmi.2025.3641192
Expert-Like Reparameterization of Heterogeneous Pyramid Receptive Fields in Efficient CNNs for Fair Medical Image Classification.
  • Jan 1, 2025
  • IEEE transactions on medical imaging
  • Xiao Wu + 5 more

Efficient convolutional neural network (CNN) architecture design has attracted growing research interest. However, existing designs typically apply a single receptive field (RF), small asymmetric RFs, or pyramid RFs to learn different feature representations, and still encounter two significant challenges in medical image classification tasks: i) They have limitations in efficiently capturing diverse lesion characteristics, e.g., tiny, coordination, small and salient, which play distinct roles in the classification results, especially in imbalanced medical image classification. ii) The predictions generated by those CNNs are often unfair/biased, bringing a high risk when employing them in real-world medical diagnosis conditions. To tackle these issues, we develop a new concept, Expert-Like Reparameterization of Heterogeneous Pyramid Receptive Fields (ERoHPRF), to simultaneously boost medical image classification performance and fairness. This concept aims to mimic the multi-expert consultation mode by applying a well-designed heterogeneous pyramid RF bag to capture lesion characteristics of varying significance effectively via convolution operations with multiple heterogeneous kernel sizes. Additionally, ERoHPRF introduces an expert-like structural reparameterization technique that merges its parameters with a two-stage strategy, ensuring computation cost and inference speed competitive with a single RF. To demonstrate the effectiveness and generalization ability of ERoHPRF, we incorporate it into mainstream efficient CNN architectures. Extensive experiments show that our proposed ERoHPRF maintains a better trade-off than state-of-the-art methods in terms of medical image classification, fairness, and computation overhead. The code of this paper is available at https://github.com/XiaoLing12138/Expert-Like-Reparameterization-of-Heterogeneous-Pyramid-Receptive-Fields.

  • Research Article
  • Cited by 133
  • 10.1016/j.agrformet.2020.107938
A near real-time deep learning approach for detecting rice phenology based on UAV images
  • Feb 19, 2020
  • Agricultural and Forest Meteorology
  • Qi Yang + 4 more

  • Conference Article
  • Cited by 78
  • 10.21437/interspeech.2016-123
Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks
  • Sep 8, 2016
  • Huy Phan + 3 more

We present in this paper a simple yet efficient convolutional neural network (CNN) architecture for robust audio event recognition. In contrast to deep CNN architectures with multiple convolutional and pooling layers topped with multiple fully connected layers, the proposed network consists of only three layers: a convolutional, a pooling, and a softmax layer. Two further features distinguish it from the deep architectures that have been proposed for the task: varying-size convolutional filters at the convolutional layer and a 1-max pooling scheme at the pooling layer. Intuitively, the network tends to select the most discriminative features from the whole audio signal for recognition. Our proposed CNN not only shows state-of-the-art performance on the standard task of robust audio event recognition but also outperforms other deep architectures by up to 4.5% in terms of recognition accuracy, which is equivalent to a 76.3% relative error reduction.
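The combination of varying-size filters and 1-max pooling described above can be sketched in a few lines. This is a simplified illustration on a one-dimensional integer signal, not the network from the paper: the filter weights are hypothetical and fixed rather than learned, and the softmax layer is omitted. Each filter yields a feature map whose length depends on the filter width, and 1-max pooling reduces every map to its single largest response.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide one filter over the signal (valid positions only, no padding)."""
    width = len(kernel)
    return [
        sum(signal[start + offset] * kernel[offset] for offset in range(width))
        for start in range(len(signal) - width + 1)
    ]


def one_max_pool(feature_map: Sequence[float]) -> float:
    """1-max pooling: keep only the strongest response of a feature map."""
    return max(feature_map)


def extract_features(signal: Sequence[float], kernels: Sequence[Sequence[float]]) -> List[float]:
    """One pooled value per filter, so the output length equals the number of filters."""
    return [one_max_pool(conv1d_valid(signal, kernel)) for kernel in kernels]


# Hypothetical signal and three filters of widths 2, 3 and 4.
signal = [0, 1, 3, 1, 0, 2]
kernels = [[1, 1], [1, 0, 1], [1, 1, 1, 1]]
features = extract_features(signal, kernels)
```

Because each filter contributes exactly one value, the feature vector has a fixed length regardless of the signal duration or the filter widths, which is what lets a single classification layer follow directly.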

  • Research Article
  • Cited by 4
  • 10.1155/2022/2213295
Image Target Recognition Based on Improved Convolutional Neural Network
  • Jul 8, 2022
  • Mathematical Problems in Engineering
  • Jinjuan Wang + 4 more

The convolutional neural network (CNN) is a very important branch of deep learning research that has been widely applied in many fields with excellent results, especially in computer vision, where it has made breakthroughs in image classification and object detection. A convolutional neural network architecture can achieve more efficient network training by combining different modules, and its training does not require manual extraction of image features, so end-to-end training and prediction can be carried out directly. This paper first analyzes some problems of current image recognition and describes the progress of convolutional neural networks in image recognition. It then studies traditional target recognition algorithms, including the traditional target recognition framework, target localization, feature extraction, and classification, and compares the traditional algorithms with deep learning-based target recognition algorithms. On the basis of this research, an improved CNN model is proposed that focuses on the structural design and network optimization of the convolutional neural network and yields a more efficient network. Test experiments verify the effectiveness of the proposed model, which not only achieves a lower error rate but also greatly reduces the number of network parameters and has stronger learning ability.

  • Research Article
  • Cited by 3
  • 10.1016/j.jormas.2024.102152
An artificial intelligence mechanism for detecting cystic lesions on CBCT images using deep learning.
  • Dec 1, 2025
  • Journal of stomatology, oral and maxillofacial surgery
  • Rasool Esmaeilyfard + 2 more

  • Book Chapter
  • Cited by 24
  • 10.1007/978-3-642-28997-2_15
New Metrics for Meaningful Evaluation of Informally Structured Speech Retrieval
  • Jan 1, 2012
  • Maria Eskevich + 2 more

Search effectiveness for tasks where the retrieval units are clearly defined documents is generally evaluated using standard measures such as mean average precision (MAP). However, many practical speech search tasks focus on content within large spoken files lacking defined structure. These data must be segmented into smaller units for search which may only partially overlap with relevant material. We introduce two new metrics for the evaluation of search effectiveness for informally structured speech data: mean average segment precision (MASP) which measures retrieval performance in terms of both content segmentation and ranking with respect to relevance; and mean average segment distance-weighted precision (MASDWP) which takes into account the distance between the start of the relevant segment and the retrieved segment. We demonstrate the effectiveness of these new metrics on a retrieval test collection based on the AMI meeting corpus.

Keywords: Speech retrieval, informally structured speech, evaluation metrics

  • Research Article
  • 10.11591/ijeecs.v38.i3.pp2012-2019
Skin cancer disease analysis using classification mechanism based on 3D feature extraction
  • Jun 1, 2025
  • Indonesian Journal of Electrical Engineering and Computer Science
  • Ramya Srikanteswara + 1 more

Dermoscopic image analysis is essential for effective skin cancer diagnosis and classification. Extensive research has been carried out on dermoscopic image classification for the early detection of skin cancer. However, most of this research concentrates on 2D features. Therefore, a 3D lesion establishment mechanism is presented in this work to generate 3D features from the obtained 3D lesions. The objective of this work is to reconstruct a 3D lesion image from 2D lesion images and a multispectral reference IR light image. The 3D lesion establishment is achieved by designing an efficient convolutional neural network (CNN) architecture. Details of the CNN design architecture are discussed. After reconstruction of the 3D lesions, 2D and 3D features are extracted and classification is performed on these features. Classification performance is evaluated using images from the PH2 database. The mean classification accuracy using a K-nearest neighbors (KNN) classifier based on the 3D lesion establishment using the CNN architecture is 98.70%. The results are compared against various classification methods in terms of accuracy, sensitivity, and specificity, and are shown to be better.

  • Conference Article
  • Cited by 4
  • 10.1109/globecom46510.2021.9685642
Densely-Accumulated Convolutional Network for Accurate LPI Radar Waveform Recognition
  • Dec 1, 2021
  • Thien Huynh-The + 6 more

This paper presents a deep learning-based method to automatically recognize low probability of intercept (LPI) radar waveforms against diversified jamming attacks. Concretely, an efficient convolutional neural network (CNN) architecture, namely the Densely-Accumulated Network (DANet), is introduced to learn the time-frequency representation produced by the Wigner-Ville distribution. The architecture has several novel densely-accumulated connection modules, specified by various symmetric and asymmetric convolutional layers, to enrich diversified features at multiple representational maps. In addition, skip connections and dense connections are leveraged to improve feature learning efficiency and prevent vanishing gradients when the network goes deeper. Some image processing techniques (e.g., global thresholding and digital filtering) are adopted to enhance the quality of the time-frequency image. Relying on simulations, we benchmark the proposed method on a synthetic 13-waveform dataset and also investigate the influence of hyper-parameters (such as image size, number of modules, and training data size) on the overall recognition performance. Remarkably, with an average accuracy of 98.2% at 0 dB signal-to-noise ratio (SNR), DANet outperforms several backbone CNNs and state-of-the-art networks for LPI waveform recognition while remaining a cost-efficient model.

  • Research Article
  • 10.47709/brilliance.v5i1.6259
Comparative Analysis of MobileNetV3-Large and Small for Corn Leaf Disease Classification
  • Jul 7, 2025
  • Brilliance: Research of Artificial Intelligence
  • Wesley Maximilliano + 1 more

Corn leaf disease represents a significant threat to agricultural productivity, capable of causing substantial economic losses in Indonesia. Conventional identification methods, which rely on visual observation by farmers, are frequently subjective, time-consuming, and inaccurate. This study conducts a systematic comparative analysis of two efficient Convolutional Neural Network (CNN) architecture variants, MobileNetV3-Large and MobileNetV3-Small, for the classification of four corn leaf conditions: Gray Leaf Spot, Common Rust, Northern Leaf Blight, and Healthy. The research further evaluates the influence of two prevalent optimizers, Adam and Stochastic Gradient Descent (SGD), to ascertain the most optimal model configuration through hyperparameter tuning. The models were trained and evaluated using a local image dataset from Sampang, Indonesia, comprising 4000 images. The methodology included image preprocessing, data augmentation, and hyperparameter tuning of the learning rate and batch size. The results demonstrate that both architectures achieved exceptionally high accuracy. The principal finding reveals that MobileNetV3-Small unexpectedly outperformed its larger variant, attaining a peak accuracy of 99.5% with the SGD optimizer, a learning rate of 0.01, and a batch size of 32. In comparison, MobileNetV3-Large reached a maximum accuracy of 99.0% under a similar configuration. These findings underscore the considerable potential of lightweight architectures for the development of rapid, accurate, and field-deployable plant disease diagnostic applications on mobile devices using deep learning.
