Abstract

Recent computer vision techniques based on convolutional neural networks (CNNs) are considered state-of-the-art tools in weed mapping. However, their performance has been shown to be sensitive to image quality degradation. Variation in lighting conditions adds another level of complexity to weed mapping. We focus on determining the influence of image quality and light consistency on the performance of CNNs in weed mapping by simulating the image formation pipeline. Faster Region-based CNN (R-CNN) and Mask R-CNN were used as CNN examples for object detection and instance segmentation, respectively, while semantic segmentation was represented by Deeplab-v3. The degradations simulated in this study included resolution reduction, overexposure, Gaussian blur, motion blur, and noise. The results showed that CNN performance was most affected by resolution, regardless of plant size. When the training and testing images had the same quality, Faster R-CNN and Mask R-CNN were moderately tolerant to low levels of overexposure, Gaussian blur, motion blur, and noise. Deeplab-v3, on the other hand, tolerated overexposure, motion blur, and noise at all tested levels. In most cases, quality inconsistency between the training and testing images reduced CNN performance. However, CNN models trained on low-quality images were more tolerant of quality inconsistency than those trained on high-quality images. Light inconsistency also reduced CNN performance. Increasing the diversity of lighting conditions in the training images may alleviate this performance reduction, a benefit that is not obtained by simply increasing the number of images taken under the same lighting condition. These results provide insights into the impact of image quality and light consistency on CNN performance. The quality thresholds established in this study can be used to guide the selection of camera parameters in future weed mapping applications.

Highlights

  • Choosing the right camera parameters and lighting conditions is a critical consideration for researchers and engineers trying to obtain the best possible computer vision result in agricultural applications [1,2,3]

  • We studied three popular convolutional neural network (CNN) frameworks used in weed mapping: object detection, semantic segmentation, and instance segmentation

  • The overall average precision (AP) for object detection and instance segmentation showed significant reductions under image quality degradation


Introduction

Choosing the right camera parameters and lighting conditions is a critical consideration for researchers and engineers trying to obtain the best possible computer vision result in agricultural applications [1,2,3]. Inappropriate selections may lead to unsatisfactory mapping results, and re-collecting images using new settings may be expensive and sometimes impossible. Digital images are formed from photons emitted by light sources and reflected from object surfaces. The photons are diffracted through the camera lens and projected onto the detector array inside the camera. Each photon produces an electrical response at a specific site on the detector array, and the resulting signals are transformed electronically into a grid of pixel values. These values convey information about the physical properties and conditions of the light and scenes.
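The degradations studied in this work (resolution reduction, overexposure, motion blur, and noise) can all be simulated downstream of this image formation pipeline. As an illustration only, the following NumPy sketch shows one simple way such degradations might be applied to a normalized grayscale image in [0, 1]; the function names and parameterizations here are hypothetical and are not the authors' actual simulation pipeline.

```python
import numpy as np

def reduce_resolution(img, factor):
    # Downsample by striding, then repeat pixels to restore the original size,
    # mimicking a lower-resolution sensor upsampled for the network input.
    small = img[::factor, ::factor]
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

def overexpose(img, stops):
    # Scale intensities by 2**stops and clip, saturating bright regions.
    return np.clip(img * (2.0 ** stops), 0.0, 1.0)

def motion_blur(img, length):
    # Horizontal motion blur: average `length` horizontally shifted copies.
    out = np.zeros_like(img)
    for k in range(length):
        out += np.roll(img, k, axis=1)
    return out / length

def gaussian_noise(img, sigma, rng=None):
    # Additive zero-mean Gaussian noise, clipped back to the valid range.
    rng = rng or np.random.default_rng(0)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)
```

In practice, a study like this would sweep the degradation level (e.g. `factor`, `stops`, `sigma`) and measure CNN performance at each level to locate a quality threshold.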
