Abstract

Convolutional Neural Networks (CNNs) usually require large datasets to be properly trained. Although techniques such as transfer learning can relax those requirements, gathering sufficient labelled data to cover all the variability associated with the problem at hand is often costly and time consuming. One way to mitigate this challenge is to gather the training data under laboratory conditions, using high-quality sensors capable of generating images with superior resolution, sharpness and contrast. The downside of this approach is that the resulting dataset will most likely lack the variety found under more realistic conditions. This work investigates the trade-off between image quality and dataset representativeness, that is, whether a CNN trained with images captured by a scanner in a laboratory can reliably recognise psyllids in smartphone images captured under more realistic conditions. A total of 1276 images were used in the experiments, half acquired using a flatbed scanner and half acquired using two different brands of smartphones. Experiments were carried out using SqueezeNet CNNs and a 10-fold cross-validation strategy. Accuracies ranged from less than 70% when only scanned images were used to around 90% when only smartphone images were employed, indicating that more realistic conditions are essential to guarantee the robustness of the trained network. Scanned images were useful when the training set of realistic images was not enough to cover all the variability found in the experiments, but were otherwise innocuous.
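
The evaluation protocol summarised above can be illustrated with a minimal sketch in Python. The abstract only states that 1276 images (half scanned, half smartphone) and 10-fold cross-validation were used. The way the folds are built here, the three training scenarios, and the names `k_fold_indices` and `build_scenarios` are assumptions made for illustration, not the authors' actual procedure. No network is trained; the code only shows how training and test sets could be composed.

```python
import random

N_SCANNER = 638  # half of the 1276 images (from the abstract)
N_PHONE = 638    # the other half


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def build_scenarios(n_scanner, n_phone, k=10, seed=0):
    """Yield, per fold, hypothetical (train, test) sets for three scenarios.

    Assumption: testing is always done on held-out smartphone images.
    """
    scanner = [("scanner", i) for i in range(n_scanner)]
    for train_idx, test_idx in k_fold_indices(n_phone, k, seed):
        phone_train = [("phone", i) for i in train_idx]
        phone_test = [("phone", i) for i in test_idx]
        yield {
            "scanner_only": (scanner, phone_test),
            "phone_only": (phone_train, phone_test),
            "combined": (scanner + phone_train, phone_test),
        }


scenarios = list(build_scenarios(N_SCANNER, N_PHONE))
```

Each element of `scenarios` describes one fold. Under these assumptions, comparing a classifier's accuracy across the `scanner_only`, `phone_only` and `combined` training sets on the same held-out smartphone images would expose the trade-off between image quality and representativeness that the abstract discusses.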
