Forecasting Risk of Crop Disease with Anomaly Detection Algorithms

Peter Skelsey

doi:10.1094/phyto-05-20-0185-r

Abstract

Information from crop disease surveillance programs and outbreak investigations provides real-world data about the drivers of epidemics. In many cases, however, only information on outbreaks is collected and data from surrounding healthy crops are omitted. Use of such data to develop models that can forecast risk/no risk of disease is therefore problematic, as information relating to the no-risk status of healthy crops is missing. This study explored a novel application of anomaly detection techniques to derive models for forecasting risk of crop disease from data composed of outbreaks only. This was done in two steps. In the training phase, the algorithms were used to learn the envelope of weather conditions most associated with historic crop disease outbreaks. In the testing phase, the algorithms were used for hindcasting of historic outbreak events. Five different anomaly detection algorithms were compared according to their accuracy in forecasting outbreaks: robust covariance, one-class k-means, Gaussian mixture model, kernel density estimation, and one-class support vector machine. A case study of potato late blight survey data from across Great Britain was used for proof of concept. The results showed that the Gaussian mixture model had the highest forecast accuracy at 97.0%, followed by one-class k-means at 96.9%. There was added value in combining the algorithms in an ensemble to provide a more accurate and robust forecasting tool that can be tailored to produce region-specific alerts. The techniques used here can easily be applied to outbreak data from other crop pathosystems to derive tools for agricultural decision support.
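The two-step procedure described above (learn an envelope from outbreak-only data, then hindcast held-out outbreaks) can be illustrated with a minimal sketch. This is not the study's implementation: it uses a plain, non-robust Gaussian envelope based on Mahalanobis distance as a simplified stand-in for the five algorithms compared, and the weather variables (temperature, relative humidity), their values, the 95% coverage level, and the function names `fit_envelope`, `distance`, and `is_risk` are all assumptions made for illustration. The data are synthetic and carry no relation to the late blight survey.

```python
import math
import random


def distance(model, point):
    """Mahalanobis distance of a 2-D point from the fitted outbreak centre."""
    dx = point[0] - model["mean"][0]
    dy = point[1] - model["mean"][1]
    a, b, c = model["inv"]  # entries of the inverse covariance matrix
    return math.sqrt(a * dx * dx + 2 * b * dx * dy + c * dy * dy)


def fit_envelope(points, coverage=0.95):
    """Training phase: learn an envelope from outbreak-only observations.

    The threshold is set so that roughly `coverage` of the training
    outbreaks fall inside the envelope (an assumed, illustrative choice).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    model = {
        "mean": (mx, my),
        "inv": (syy / det, -sxy / det, sxx / det),
        "threshold": None,
    }
    dists = sorted(distance(model, p) for p in points)
    model["threshold"] = dists[min(n - 1, int(coverage * n))]
    return model


def is_risk(model, point):
    """Weather inside the envelope resembles past outbreaks: flag as risk."""
    return distance(model, point) <= model["threshold"]


# Synthetic outbreak weather: (temperature in deg C, relative humidity in %).
rng = random.Random(0)
outbreaks = [(rng.gauss(16.0, 2.0), rng.gauss(90.0, 4.0)) for _ in range(300)]
train, test = outbreaks[:200], outbreaks[200:]

model = fit_envelope(train)

# Testing phase: hindcast held-out outbreaks and measure the hit rate.
hit_rate = sum(is_risk(model, p) for p in test) / len(test)
print(f"Held-out outbreaks flagged as risk: {hit_rate:.1%}")
print("Hot, dry day flagged as risk:", is_risk(model, (30.0, 40.0)))
```

Because only outbreaks are available, the hit rate on held-out outbreaks is the only accuracy measure this sketch can report; it says nothing about false alarms on healthy crops. The coverage level controls how tightly the envelope is drawn around past outbreaks.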
