Abstract

Data-driven models for predicting bluetongue vector distributions are valuable tools for identifying areas at risk of bluetongue outbreaks. Various models have been developed during the last decade, and the majority of them use linear discriminant analysis or logistic regression to infer vector–environment relationships. This study presents a performance assessment of two established models compared with a distribution model based on a promising ensemble learning technique called Random Forests. Additionally, the impact of false absences on the model outcome was assessed using alternative calibration–validation schemes. False absences are data records of suitable vector habitat that are, for various reasons, incorrectly labelled as absent. Three methods were applied to reduce the number of false absences in the calibration data without losing information on the environmental gradient of suitable vector habitat: random reduction, and stratified reduction based on the distance between absence and presence records in either geographical space (Euclidean distance) or environmental space (Mahalanobis distance). When the calibration data were not reduced with respect to false absences, the vector distribution predicted by the Random Forest model was significantly more accurate than the distributions predicted by the two established models (McNemar test, p < 0.01). The performance of the established models, however, improved considerably when stratified false absence reductions were applied. Model validation revealed no significant difference between the performance of the three distinct Culicoides imicola distribution models for the majority of alternative stratified reduction schemes.
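The McNemar test mentioned above compares two classifiers on the same validation records by looking only at the records where they disagree. The following is a minimal sketch of the exact (binomial) form of the test, not the authors' implementation; the function name `mcnemar_exact` and the toy labels are hypothetical, and the abstract does not say which variant of the test was used.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions (illustrative sketch).

    b counts records where model A is right and model B is wrong,
    c counts the reverse. Under the null hypothesis the discordant
    records are split 50/50 between b and c.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        # The models never disagree, so there is no evidence of a difference.
        return b, c, 1.0
    # Two-sided p-value: twice the binomial tail up to the smaller count.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical presence (1) / absence (0) labels for ten validation records.
observed = [1] * 10
model_a = [1] * 10             # always correct in this toy example
model_b = [0] * 8 + [1] * 2    # wrong on the first eight records
b, c, p_value = mcnemar_exact(observed, model_a, model_b)
```

Because only discordant pairs enter the statistic, two models with similar overall accuracy can still differ significantly if their errors fall on different records.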
The main conclusion of this study is that applying Random Forests, or applying linear discriminant analysis and logistic regression provided that the calibration data are first reduced using geographical or environmental information, can lead to better vector distribution models.
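A distance-based reduction of absence records in geographical space could be sketched as follows. This is one simple reading under stated assumptions, not the study's procedure: the abstract does not give the stratification scheme, so the sketch uses a single hypothetical cut-off `min_distance` and assumes coordinates in a projected planar system where Euclidean distance is meaningful. The function name and sample coordinates are also hypothetical. An environmental-space variant would replace the Euclidean distance with a Mahalanobis distance.

```python
from math import dist


def reduce_absences_by_distance(absences, presences, min_distance):
    """Keep only absence records at least min_distance from every presence.

    Absences close to a known presence are the most likely to be false
    absences, so they are dropped from the calibration data.
    """
    if not presences:
        # Nothing to measure against, so no absence can be judged suspicious.
        return list(absences)
    kept = []
    for record in absences:
        nearest = min(dist(record, p) for p in presences)
        if nearest >= min_distance:
            kept.append(record)
    return kept


# Hypothetical planar coordinates (same unit as min_distance).
presence_sites = [(0.0, 0.0), (10.0, 0.0)]
absence_sites = [(1.0, 0.0), (5.0, 4.0), (20.0, 0.0)]
calibration_absences = reduce_absences_by_distance(
    absence_sites, presence_sites, min_distance=5.0
)
```

In this toy example the absence at (1.0, 0.0) lies one unit from a presence and is removed, while the other two records are kept.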
