Abstract

Crop variety yield prediction is important for the global food supply, and machine learning models have been applied successfully in this domain in recent years. However, most seed companies can set up only a limited number of trial sites and lack sufficient data to train machine learning models on their own, which prevents them from benefiting from state-of-the-art AI technologies in the age of intelligent breeding. This study proposes a novel approach based on a federated random forest algorithm that enables breeding institutions to collaborate without disclosing or sharing their own data. The institutions jointly build a model from tabular data, including field breeding test phenotypes and environmental meteorological data, that each participant stores locally. Using a real tabular dataset from maize field trials at 248 trial sites in the China National Crop Variety Tests from 2017 to 2021, this paper presents the first results that delve into phenotypic data and explore decision-tree-based federated learning algorithms for crop yield prediction. Empirical verification in the maize yield prediction scenario showed that the method not only outperforms each model trained on an individual data source but is also virtually lossless in accuracy compared with the traditional, data-centralized random forest approach. The method also provides breeders and breeding teams with a cost-effective and efficient alternative for joint breeding.
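
To make the general idea concrete, the following is a minimal sketch of one simple way a random forest could be trained across institutions without pooling raw records. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a horizontal setting in which every site holds the same feature columns, each site fits bootstrapped regression trees locally, and only the fitted trees are shared and averaged. The function names, the two features (rainfall and temperature), and the synthetic data are all hypothetical. Shared tree thresholds can still reveal some information about local data, so this sketch makes no privacy guarantee.

```python
import random
from statistics import mean


def fit_tree(rows, targets, rng, depth=0, max_depth=3, min_leaf=2):
    """Fit a small regression tree and return it as a nested dict."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return {"value": mean(targets)}
    n_features = len(rows[0])
    # Random feature subset per split, as in a random forest.
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    best = None  # (sum of squared errors, feature index, threshold)
    for f in candidates:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:  # no valid split found, so make a leaf
        return {"value": mean(targets)}
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    grow = lambda idx: fit_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                rng, depth + 1, max_depth, min_leaf)
    return {"feature": f, "threshold": t, "left": grow(li), "right": grow(ri)}


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


def train_local_forest(rows, targets, n_trees, seed):
    """Runs inside one institution: raw rows never leave this function."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        forest.append(fit_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def federated_predict(forests, row):
    """Average the predictions of every tree shared by every site."""
    trees = [tree for forest in forests for tree in forest]
    return mean([predict_tree(tree, row) for tree in trees])


def make_site(seed, n=40):
    """Hypothetical synthetic data: [rainfall_mm, temperature_c] -> yield."""
    rng = random.Random(seed)
    rows = [[rng.uniform(300, 900), rng.uniform(15, 30)] for _ in range(n)]
    targets = [0.01 * r[0] + 0.2 * r[1] + rng.gauss(0, 0.3) for r in rows]
    return rows, targets


sites = [make_site(seed) for seed in (1, 2, 3)]
forests = [train_local_forest(rows, targets, n_trees=5, seed=i)
           for i, (rows, targets) in enumerate(sites)]
prediction = federated_predict(forests, [600.0, 22.0])
print(f"Federated prediction from {sum(len(f) for f in forests)} trees: {prediction:.2f}")
```

In this sketch the only objects exchanged between sites are the tree dictionaries returned by `train_local_forest`. A centralized baseline for comparison would call the same function once on the concatenated rows of all sites.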
