Abstract
Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.