A new evaluation framework for input variable selection algorithms used in environmental modelling

Abstract

Input variable selection is an essential step in the development of statistical models and is particularly relevant in environmental modelling, where potential model inputs often consist of time-lagged values of each candidate input variable. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations, and no method is best suited to all datasets and purposes. Nevertheless, rigorous evaluation of new and existing input variable selection methods is largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. Such evaluation would allow the effectiveness of these algorithms to be properly identified in various circumstances. In this paper, we propose a new framework for the evaluation of input variable selection methods that takes into account a wide range of dataset properties relevant to real-world environmental data, together with assessment criteria selected to highlight algorithm suitability in different situations of interest. The framework is supported by a repository of datasets to enable standardised and statistically significant testing. It is hoped that this framework will help to promote the appropriate application and comparison of input variable selection algorithms and will eventually provide guidance as to which algorithm is most suitable in a given situation.

Cite this paper

@inproceedings{Ames2014ANE,
  title={A new evaluation framework for input variable selection algorithms used in environmental modelling},
  author={Daniel P. Ames and Nigel W. T. Quinn and Andrea Emilio Rizzoli and Greer B. Humphrey and Stefano Galelli and Andrea Castelletti and Holger R. Maier and Graeme C. Dandy and Matthew S. Gibbs},
  year={2014}
}