Abstract
With MLAir (Machine Learning on Air data) we created a software environment that simplifies and accelerates the exploration of new machine learning (ML) models, specifically shallow and deep neural networks, for the analysis and forecasting of meteorological and air quality time series. MLAir is not developed as an abstract workflow but hand in hand with actual scientific questions. It thus addresses scientists with either a meteorological or an ML background. Due to their relative ease of use and spectacular results in other application areas, neural networks and other ML methods are also gaining enormous momentum in the weather and air quality research communities. Even though many books and tutorials already describe how to conduct an ML experiment, a newcomer still faces many stumbling blocks. Conversely, people familiar with ML concepts and technology often have difficulty understanding the nature of atmospheric data. With MLAir we have addressed a number of these pitfalls so that scientists from both domains can start their ML applications more quickly. MLAir is designed to be easy to use and has been built from the very beginning as a stand-alone, fully functional experiment. Because its code base is flexible and modular, code modifications are easy and personal experiment schedules can be quickly derived. The package also includes a set of validation tools to facilitate the evaluation of ML results using standard meteorological statistics. MLAir can easily be ported to different computing environments, from desktop workstations to high-end supercomputers, with or without graphics processing units (GPUs).
Highlights
In times of rising awareness of air quality and climate issues, the investigation of air quality and weather phenomena is moving into focus
We present a new framework to enable fast and flexible Machine Learning on Air data time series (MLAir)
We show how the results of an experiment conducted by MLAir are structured and which statistical analysis is applied
Summary
In times of rising awareness of air quality and climate issues, the investigation of air quality and weather phenomena is moving into focus. Driven in particular by computer vision and speech recognition, technologies like convolutional neural networks (CNNs; LeCun et al., 1998), recurrent network variations such as long short-term memory (LSTM; Hochreiter and Schmidhuber, 1997) or gated recurrent units (GRUs; Cho et al., 2014), and more advanced concepts like variational autoencoders (VAEs; Kingma and Welling, 2014; Rezende et al., 2014) or generative adversarial networks (GANs; Goodfellow et al., 2014) are powerful and widely and successfully used. The application of such methods to weather and air quality data is rapidly gaining momentum. Data scientists use data to build their models on and evaluate them either with additional independent data or physical constraints. This elementary difference can lead to misinterpretation of study results so that, for example, the ability of the network to generalize is misjudged. MLAir is an open-source project, and contributions from all communities are welcome.